Kernel Choice and Classifiability for RKHS Embeddings of Probability Distributions

Sriperumbudur, BK; Fukumizu, K; Gretton, A; Lanckriet, GRG; Schölkopf, B; Bengio, Y.; Schuurmans, D.; Lafferty, J.; Williams, C.; Culotta, A.

A new version of this item is available:
https://pure.mpg.de/pubman/item/item_1788861_2

Released

Conference Paper

MPS-Authors

Sriperumbudur, BK
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Department Empirical Inference, Max Planck Institute for Intelligent Systems, Max Planck Society;

Fukumizu, K
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Gretton, A
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Schölkopf, B
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

Sriperumbudur, B., Fukumizu, K., Gretton, A., Lanckriet, G., & Schölkopf, B. (2010). Kernel Choice and Classifiability for RKHS Embeddings of Probability Distributions. Advances in Neural Information Processing Systems 22: 23rd Annual Conference on Neural Information Processing Systems 2009, 1750-1758.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-C0CA-3

Abstract

Embeddings of probability measures into reproducing kernel Hilbert spaces have been proposed as a straightforward and practical means of representing and comparing probabilities. In particular, the distance between embeddings (the maximum mean discrepancy, or MMD) has several key advantages over many classical metrics on distributions, namely easy computability, fast convergence and low bias of finite sample estimates. An important requirement of the embedding RKHS is that it be characteristic: in this case, the MMD between two distributions is zero if and only if the distributions coincide. Three new results on the MMD are introduced in the present study. First, it is established that MMD corresponds to the optimal risk of a kernel classifier, thus forming a natural link between the distance between distributions and their ease of classification. An important consequence is that a kernel must be characteristic to guarantee classifiability between distributions in the RKHS. Second, the class of characteristic kernels is broadened to incorporate all strictly positive definite kernels: these include non-translation invariant kernels and kernels on non-compact domains. Third, a generalization of the MMD is proposed for families of kernels, as the supremum over MMDs on a class of kernels (for instance the Gaussian kernels with different bandwidths). This extension is necessary to obtain a single distance measure if a large selection or class of characteristic kernels is potentially appropriate. This generalization is reasonable, given that it corresponds to the problem of learning the kernel by minimizing the risk of the corresponding kernel classifier. The generalized MMD is shown to have consistent finite sample estimates, and its performance is demonstrated on a homogeneity testing example.
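
As a rough illustration of the quantities named in the abstract, the sketch below computes a biased plug-in estimate of the MMD with a Gaussian kernel, then approximates the generalized MMD by taking the maximum over a finite grid of bandwidths. It is a minimal sketch under stated assumptions, not the estimator analysed in the paper: the function names, the bandwidth grid, and the use of a finite grid in place of a supremum over the whole kernel family are all illustrative choices.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth):
    """Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth^2)) on tuples of floats."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mean_kernel(xs, ys, bandwidth):
    """Average kernel value over all pairs (x, y) with x in xs and y in ys."""
    total = sum(gaussian_kernel(x, y, bandwidth) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd(xs, ys, bandwidth):
    """Biased plug-in estimate of the MMD between two samples."""
    squared = (
        mean_kernel(xs, xs, bandwidth)
        + mean_kernel(ys, ys, bandwidth)
        - 2.0 * mean_kernel(xs, ys, bandwidth)
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return math.sqrt(max(squared, 0.0))


def generalized_mmd(xs, ys, bandwidths):
    """Largest single-kernel MMD over a finite grid of bandwidths.

    The grid stands in for the supremum over a kernel family; it is an
    illustrative approximation, not the estimator analysed in the paper.
    """
    return max(mmd(xs, ys, b) for b in bandwidths)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    sample_p = [(rng.gauss(0.0, 1.0),) for _ in range(100)]
    sample_q = [(rng.gauss(1.0, 1.0),) for _ in range(100)]
    grid = [0.25, 0.5, 1.0, 2.0, 4.0]  # hypothetical bandwidth grid
    print("generalized MMD estimate:", generalized_mmd(sample_p, sample_q, grid))
```

Taking the maximum over bandwidths avoids committing to a single kernel in advance, which is the motivation the abstract gives for the generalization. A finer grid, or a numerical optimizer over the bandwidth, would track the supremum more closely at a higher computational cost.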