# Item

Released

Talk

#### Hilbert Space Representations of Probability Distributions

##### External Resource

https://www.ism.ac.jp/~tmatsui/kinou2_p4/workshop_OCT07.html

(Table of contents)

##### Fulltext (public)

ISM-2007-Gretton.pdf

(Any fulltext), 4MB

##### Citation

Gretton, A., Borgwardt, K., Fukumizu, K., Rasch, M., Schölkopf, B., Smola, A., et al. (2007).
*Hilbert Space Representations of Probability Distributions*. Talk presented at 2nd Workshop
on Machine Learning and Optimization at the ISM. Tokyo, Japan. 2007-10-12.

Cite as: http://hdl.handle.net/11858/00-001M-0000-0013-CBA3-C

##### Abstract

Many problems in unsupervised learning require the analysis of features of probability distributions. At the most fundamental level, we might wish to determine whether two distributions are the same, based on samples from each; this is known as the two-sample or homogeneity problem. We use kernel methods to address this problem by mapping probability distributions to elements in a reproducing kernel Hilbert space (RKHS). Given a sufficiently rich RKHS, these representations are unique, so comparing feature space representations allows us to compare distributions without ambiguity. Applications include testing whether cancer subtypes are distinguishable on the basis of DNA microarray data, and whether low-frequency oscillations measured at an electrode in the cortex have a different distribution during a neural spike.

A more difficult problem is to discover whether two random variables drawn from a joint distribution are independent. It turns out that any dependence between pairs of random variables can be encoded in a cross-covariance operator between appropriate RKHS representations of the variables, and we may test independence by looking at a norm of the operator. We demonstrate this independence test by establishing dependence between an English text and its French translation, in contrast to French text that covers the same topic but is otherwise unrelated. Finally, we show that this operator norm is itself a difference in feature means.
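
The two quantities described in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the talk. The Gaussian kernel, the bandwidth `sigma`, the function names (`gaussian_kernel`, `mmd2_biased`, `hsic_biased`) and the synthetic data are all assumptions chosen for illustration. The first statistic is the squared RKHS distance between the empirical feature means of two samples. The second is a biased estimate of the squared Hilbert-Schmidt norm of the empirical cross-covariance operator. The sketch computes the statistics only. An actual test would also need a rejection threshold (for instance from a permutation procedure), which is not shown here.

```python
import numpy as np


def gaussian_kernel(a, b, sigma=1.0):
    """Gram matrix with entries exp(-||a_i - b_j||^2 / (2 * sigma^2))."""
    sq = (np.sum(a**2, axis=1)[:, None]
          + np.sum(b**2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))


def mmd2_biased(x, y, sigma=1.0):
    """Squared RKHS distance between the empirical mean embeddings of x and y."""
    return (gaussian_kernel(x, x, sigma).mean()
            + gaussian_kernel(y, y, sigma).mean()
            - 2.0 * gaussian_kernel(x, y, sigma).mean())


def hsic_biased(x, y, sigma=1.0):
    """Biased estimate trace(K H L H) / n^2 for paired samples (x_i, y_i)."""
    n = x.shape[0]
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    k = gaussian_kernel(x, x, sigma)
    l = gaussian_kernel(y, y, sigma)
    return np.trace(k @ h @ l @ h) / n**2


# Synthetic data (illustrative assumption, not from the talk).
rng = np.random.default_rng(0)

# Two-sample statistic: same distribution versus a mean-shifted one.
same = mmd2_biased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
shifted = mmd2_biased(rng.normal(size=(200, 2)),
                      rng.normal(loc=2.0, size=(200, 2)))

# Independence statistic: noisy copy versus an unrelated variable.
u = rng.normal(size=(200, 1))
dep = hsic_biased(u, u + 0.1 * rng.normal(size=(200, 1)))
indep = hsic_biased(u, rng.normal(size=(200, 1)))

print(f"MMD^2  same: {same:.4f}  shifted: {shifted:.4f}")
print(f"HSIC   dependent: {dep:.4f}  independent: {indep:.4f}")
```

In this sketch the shifted pair should give a clearly larger two-sample statistic than the matched pair, and the dependent pair a clearly larger independence statistic than the unrelated pair. The second statistic can also be read as the first one applied to the joint distribution and the product of its marginals, which is one way to interpret the abstract's closing remark.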