  Hilbert Space Representations of Probability Distributions

Gretton, A., Borgwardt, K., Fukumizu, K., Rasch, M., Schölkopf, B., Smola, A., et al. (2007). Hilbert Space Representations of Probability Distributions. Talk presented at 2nd Workshop on Machine Learning and Optimization at the ISM. Tokyo, Japan. 2007-10-12.

Files

ISM-2007-Gretton.pdf (Any fulltext), 4MB
Name:
ISM-2007-Gretton.pdf
Description:
-
OA-Status:
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-
License:
-

Locators

Description:
-
OA-Status:

Creators

 Creators:
Gretton, A (1, 2), Author
Borgwardt, K (1, 2), Author
Fukumizu, K (1, 2), Author
Rasch, M (2, 3), Author
Schölkopf, B (1, 2), Author
Smola, A (1, 2), Author
Song, L, Author
Teo, CH, Author
Affiliations:
(1) Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795
(2) Max Planck Institute for Biological Cybernetics, Max Planck Society, Spemannstrasse 38, 72076 Tübingen, DE, ou_1497794
(3) Department Physiology of Cognitive Processes, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497798

Content

Free keywords: -
 Abstract: Many problems in unsupervised learning require the analysis of features of probability distributions. At the most fundamental level, we might wish to determine whether two distributions are the same, based on samples from each; this is known as the two-sample or homogeneity problem. We use kernel methods to address this problem, by mapping probability distributions to elements in a reproducing kernel Hilbert space (RKHS). Given a sufficiently rich RKHS, these representations are unique: thus comparing feature space representations allows us to compare distributions without ambiguity. Applications include testing whether cancer subtypes are distinguishable on the basis of DNA microarray data, and whether low-frequency oscillations measured at an electrode in the cortex have a different distribution during a neural spike.
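The two-sample comparison described above can be illustrated with a minimal sketch. It estimates the squared RKHS distance between the mean embeddings of two samples (a quantity commonly called the maximum mean discrepancy) and calibrates it with a permutation test. The Gaussian kernel, the fixed bandwidth, the permutation-based calibration, and the function names gaussian_kernel, mmd2_biased and permutation_p_value are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian (RBF) kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased estimate of the squared distance between RKHS mean embeddings."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


def permutation_p_value(x, y, n_permutations=200, bandwidth=1.0, seed=0):
    """Permutation p-value for the null hypothesis that x and y share a distribution.

    Assumption: calibration by permutation is one common choice, not
    necessarily the one used in the talk.
    """
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = x.shape[0]
    count = 0
    for _ in range(n_permutations):
        perm = rng.permutation(pooled.shape[0])
        stat = mmd2_biased(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
        if stat >= observed:
            count += 1
    return (count + 1) / (n_permutations + 1)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    sample_p = rng.normal(0.0, 1.0, size=(100, 2))
    sample_q = rng.normal(1.0, 1.0, size=(100, 2))  # shifted mean
    print("MMD^2 estimate:", mmd2_biased(sample_p, sample_q))
    print("p-value:", permutation_p_value(sample_p, sample_q))
```

A small p-value suggests the two samples come from different distributions. In practice the bandwidth matters; a fixed value of 1.0 is used here only to keep the sketch short.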
A more difficult problem is to discover whether two random variables drawn from a joint distribution are independent. It turns out that any dependence between pairs of random variables can be encoded in a cross-covariance operator between appropriate RKHS representations of the variables, and we may test independence by looking at a norm of the operator. We demonstrate this independence test by establishing dependence between an English text and its French translation, as opposed to French text on the same topic but otherwise unrelated. Finally, we show that this operator norm is itself a difference in feature means.
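As a rough illustration of the independence test described above, the following sketch computes a biased empirical estimate of the squared Hilbert-Schmidt norm of the cross-covariance operator from paired samples. The choice of the Hilbert-Schmidt norm, the Gaussian kernels, the 1/n^2 normalisation (conventions vary), and the names gaussian_gram and hsic_biased are assumptions for illustration; the abstract only refers to "a norm of the operator".

```python
import numpy as np


def gaussian_gram(v, bandwidth=1.0):
    """Gaussian (RBF) Gram matrix for the rows of v."""
    sq_norms = np.sum(v**2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * v @ v.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def hsic_biased(x, y, bandwidth_x=1.0, bandwidth_y=1.0):
    """Biased estimate of the squared Hilbert-Schmidt norm of the
    empirical cross-covariance operator between paired samples x and y.

    x and y must have the same number of rows; row i of x is paired
    with row i of y.
    """
    n = x.shape[0]
    k = gaussian_gram(x, bandwidth_x)
    l = gaussian_gram(y, bandwidth_y)
    h = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return np.trace(k @ h @ l @ h) / n**2


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 1))
    y_dependent = x + 0.1 * rng.normal(size=(200, 1))
    y_independent = rng.normal(size=(200, 1))
    print("dependent pair:  ", hsic_biased(x, y_dependent))
    print("independent pair:", hsic_biased(x, y_independent))
```

The statistic should be clearly larger for the dependent pair. Turning it into a test requires a null distribution, for example by recomputing the statistic after shuffling the rows of y, which is again an assumption rather than something stated in the abstract.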

Details

Language(s):
 Dates: 2007-10
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: BibTex Citekey: 4792
 Degree: -

Event

Title: 2nd Workshop on Machine Learning and Optimization at the ISM
Place of Event: Tokyo, Japan
Start-/End Date: 2007-10-12
Invited: Yes
