# Item


Released

Journal Article

#### Estimating the support of a high-dimensional distribution.

##### MPS-Authors

There are no MPG authors available for this publication.

##### External Resource

https://www.mitpressjournals.org/doi/pdf/10.1162/089976601750264965 (publisher version)


##### Fulltext (public)

There are no public fulltexts stored in PuRe

##### Supplementary Material (public)

There is no public supplementary material available

##### Citation

Schölkopf, B., Platt, J., Shawe-Taylor, J., Smola, A., & Williamson, R. (2001).
Estimating the support of a high-dimensional distribution. *Neural Computation*, *13*(7),
1443-1471. doi:10.1162/089976601750264965

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-E2C6-C

##### Abstract

Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a simple subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1.

We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm.

The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.

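The approach described in the abstract can be sketched in a few dozen lines: solve the one-class SVM dual (minimize ½ αᵀKα subject to 0 ≤ αᵢ ≤ 1/(νl) and Σαᵢ = 1) by sequential optimization over pairs of input patterns, then read off the kernel-expansion decision function f. This is a minimal illustrative sketch, not the authors' implementation; the kernel choice, the parameter names (`nu`, `gamma`, `sweeps`), and the fixed-sweep pair-selection schedule are all assumptions made here for brevity.

```python
import math

def rbf(x, y, gamma=1.0):
    """Gaussian RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def one_class_svm(X, nu=0.5, gamma=1.0, sweeps=200):
    """Fit the one-class SVM dual by sequential optimization over pairs.

    Dual problem: minimize (1/2) * alpha^T K alpha
    subject to 0 <= alpha_i <= 1/(nu*l) and sum(alpha) = 1.
    Returns the decision function f(x) = sum_j alpha_j k(x_j, x) - rho,
    which is positive on the estimated support region.
    """
    l = len(X)
    C = 1.0 / (nu * l)
    K = [[rbf(X[i], X[j], gamma) for j in range(l)] for i in range(l)]
    alpha = [1.0 / l] * l  # feasible starting point (sums to 1, within the box)
    for _ in range(sweeps):
        for i in range(l):
            for j in range(i + 1, l):
                s = alpha[i] + alpha[j]  # kept fixed by the pair update
                denom = K[i][i] + K[j][j] - 2 * K[i][j]
                if denom <= 1e-12:
                    continue
                # gradient contributions from all points other than i and j
                g_i = sum(alpha[k] * K[i][k] for k in range(l) if k not in (i, j))
                g_j = sum(alpha[k] * K[j][k] for k in range(l) if k not in (i, j))
                # unconstrained minimizer of the pair subproblem, then box-clipped
                a_i = (s * (K[j][j] - K[i][j]) - (g_i - g_j)) / denom
                a_i = max(max(0.0, s - C), min(a_i, min(C, s)))
                alpha[i], alpha[j] = a_i, s - a_i
    # recover the offset rho from unbounded support vectors (0 < alpha_i < C),
    # where the decision function is exactly zero at the optimum
    margin = [i for i in range(l) if 1e-8 < alpha[i] < C - 1e-8]
    idx = margin or list(range(l))
    rho = sum(sum(alpha[j] * K[i][j] for j in range(l)) for i in idx) / len(idx)

    def decision(x):
        return sum(alpha[j] * rbf(X[j], x, gamma) for j in range(l)) - rho

    return decision
```

Usage: calling `one_class_svm(points, nu=0.25)` on a cluster of 2-D points returns a function that is positive near the cluster and negative on distant outliers; roughly a fraction ν of the training points are allowed to fall outside the estimated region.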