Manifold Denoising as Preprocessing for Finding Natural Representations of Data

Hein, M; Maier, M

Please note that a newer version of this item is available:
https://pure.mpg.de/pubman/item/item_1790359_2

Released

Conference Paper

MPS-Authors

Hein, M
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society

Maier, M
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society

Citation

Hein, M., & Maier, M. (2007). Manifold Denoising as Preprocessing for Finding Natural Representations of Data. Proceedings of the Twenty-Second AAAI Conference on Artificial Intelligence (AAAI-07), 1646-1649.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-CCBF-8

Abstract

A natural representation of data is given by the parameters that generated the data. If the parameter space is continuous, we can regard it as a manifold. In practice, we usually do not know this manifold; we only have some representation of the data, often in a very high-dimensional feature space. Since the number of internal parameters does not change with the representation, the data will effectively lie on a low-dimensional submanifold in feature space. Due to measurement errors, this data is usually corrupted by noise, which, particularly in high-dimensional feature spaces, makes it almost impossible to find the manifold structure. This paper reviews a method called Manifold Denoising, which projects the data onto the submanifold using a diffusion process on a graph generated by the data. We demonstrate that the method is capable of dealing with non-trivial high-dimensional noise. Moreover, we show that using the method as a preprocessing step can significantly improve the results of a semi-supervised learning algorithm.
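
The abstract describes the method only at a high level (a diffusion process on a graph generated by the data). The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a symmetric k-nearest-neighbour graph with Gaussian edge weights, the random-walk graph Laplacian, and an implicit Euler step for the diffusion; the function names and the parameters k, dt and n_iter are hypothetical choices made for illustration.

```python
import numpy as np


def knn_gaussian_weights(X, k):
    """Symmetric k-NN graph with Gaussian weights (an illustrative choice)."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances, clipped at zero for stability.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    np.fill_diagonal(d2, np.inf)  # a point is never its own neighbour
    idx = np.argsort(d2, axis=1)[:, :k]  # indices of the k nearest neighbours
    # Local bandwidth: squared distance to the k-th neighbour of each point.
    h2 = d2[np.arange(n), idx[:, -1]] + 1e-12
    rows = np.repeat(np.arange(n), k)
    cols = idx.ravel()
    W = np.zeros((n, n))
    W[rows, cols] = np.exp(-d2[rows, cols] / np.maximum(h2[rows], h2[cols]))
    return np.maximum(W, W.T)  # keep an edge if either endpoint selected it


def manifold_denoise_sketch(X, k=10, dt=0.5, n_iter=5):
    """Graph-diffusion denoising sketch; parameter defaults are assumptions."""
    X = np.asarray(X, dtype=float).copy()
    n = X.shape[0]
    for _ in range(n_iter):
        W = knn_gaussian_weights(X, k)  # rebuild the graph on the current points
        deg = W.sum(axis=1)
        L = np.eye(n) - W / deg[:, None]  # random-walk Laplacian I - D^{-1} W
        # Implicit Euler step of the diffusion: (I + dt * L) X_new = X_old.
        X = np.linalg.solve(np.eye(n) + dt * L, X)
    return X
```

In this sketch every denoised point is a weighted average of the input points, so noise orthogonal to the data is damped, but the point cloud also contracts slowly. For that reason only a small number of iterations is sensible, and the result could then be passed to a downstream learner such as a semi-supervised classifier.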