Consistent Minimization of Clustering Objective Functions

von Luxburg, U; Bubeck, S; Jegelka, S; Kaufmann, M; Platt, J. C.; Koller, D.; Singer, Y.; Roweis, S.

Please note that a newer version of this record is available:
https://pure.mpg.de/pubman/item/item_1789660_2

Released

Conference Paper

MPG Authors

von Luxburg, U
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Bubeck, S
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Jegelka, S
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

von Luxburg, U., Bubeck, S., Jegelka, S., & Kaufmann, M. (2008). Consistent Minimization of Clustering Objective Functions. Advances in Neural Information Processing Systems 20: 21st Annual Conference on Neural Information Processing Systems 2007, 961-968.

Citation link: https://hdl.handle.net/11858/00-001M-0000-0013-C735-4

Abstract

Clustering is often formulated as a discrete optimization problem. The objective is to find, among all partitions of the data set, the best one according to some quality measure. However, in the statistical setting where we assume that the finite data set has been sampled from some underlying space, the goal is not to find the best partition of the given sample, but to approximate the true partition of the underlying space. We argue that the discrete optimization approach usually does not achieve this goal. As an alternative, we suggest the paradigm of "nearest neighbor clustering". Instead of selecting the best out of all partitions of the sample, it only considers partitions in some restricted function class. Using tools from statistical learning theory we prove that nearest neighbor clustering is statistically consistent. Moreover, its worst case complexity is polynomial by construction, and it can be implemented with small average case complexity using branch and bound.
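
The idea of searching only a restricted class of partitions can be illustrated with a small sketch. The abstract does not spell out the function class, so the code below assumes one plausible reading: pick m seed points, let every sample point inherit the label of its nearest seed, and exhaustively search the k^m seed labelings for the one that minimizes a chosen quality measure. The function names, the optional `seed_idx` parameter, and the within-cluster sum of squares objective are all illustrative assumptions, not taken from the paper, and the brute-force loop stands in for the branch and bound search the abstract mentions.

```python
import itertools
import numpy as np


def within_cluster_sum_of_squares(X, labels, k):
    """Illustrative quality measure: squared distance to each cluster mean."""
    total = 0.0
    for c in range(k):
        pts = X[labels == c]
        if len(pts) > 0:
            total += float(((pts - pts.mean(axis=0)) ** 2).sum())
    return total


def nearest_neighbor_clustering(X, k, m, objective, seed_idx=None, rng_seed=0):
    """Brute-force sketch: search only partitions constant on seed cells.

    Assumption (not stated in the abstract): the restricted class consists
    of labelings in which each point copies the label of its nearest seed.
    """
    n = X.shape[0]
    if seed_idx is None:
        rng = np.random.default_rng(rng_seed)
        seed_idx = rng.choice(n, size=m, replace=False)
    seed_idx = np.asarray(seed_idx)

    # Distance from every point to every seed, shape (n, m).
    dists = np.linalg.norm(X[:, None, :] - X[seed_idx][None, :, :], axis=2)
    cell = dists.argmin(axis=1)  # index of the nearest seed for each point

    best_labels, best_value = None, np.inf
    # Enumerate all k**m labelings of the seeds instead of all k**n partitions.
    for seed_labels in itertools.product(range(k), repeat=len(seed_idx)):
        if len(set(seed_labels)) < k:
            continue  # skip labelings that leave a cluster empty
        labels = np.asarray(seed_labels)[cell]
        value = objective(X, labels, k)
        if value < best_value:
            best_labels, best_value = labels, value
    return best_labels, best_value


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    blob_a = rng.normal(0.0, 0.5, size=(10, 2))
    blob_b = rng.normal(10.0, 0.5, size=(10, 2))
    X = np.vstack([blob_a, blob_b])
    labels, value = nearest_neighbor_clustering(
        X, k=2, m=4, objective=within_cluster_sum_of_squares,
        seed_idx=[0, 1, 10, 11],
    )
    print(labels, value)
```

Under this reading the search space has k^m candidates rather than k^n, so if m grows like log n the number of candidates, k^(log n) = n^(log k), stays polynomial in n.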