Keywords:
-
Abstract:
Consistency is a key property of statistical algorithms when the data is drawn from some underlying
probability distribution. Surprisingly, despite decades of work, little is known about consistency of most clustering
algorithms. In this paper we investigate consistency of a popular family of spectral clustering algorithms, which
cluster the data with the help of eigenvectors of graph Laplacian matrices. We show that one of the two major
classes of spectral clustering (normalized clustering) converges under some very general conditions, while the other
(unnormalized) is only consistent under strong additional assumptions, which, as we demonstrate, are not always
satisfied in real data. We conclude that our analysis provides strong evidence for the superiority of normalized
spectral clustering in practical applications. We believe that methods used in our analysis will provide a basis for
future exploration of Laplacian-based methods in a statistical setting.
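
For readers unfamiliar with the two classes of algorithms compared above, the following is a minimal sketch of how the unnormalized and normalized graph Laplacians might be built from a sample and used for a two-way split. It is an illustration under stated assumptions, not the procedure analyzed in the paper: the Gaussian similarity function, the bandwidth `sigma`, the helper names `laplacians` and `two_way_partition`, the sign-thresholding of the second eigenvector, and the synthetic two-blob data are all choices made here for the example.

```python
import numpy as np


def laplacians(X, sigma=1.0):
    """Return (L, L_norm) for a Gaussian similarity graph on the rows of X.

    Assumption: a fully connected graph with Gaussian similarities; any
    symmetric, non-negative similarity function could be used instead.
    """
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))  # similarity matrix
    d = W.sum(axis=1)                           # vertex degrees
    L = np.diag(d) - W                          # unnormalized: D - W
    d_inv_sqrt = 1.0 / np.sqrt(d)
    # normalized: I - D^{-1/2} W D^{-1/2}
    L_norm = np.eye(len(X)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    return L, L_norm


def two_way_partition(laplacian):
    """Split points by the sign of the eigenvector belonging to the
    second-smallest eigenvalue (a common simplification, assumed here)."""
    _, vecs = np.linalg.eigh(laplacian)  # eigenvalues in ascending order
    return (vecs[:, 1] > 0).astype(int)


# Synthetic data (hypothetical): two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([
    rng.normal(0.0, 0.3, (20, 2)),
    rng.normal(3.0, 0.3, (20, 2)),
])

L, L_norm = laplacians(X)
labels_unnorm = two_way_partition(L)
labels_norm = two_way_partition(L_norm)
```

On data this cleanly separated both variants recover the same split, up to a swap of the two label values. The distinction drawn in the abstract concerns what happens to these eigenvectors as the sample size grows, which a single small sample cannot show.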