Abstract:
We present a new method for spectral clustering with
paired data based on kernel canonical correlation analysis,
called correlational spectral clustering. Paired data are
common in real-world data sources, such as images with
text captions. Traditional spectral clustering algorithms either
assume that data can be represented by a single similarity
measure, or by co-occurrence matrices that are then
used in biclustering. In contrast, the proposed method uses
separate similarity measures for each data representation,
and allows for projection of previously unseen data that are
only observed in one representation (e.g. images but not
text). We show that this algorithm generalizes traditional
spectral clustering algorithms and demonstrate consistent empirical
improvement over spectral clustering on a variety of
datasets of images with associated text.
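
To make the idea concrete, the following is a minimal sketch of clustering paired data through a regularized kernel CCA projection, followed by projection of a new sample seen in only one view. It is not the authors' exact formulation: the RBF kernels, the regularization value `reg`, the plain k-means step, the toy two-view data, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    # Gaussian similarity between rows of A and rows of B (assumed kernel choice).
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def centering_matrix(n):
    return np.eye(n) - np.full((n, n), 1.0 / n)

def kcca_directions(Kx, Ky, dim, reg):
    # Regularized kernel CCA for the first view, posed as the eigenproblem
    # (Kx + reg I)^-1 Ky (Ky + reg I)^-1 Kx alpha = rho^2 alpha.
    n = Kx.shape[0]
    I = np.eye(n)
    M = np.linalg.solve(Kx + reg * I, Ky) @ np.linalg.solve(Ky + reg * I, Kx)
    vals, vecs = np.linalg.eig(M)
    top = np.argsort(-vals.real)[:dim]
    return vecs[:, top].real

def kmeans(Z, k, iters=50, seed=0):
    # Plain Lloyd iterations with random initial centers.
    rng = np.random.default_rng(seed)
    centers = Z[rng.choice(len(Z), size=k, replace=False)].copy()
    for _ in range(iters):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = Z[labels == j].mean(0)
    return labels, centers

# Toy paired data: two groups, each sample observed in two views.
rng = np.random.default_rng(0)
n_per = 20
truth = np.repeat([0, 1], n_per)
X = rng.normal(scale=0.3, size=(2 * n_per, 2)) + 4.0 * truth[:, None]  # stands in for images
Y = rng.normal(scale=0.3, size=(2 * n_per, 3)) - 4.0 * truth[:, None]  # stands in for text

n = X.shape[0]
H = centering_matrix(n)
Kx_raw = rbf_kernel(X, X)
Kx = H @ Kx_raw @ H                     # centered kernel, view 1
Ky = H @ rbf_kernel(Y, Y) @ H           # centered kernel, view 2

alpha = kcca_directions(Kx, Ky, dim=1, reg=1.0)
Z = Kx @ alpha                          # embedding of the training pairs
labels, centers = kmeans(Z, k=2)

# A previously unseen sample observed in the first view only.
x_new = np.array([[4.1, 3.9]])
Kt = rbf_kernel(x_new, X)
Kt_c = (Kt - Kx_raw.mean(axis=0)) @ H   # center against the training kernel
z_new = Kt_c @ alpha
new_label = int(((z_new - centers) ** 2).sum(-1).argmin())
```

Only the first-view kernel is needed to embed `x_new`, which mirrors the abstract's point about data observed in a single representation. The second view influences the result solely through the learned coefficients `alpha`.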