Free keywords:
-
Abstract:
Principal Component Analysis (PCA) is a widely used tool for, e.g., exploratory
data analysis, dimensionality reduction and clustering. However, it is well
known that PCA is strongly affected by the presence of outliers and, thus, is
vulnerable to both gross measurement error and adversarial manipulation of the
data. This phenomenon motivates the development of robust PCA as the problem of
recovering the principal components of the uncontaminated data.
In this thesis, we propose two new algorithms, QRPCA and MDRPCA, for computing
robust principal components, based on the projection-pursuit approach of Huber. While the
resulting optimization problems are non-convex and non-smooth, we show that
they can be efficiently minimized via the RatioDCA using bundle
methods/accelerated proximal methods for the interior problem. The key
ingredient for the most promising algorithm (QRPCA) is a robust,
location-invariant scale measure with breakdown point 0.5. Extensive experiments show
that our QRPCA is competitive with current state-of-the-art methods and
outperforms other methods, in particular when the data contain a large number of outliers.
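
The projection-pursuit idea can be illustrated with a minimal sketch: look for the direction along which a robust scale of the projected data is largest. The sketch assumes the median absolute deviation (MAD) as the scale, a standard example of a location-invariant scale measure with breakdown point 0.5; it is not necessarily the measure used in QRPCA. The function names `mad_scale` and `first_robust_direction` are hypothetical, and the crude search over candidate directions stands in for the RatioDCA-based optimization, which is not reproduced here.

```python
import numpy as np


def mad_scale(z):
    """Median absolute deviation of a 1-D sample.

    Subtracting the median makes the measure location invariant.
    (Assumed stand-in for the thesis's scale measure.)
    """
    z = np.asarray(z, dtype=float)
    return np.median(np.abs(z - np.median(z)))


def first_robust_direction(X):
    """Return the unit vector, among a finite candidate set, that
    maximizes the MAD of the projected data.

    Candidates are the directions from the coordinate-wise median to
    each data point; this is a simple heuristic, not the RatioDCA.
    """
    X = np.asarray(X, dtype=float)
    centered = X - np.median(X, axis=0)
    norms = np.linalg.norm(centered, axis=1)
    keep = norms > 0  # drop points that coincide with the center
    candidates = centered[keep] / norms[keep, None]
    scales = [mad_scale(X @ w) for w in candidates]
    return candidates[int(np.argmax(scales))]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Inliers spread mainly along the first axis.
    inliers = rng.normal(size=(90, 2)) * np.array([5.0, 0.1])
    # A tight cluster of outliers far away along the second axis.
    outliers = np.array([0.0, 50.0]) + 0.1 * rng.normal(size=(10, 2))
    w = first_robust_direction(np.vstack([inliers, outliers]))
    print("robust first direction:", w)
```

In this toy setup the outlier cluster would pull the leading direction of classical PCA toward the second axis, whereas a direction chosen by a robust scale is expected to stay close to the axis along which the inliers vary.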