Abstract:
We applied the PAC-Bayesian framework to derive generalization bounds for co-clustering. The analysis yielded regularization terms that were absent from earlier formulations of this task. The bounds suggested that co-clustering should optimize a trade-off between its empirical performance and the mutual information that the cluster variables preserve on the row and column indices. Proper regularization enabled us to achieve state-of-the-art results in predicting missing ratings on the MovieLens collaborative filtering dataset.
In addition, a PAC-Bayesian bound for discrete density estimation was derived. We showed that the PAC-Bayesian bound for classification is a special case of the PAC-Bayesian bound for discrete density estimation. We further introduced combinatorial priors to PAC-Bayesian analysis. Combinatorial priors are better suited to discrete domains, whereas Gaussian priors are suited to continuous domains. We showed that combinatorial priors lead to regularization terms in the form of mutual information.
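
The trade-off described above can be illustrated with a minimal sketch. It is not the bound derived in this work: the function names, the uniform distribution over indices, the linear form of the penalty, and the weight `beta` are all assumptions made for illustration. The sketch computes the mutual information I(X; C) between an index X and its cluster variable C for a soft assignment q(c|x), then adds it to an empirical loss as a regularizer.

```python
import numpy as np


def mutual_information(q_c_given_x, p_x=None):
    """Return I(X; C) in nats for a soft assignment q(c|x).

    Rows of q_c_given_x index x, columns index c, and each row sums to 1.
    p_x defaults to the uniform distribution over rows (an assumption).
    """
    q = np.asarray(q_c_given_x, dtype=float)
    n = q.shape[0]
    p = np.full(n, 1.0 / n) if p_x is None else np.asarray(p_x, dtype=float)
    joint = p[:, None] * q                 # p(x) q(c|x)
    q_c = joint.sum(axis=0)                # marginal q(c)
    indep = p[:, None] * q_c[None, :]      # p(x) q(c)
    mask = joint > 0                       # treat 0 * log 0 as 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / indep[mask])))


def regularized_objective(empirical_loss, q_rows, q_cols, beta):
    """Hypothetical objective: empirical loss plus a weighted information penalty.

    beta is an illustrative trade-off weight, not a quantity from the bound.
    """
    penalty = mutual_information(q_rows) + mutual_information(q_cols)
    return empirical_loss + beta * penalty


# Hard assignment of 4 rows to 2 equally sized clusters: I(X; C) = ln 2.
hard = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
# Assignment that ignores the row index entirely: I(X; C) = 0.
flat = np.full((4, 2), 0.5)

mi_hard = mutual_information(hard)
mi_flat = mutual_information(flat)
objective = regularized_objective(0.25, hard, flat, beta=0.5)
```

In this sketch a clustering that retains more information about the individual row and column indices pays a larger penalty, so a lower empirical loss has to justify the extra complexity.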