Conference Paper

PAC-Bayesian Bounds for Discrete Density Estimation and Co-clustering Analysis

MPS-Authors

Seldin, Y
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Citation

Seldin, Y., & Tishby, N. (2010). PAC-Bayesian Bounds for Discrete Density Estimation and Co-clustering Analysis. In Foundations and New Trends of PAC Bayesian Learning Workshop (pp. 1-2).


Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-C10E-1
Abstract
We applied the PAC-Bayesian framework to derive generalization bounds for co-clustering. The analysis yielded regularization terms that were absent from the preceding formulations of this task. The bounds suggested that co-clustering should optimize a trade-off between its empirical performance and the mutual information that the cluster variables preserve on row and column indices. Proper regularization enabled us to achieve state-of-the-art results in predicting the missing ratings in the MovieLens collaborative filtering dataset.
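As a rough illustration of the trade-off described above, the sketch below computes the mutual information between an index variable and its cluster variable for a soft assignment q(c|x), and adds it to an empirical loss. The function names, the uniform default distribution over indices, and the single trade-off weight `beta` are assumptions made for illustration. The abstract does not state the exact form or coefficients of the bound, so this is not the paper's objective.

```python
import numpy as np

def mutual_information(q_c_given_x, p_x=None):
    """I(X; C) in nats for a soft assignment q(c|x); one row per index x."""
    q = np.asarray(q_c_given_x, dtype=float)
    n = q.shape[0]
    # Assumption: indices are uniformly distributed unless p_x is given.
    p = np.full(n, 1.0 / n) if p_x is None else np.asarray(p_x, dtype=float)
    q_c = p @ q  # marginal distribution over clusters
    # Entries with q(c|x) = 0 contribute nothing to the sum.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(q > 0, np.log(q / q_c), 0.0)
    return float(np.sum(p[:, None] * q * log_ratio))

def regularized_objective(empirical_loss, q_rows, q_cols, beta):
    """Hypothetical trade-off: empirical loss plus weighted information terms."""
    penalty = mutual_information(q_rows) + mutual_information(q_cols)
    return empirical_loss + beta * penalty
```

A hard assignment that splits the indices evenly between two clusters preserves ln 2 nats about the index, while an assignment that ignores the index preserves none, so the penalty favours coarser clusterings unless the empirical loss improves enough to justify the extra information.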
In addition, a PAC-Bayesian bound for discrete density estimation was derived. We showed that the PAC-Bayesian bound for classification is a special case of the PAC-Bayesian bound for discrete density estimation. We further introduced combinatorial priors to PAC-Bayesian analysis. Combinatorial priors are more appropriate for discrete domains, whereas Gaussian priors are suited to continuous domains. It was shown that combinatorial priors lead to regularization terms in the form of mutual information.
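For readers unfamiliar with the classification case mentioned above, the sketch below shows how a commonly cited form of the PAC-Bayesian classification bound, kl(empirical loss || true loss) <= (KL(Q||P) + ln((n+1)/delta)) / n, can be turned into a numeric upper bound on the true loss by inverting the binary kl divergence with bisection. This form is an assumption taken from the general PAC-Bayesian literature. The abstract does not give the paper's density estimation bound or its constants, and all function names here are hypothetical.

```python
import math

def binary_kl(p, q):
    """kl(p || q) between Bernoulli(p) and Bernoulli(q), in nats."""
    eps = 1e-12
    q = min(max(q, eps), 1 - eps)  # keep the logarithms finite
    total = 0.0
    if p > 0:
        total += p * math.log(p / q)
    if p < 1:
        total += (1 - p) * math.log((1 - p) / (1 - q))
    return total

def kl_inverse_upper(emp_loss, budget, tol=1e-10):
    """Largest q in [emp_loss, 1] with kl(emp_loss || q) <= budget."""
    lo, hi = emp_loss, 1.0
    # kl(p || q) is increasing in q for q >= p, so bisection applies.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binary_kl(emp_loss, mid) <= budget:
            lo = mid
        else:
            hi = mid
    return lo

def pac_bayes_kl_budget(kl_posterior_prior, n, delta):
    """Right-hand side of the assumed bound: (KL(Q||P) + ln((n+1)/delta)) / n."""
    return (kl_posterior_prior + math.log((n + 1) / delta)) / n
```

A typical use would be `kl_inverse_upper(emp_loss, pac_bayes_kl_budget(kl_qp, n, delta))`, where `kl_qp` is the divergence between posterior and prior. With a combinatorial prior of the kind described in the abstract, that divergence term is where mutual-information regularizers would enter.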