Help Privacy Policy Disclaimer
  Advanced SearchBrowse




Conference Paper

Clustering Graphs by Weighted Substructure Mining


Tsuda,  K
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

External Resource
Fulltext (restricted access)
There are currently no full texts shared for your IP range.
Fulltext (public)
There are no public fulltexts stored in PuRe
Supplementary Material (public)
There is no public supplementary material available

Tsuda, K., & Kudo, T. (2006). Clustering Graphs by Weighted Substructure Mining. In W. Cohen, & A. Moore (Eds.), ICML '06: Proceedings of the 23rd International Conference on Machine Learning (pp. 953-960). New York, NY, USA: ACM Press.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-D133-E
Graph data is getting increasingly popular in, e.g., bioinformatics and text processing. A main difficulty of graph data processing lies in the intrinsic high dimensionality of graphs, namely, when a graph is represented as a binary feature vector of indicators of all possible subgraphs, the dimensionality gets too large for usual statistical methods. We propose an efficient method for learning a binomial mixture model in this feature space. Combining the l1 regularizer and the data structure called DFS code tree, the MAP estimate of non-zero parameters are computed efficiently by means of the EM algorithm. Our method is applied to the clustering of RNA graphs, and is compared favorably with graph kernels and the spectral graph distance.