Journal Article

Validation and functional annotation of expression-based clusters based on gene ontology


Selbig, J.
BioinformaticsCRG, Cooperative Research Groups, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;


Steuer, R., Humburg, P., & Selbig, J. (2006). Validation and functional annotation of expression-based clusters based on gene ontology. BMC Bioinformatics, 7, 380. doi:10.1186/1471-2105-7-380.

Cite as: http://hdl.handle.net/11858/00-001M-0000-0014-29B8-7
Background: The biological interpretation of large-scale gene expression data is one of the paramount challenges in current bioinformatics. In particular, placing the results in the context of other available functional genomics data, such as existing bio-ontologies, has already provided substantial improvement for detecting and categorizing genes of interest. One common approach is to look for functional annotations that are significantly enriched within a group or cluster of genes, as compared to a reference group.

Results: In this work, we suggest the information-theoretic concept of mutual information to investigate the relationship between groups of genes, as given by data-driven clustering, and their respective functional categories. Drawing upon related approaches (Gibbons and Roth, Genome Research 12: 1574-1581, 2002), we seek to quantify to what extent individual attributes are sufficient to characterize a given group or cluster of genes.

Conclusion: We show that the mutual information provides a systematic framework to assess the relationship between groups or clusters of genes and their functional annotations in a quantitative way. Within this framework, the mutual information allows us to address and incorporate several important issues, such as the interdependence of functional annotations and combinations of attributes. It thus supplements and extends the conventional search for overrepresented attributes within a group or cluster of genes. In particular, by taking combinations of attributes into account, the mutual information opens the way to uncover specific functional descriptions of a group of genes or a clustering result. All datasets and functional annotations used in this study are publicly available. All scripts used in the analysis are provided as additional files.
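The following is a minimal sketch of the general idea, not the authors' scripts: it computes the mutual information between a cluster assignment and binary functional attributes using simple plug-in frequency estimates. The function name `mutual_information` and the toy data (`clusters`, `attr_a`, `attr_b`) are hypothetical and chosen only for illustration; the estimator and any bias or significance handling used in the paper may differ.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired observations."""
    n = len(xs)
    if n == 0 or n != len(ys):
        raise ValueError("xs and ys must be non-empty and of equal length")
    count_x = Counter(xs)            # marginal counts of X
    count_y = Counter(ys)            # marginal counts of Y
    count_xy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy data: four genes, two clusters, two binary attributes.
clusters = [0, 0, 1, 1]
attr_a = [0, 1, 0, 1]
attr_b = [0, 1, 1, 0]

# Each attribute alone carries no information about the cluster ...
mi_a = mutual_information(clusters, attr_a)
mi_b = mutual_information(clusters, attr_b)

# ... but the combination of both attributes determines it completely.
mi_ab = mutual_information(clusters, list(zip(attr_a, attr_b)))

print(mi_a, mi_b, mi_ab)
```

In this contrived example neither attribute is informative on its own, while the pair of attributes identifies the cluster exactly (1 bit, the entropy of the cluster assignment). This is the kind of case in which considering combinations of attributes reveals a description that a search over single attributes would miss.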