Clustering cancer gene expression data: a comparative study

de Souto, Marcilio CP; Costa, Ivan G; de Araujo, Daniel SA; Ludermir, Teresa B; Schliep, Alexander

doi:10.1186/1471-2105-9-497

de Souto, M. C., Costa, I. G., de Araujo, D. S., Ludermir, T. B., & Schliep, A. (2008). Clustering cancer gene expression data: a comparative study. BMC Bioinformatics, 9, 497-497. doi:10.1186/1471-2105-9-497.

Item is Released

Basic data

Item permalink: https://hdl.handle.net/11858/00-001M-0000-0010-7EA8-4
Version permalink: https://hdl.handle.net/11858/00-001M-0000-0010-7EA9-2

Genre: Journal article

Files

1471-2105-9-497.pdf (any fulltext), 2MB

File permalink:
https://hdl.handle.net/11858/00-001M-0000-0010-7EA7-6

Name:
1471-2105-9-497.pdf

Description:
-

OA status:

Visibility:
Public

MIME type / checksum:
application/pdf / [MD5]

Technical metadata:

Copyright date:
-

Copyright Info:
eDoc_access: PUBLIC

License:
-

External references

Creators

Creators:
de Souto, Marcilio CP¹, Author
Costa, Ivan G¹, Author
de Araujo, Daniel SA, Author
Ludermir, Teresa B, Author
Schliep, Alexander², Author

Affiliations:
¹ Max Planck Society, ou_persistent13
² Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_1433547

Content

Keywords: -

Abstract: Background: The use of clustering methods for the discovery of cancer subtypes has drawn a great deal of attention in the scientific community. While bioinformaticians have proposed new clustering methods that take advantage of characteristics of the gene expression data, the medical community has a preference for using "classic" clustering methods. There have been no studies thus far performing a large-scale evaluation of different clustering methods in this context. Results/Conclusion: We present the first large-scale analysis of seven different clustering methods and four proximity measures for the analysis of 35 cancer gene expression data sets. Our results reveal that the finite mixture of Gaussians, followed closely by k-means, exhibited the best performance in terms of recovering the true structure of the data sets. These methods also exhibited, on average, the smallest difference between the actual number of classes in the data sets and the best number of clusters as indicated by our validation criteria. Furthermore, hierarchical methods, which have been widely used by the medical community, exhibited a poorer recovery performance than that of the other methods evaluated. Moreover, as a stable basis for the assessment and comparison of different clustering methods for cancer gene expression data, this study provides a common group of data sets (benchmark data sets) to be shared among researchers and used for comparisons with new methods. The data sets analyzed in this study are available at http://algorithmics.molgen.mpg.de/Supplements/CompCancer/.

Details

Language(s): eng - English

Date: Published: 2008-11-27

Publication status: Published

Pages: -

Place, publisher, edition: -

Table of contents: -

Review type: -

Identifiers: eDoc: 405704
URI: http://www.biomedcentral.com/content/pdf/1471-2105-9-497.pdf
DOI: 10.1186/1471-2105-9-497

Degree type: -

Title: BMC Bioinformatics

Source genre: Journal

Creators:

Affiliations:

Place, publisher, edition: -

Pages: -
Volume / Issue: 9
Article number: -
Start / End page: 497 - 497
Identifier: ISSN: 1471-2105
