  Molecular diagnosis: classification, model selection, and performance evaluation

Markowetz, F., & Spang, R. (2005). Molecular diagnosis: classification, model selection, and performance evaluation. Methods of Information in Medicine, 44(3), 438-443.

Basic

Item Permalink: http://hdl.handle.net/11858/00-001M-0000-0010-872C-8
Version Permalink: http://hdl.handle.net/11858/00-001M-0000-0010-872D-6
Genre: Journal Article
Alternative Title: Methods Inf Med.

Creators

 Creators:
Markowetz, Florian (1), Author
Spang, Rainer (2), Author
Affiliations:
(1) Max Planck Society, ou_persistent13
(2) Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_1433547

Content

Free keywords: Microarrays; statistical classification; generalization error; model assessment; gene selection
 Abstract: OBJECTIVES: We discuss supervised classification techniques applied to medical diagnosis based on gene expression profiles. Our focus lies on strategies of adaptive model selection to avoid overfitting in high-dimensional spaces. METHODS: We introduce likelihood-based methods, classification trees, support vector machines and regularized binary regression. For regularization by dimension reduction, we describe feature selection methods: feature filtering, feature shrinkage and wrapper approaches. In small sample-size situations efficient methods of data re-use are needed to assess the predictive power of a model. We discuss two issues in using cross-validation: the difference between in-loop and out-of-loop feature selection, and estimating model parameters in nested-loop cross-validation. RESULTS: Gene selection does not reduce the dimensionality of the model. Tuning parameters enable adaptive model selection. The feature selection bias is a common pitfall in performance evaluation. Model selection and performance evaluation can be combined by nested-loop cross-validation. CONCLUSIONS: Classification of microarrays is prone to overfitting. A rigorous and unbiased assessment of the predictive power of the model is a must.
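
The abstract's two cross-validation issues (in-loop versus out-of-loop feature selection, and nested-loop cross-validation) can be illustrated with a small sketch. It is not the authors' code: the nearest-centroid classifier, the mean-difference filter, the fold layout, and every function name and parameter value below are assumptions chosen for illustration. The data are pure noise, so an unbiased accuracy estimate should stay near chance level.

```python
import random

def make_noise_data(n_samples=40, n_genes=1000, seed=0):
    """Simulated expression matrix with no class signal (illustrative only)."""
    rng = random.Random(seed)
    X = [[rng.gauss(0.0, 1.0) for _ in range(n_genes)] for _ in range(n_samples)]
    y = [i % 2 for i in range(n_samples)]  # balanced labels, unrelated to X
    return X, y

def centroids(X, y, rows, genes):
    """Class-wise mean profile over `rows`, restricted to `genes`."""
    out = {}
    for label in (0, 1):
        members = [i for i in rows if y[i] == label]
        out[label] = [sum(X[i][g] for i in members) / len(members) for g in genes]
    return out

def select_genes(X, y, rows, k):
    """Feature filtering: keep the k genes with the largest class-mean difference."""
    all_genes = range(len(X[0]))
    c = centroids(X, y, rows, all_genes)
    score = [abs(a - b) for a, b in zip(c[0], c[1])]
    return sorted(all_genes, key=lambda g: -score[g])[:k]

def predict(X, y, train, genes, i):
    """Nearest-centroid prediction for sample i, using training rows only."""
    c = centroids(X, y, train, genes)
    dist = {lab: sum((X[i][g] - m) ** 2 for g, m in zip(genes, c[lab]))
            for lab in (0, 1)}
    return min(dist, key=dist.get)

def cv_accuracy(X, y, rows, k, n_folds=5, in_loop=True):
    """Cross-validated accuracy with gene selection inside or outside the loop."""
    folds = [rows[f::n_folds] for f in range(n_folds)]
    # Out-of-loop selection uses all rows, so it has seen the test labels.
    fixed = None if in_loop else select_genes(X, y, rows, k)
    correct = 0
    for test in folds:
        train = [i for i in rows if i not in test]
        genes = select_genes(X, y, train, k) if in_loop else fixed
        correct += sum(predict(X, y, train, genes, i) == y[i] for i in test)
    return correct / len(rows)

def nested_cv_accuracy(X, y, rows, candidate_ks=(5, 20), n_folds=5):
    """Outer loop estimates performance; inner loop picks k on training rows only."""
    folds = [rows[f::n_folds] for f in range(n_folds)]
    correct = 0
    for test in folds:
        train = [i for i in rows if i not in test]
        best_k = max(candidate_ks,
                     key=lambda k: cv_accuracy(X, y, train, k, n_folds))
        genes = select_genes(X, y, train, best_k)
        correct += sum(predict(X, y, train, genes, i) == y[i] for i in test)
    return correct / len(rows)

if __name__ == "__main__":
    X, y = make_noise_data()
    rows = list(range(len(y)))
    print("out-of-loop selection:", cv_accuracy(X, y, rows, 10, in_loop=False))
    print("in-loop selection:    ", cv_accuracy(X, y, rows, 10, in_loop=True))
    print("nested-loop CV:       ", nested_cv_accuracy(X, y, rows))
```

In this sketch the out-of-loop variant selects genes using all samples, including those later held out for testing, so it would be expected to report an optimistic accuracy even though the labels carry no information. The in-loop and nested variants repeat the selection (and the choice of the tuning parameter k) on the training part of each fold only.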

Details

Language(s): eng - English
 Dates: 2005-01-01
 Publication Status: Published in print
 Identifiers: eDoc: 267501

Source 1

Title: Methods of Information in Medicine
Alternative Title: Methods Inf Med.
Source Genre: Journal
Volume / Issue: 44 (3)
Start / End Page: 438 - 443
Identifier: ISSN: 0026-1270