  mGene: Accurate SVM-based gene finding with an application to nematode genomes

Schweikert, G., Zeller, G., Behr, J., Dieterich, C., Ong, C., Philips, P., et al. (2009). mGene: Accurate SVM-based gene finding with an application to nematode genomes. Genome Research, 19(11), 2133-2143. doi:10.1101/gr.090597.108.

Creators

Creators (numbers in parentheses refer to the affiliations below):
Schweikert, G (1, 2, 3, 4), Author
Zeller, G (4), Author
Behr, J (4), Author
Dieterich, C (3), Author
Ong, CS (1, 2, 4), Author
Philips, P (4), Author
De Bona, F (4), Author
Hartmann, L (4), Author
Bohlen, A (4), Author
Krüger, N (4), Author
Sonnenburg, S (4), Author
Rätsch, G (4), Author
Affiliations:
1. Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795
2. Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497794
3. Max Planck Institute for Developmental Biology, Max Planck Society, Max-Planck-Ring 5, 72076 Tübingen, DE, ou_2421691
4. Friedrich Miescher Laboratory, Max Planck Society, Max-Planck-Ring 9, 72076 Tübingen, DE, ou_2575692

Content

 Abstract: We present a highly accurate gene-prediction system for eukaryotic genomes, called mGene. It combines in an unprecedented manner the flexibility of generalized hidden Markov models (gHMMs) with the predictive power of modern machine learning methods, such as Support Vector Machines (SVMs). Its excellent performance was proved in an objective competition based on the genome of the nematode Caenorhabditis elegans. Considering the average of sensitivity and specificity, the developmental version of mGene exhibited the best prediction performance on nucleotide, exon, and transcript level for ab initio and multiple-genome gene-prediction tasks. The fully developed version shows superior performance in 10 out of 12 evaluation criteria compared with the other participating gene finders, including Fgenesh++ and Augustus. An in-depth analysis of mGene's genome-wide predictions revealed that ≈2200 predicted genes were not contained in the current genome annotation. Testing a subset of 57 of these genes by RT-PCR and sequencing, we confirmed expression for 24 (42%) of them. mGene missed 300 annotated genes, out of which 205 were unconfirmed. RT-PCR testing of 24 of these genes resulted in a success rate of merely 8%. These findings suggest that even the gene catalog of a well-studied organism such as C. elegans can be substantially improved by mGene's predictions. We also provide gene predictions for the four nematodes C. briggsae, C. brenneri, C. japonica, and C. remanei. Comparing the resulting proteomes among these organisms and to the known protein universe, we identified many species-specific gene inventions. In a quality assessment of several available annotations for these genomes, we find that mGene's predictions are most accurate.
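
The abstract ranks gene finders by the average of sensitivity and specificity. The following is a minimal sketch of that score, not the authors' evaluation code. It assumes the convention common in gene-finding evaluations, where sensitivity is TP / (TP + FN) and "specificity" is TP / (TP + FP); the record itself does not define either term. The function names and the example counts are hypothetical. The last line only re-derives the 42% RT-PCR confirmation rate from the 24 of 57 genes stated in the abstract.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of annotated features that were predicted: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tp: int, fp: int) -> float:
    """Fraction of predictions that are correct: TP / (TP + FP).

    Assumption: this is the gene-finding usage of "specificity",
    which other fields call precision.
    """
    return tp / (tp + fp)


def mean_sn_sp(tp: int, fp: int, fn: int) -> float:
    """Average of sensitivity and specificity for one evaluation level."""
    return (sensitivity(tp, fn) + specificity(tp, fp)) / 2


# Hypothetical counts for illustration only; they are not results from the paper.
example_score = mean_sn_sp(tp=80, fp=20, fn=20)

# Figures stated in the abstract: 24 of 57 tested novel genes were confirmed.
confirmation_rate_percent = round(100 * 24 / 57)  # 42
```

The same function would be applied separately at the nucleotide, exon, and transcript level, with the counts defined for the corresponding unit.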

Details

Dates: 2009-11
Publication Status: Issued
Identifiers: DOI: 10.1101/gr.090597.108

Source 1

Title: Genome Research
Source Genre: Journal
Publ. Info: Cold Spring Harbor, N.Y. : Cold Spring Harbor Laboratory Press
Volume / Issue: 19 (11)
Start / End Page: 2133 - 2143
Identifier: ISSN: 1088-9051
CoNE: https://pure.mpg.de/cone/journals/resource/954926997202