Journal Article

mGene: Accurate SVM-based gene finding with an application to nematode genomes

MPG Authors
/persons/resource/persons84204

Schweikert,  G
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons84331

Zien,  A
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons229087

Zeller,  G
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons85272

Behr,  J
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons84118

Ong,  CS
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons273040

Philips,  P
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons273042

De Bona,  F
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons273044

Hartmann,  L
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons273046

Bohlen,  A
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons273048

Krüger,  N
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons84960

Sonnenburg,  S
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

/persons/resource/persons84153

Rätsch,  G
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Citation

Schweikert, G., Zien, A., Zeller, G., Behr, J., Dieterich, C., Ong, C., et al. (2009). mGene: Accurate SVM-based gene finding with an application to nematode genomes. Genome Research, 19(11), 2133-2143. doi:10.1101/gr.090597.108.


Citation link: https://hdl.handle.net/21.11116/0000-000A-5EDA-8
Abstract
We present a highly accurate gene-prediction system for eukaryotic genomes, called mGene. It combines in an unprecedented manner the flexibility of generalized hidden Markov models (gHMMs) with the predictive power of modern machine learning methods, such as Support Vector Machines (SVMs). Its excellent performance was proved in an objective competition based on the genome of the nematode Caenorhabditis elegans. Considering the average of sensitivity and specificity, the developmental version of mGene exhibited the best prediction performance on nucleotide, exon, and transcript level for ab initio and multiple-genome gene-prediction tasks. The fully developed version shows superior performance in 10 out of 12 evaluation criteria compared with the other participating gene finders, including Fgenesh++ and Augustus. An in-depth analysis of mGene's genome-wide predictions revealed that ≈2200 predicted genes were not contained in the current genome annotation. Testing a subset of 57 of these genes by RT-PCR and sequencing, we confirmed expression for 24 (42%) of them. mGene missed 300 annotated genes, out of which 205 were unconfirmed. RT-PCR testing of 24 of these genes resulted in a success rate of merely 8%. These findings suggest that even the gene catalog of a well-studied organism such as C. elegans can be substantially improved by mGene's predictions. We also provide gene predictions for the four nematodes C. briggsae, C. brenneri, C. japonica, and C. remanei. Comparing the resulting proteomes among these organisms and to the known protein universe, we identified many species-specific gene inventions. In a quality assessment of several available annotations for these genomes, we find that mGene's predictions are most accurate.
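The figures quoted in the abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and is not part of mGene. It assumes the convention common in gene-finding evaluations, where sensitivity is TP/(TP+FN) and specificity is TP/(TP+FP). The function names and the counts passed to `mean_sn_sp` are hypothetical. Only the RT-PCR numbers (24 of 57 confirmed, 8% of 24) come from the abstract.

```python
def confirmation_rate(confirmed: int, tested: int) -> float:
    """Fraction of tested gene predictions confirmed by RT-PCR."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return confirmed / tested


def mean_sn_sp(tp: int, fp: int, fn: int) -> float:
    """Average of sensitivity and specificity.

    Assumption: specificity is taken as TP / (TP + FP), as is usual
    in gene-finder evaluations (not the TN-based definition).
    """
    sensitivity = tp / (tp + fn)
    specificity = tp / (tp + fp)
    return (sensitivity + specificity) / 2


# From the abstract: 24 of 57 novel predictions were confirmed.
novel_rate = confirmation_rate(24, 57)
print(f"novel predictions confirmed: {novel_rate:.0%}")

# From the abstract: 8% of 24 tested missed genes were confirmed,
# which corresponds to roughly 2 genes (an inference, not a quoted count).
missed_confirmed = round(0.08 * 24)
print(f"missed annotated genes confirmed: about {missed_confirmed}")

# Hypothetical counts, only to show how the ranking criterion is formed.
print(f"mean of Sn and Sp: {mean_sn_sp(tp=80, fp=20, fn=20):.2f}")
```

Read this way, the contrast the abstract draws is between a 42% confirmation rate for genes mGene predicted but the annotation lacked, and an 8% rate for annotated genes that mGene missed.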