  Accurate RNA-seq based de novo annotation using mGene.ngs

Behr, J., Bohnert, R., Kahles, A., Schweikert, G., Zeller, G., Hartmann, L., et al. (2011). Accurate RNA-seq based de novo annotation using mGene.ngs. Poster presented at 19th Annual International Conference on Intelligent Systems for Molecular Biology and 10th European Conference on Computational Biology (ISMB/ECCB 2011), Vienna, Austria.

External references

External reference:
https://f1000research.com/posters/1908 (Abstract)
Description:
-
OA status:
Not specified

Creators

Creators:
Behr, J (1), Author
Bohnert, R (1), Author
Kahles, A (1), Author
Schweikert, G (1), Author
Zeller, G (1), Author
Hartmann, L (1), Author
Rätsch, G (1), Author
Affiliations:
(1) Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society, ou_3378052

Content

Keywords: -
Abstract: The model organism Caenorhabditis elegans is one of the most important systems for studying cell fate and the regulation of apoptosis. To gain a deeper understanding of the regulatory mechanisms in C. elegans, its close evolutionary context was explored and the genomes of five closely related nematodes were sequenced. Currently, the major limitation in analyzing these genomes is the lack of accuracy in their transcriptome annotation. In this project we sequenced the transcriptomes (RNA-Seq) of all five nematodes and C. elegans using the Illumina sequencing platform (~300M reads, strand-specific, paired-end, 76 bp). Based on the RNA-Seq data we annotated all six nematodes using the newly developed de novo gene finding system mGene.ngs (Schweikert et al., 2009). mGene.ngs combines features from the RNA-Seq data and the genomic DNA sequence already at the learning stage. The system can be trained on a set of highly expressed protein-coding and non-coding genes whose structure can be inferred directly from the RNA-Seq data. Training was done independently for each of the six organisms. This is a conceptual difference from standard annotation strategies, which rely on sequence alignments, on classifiers trained on a single representative organism, or on both. Such approaches therefore tend to underestimate proteome variability and are biased towards a single organism and/or the set of known proteins.

While our approach tends to overestimate the variability, it allows us to compare the transcriptomes and proteomes of a set of organisms on an equal footing and can sensitively detect minor changes in gene structure between organisms. Predictions include alternative isoforms supported by spliced reads as well as non-coding genes and transcripts. To evaluate the approach we take advantage of the highly accurate C. elegans genome annotation. We observe that the prediction accuracy in terms of coding transcript level sensitivity (56.1%) and specificity (62.7%) compares very favorably to that of the well-known de novo transcriptome reconstruction system Cufflinks (Trapnell et al., 2010), which reaches a sensitivity of 49.9% and a specificity of 49.5%.
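The transcript-level figures quoted above can be illustrated with a minimal sketch. It assumes the convention common in gene-finding evaluations, where a predicted coding transcript counts as correct only if all of its coding exon boundaries match an annotated transcript exactly, sensitivity is the fraction of annotated transcripts recovered, and specificity is the fraction of predicted transcripts that are correct. The abstract does not spell out these definitions, and the function name, data layout, and toy coordinates below are hypothetical, not taken from mGene.ngs or Cufflinks.

```python
from typing import Iterable, Tuple

# Hypothetical representation: (chromosome, strand, ordered coding exons as (start, end)).
Transcript = Tuple[str, str, Tuple[Tuple[int, int], ...]]


def transcript_level_accuracy(
    predicted: Iterable[Transcript], annotated: Iterable[Transcript]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) under exact-match transcript comparison.

    Assumed definitions (not confirmed by the abstract):
      sensitivity = correctly predicted transcripts / annotated transcripts
      specificity = correctly predicted transcripts / predicted transcripts
    """
    pred = set(predicted)
    ref = set(annotated)
    matched = pred & ref  # transcripts whose exon structure matches exactly
    sensitivity = len(matched) / len(ref) if ref else 0.0
    specificity = len(matched) / len(pred) if pred else 0.0
    return sensitivity, specificity


# Toy data, invented for illustration only.
annotated = [
    ("I", "+", ((100, 200), (300, 400))),
    ("I", "+", ((100, 200), (350, 400))),   # alternative isoform
    ("II", "-", ((50, 120),)),
    ("III", "+", ((10, 90), (150, 210))),
]
predicted = [
    ("I", "+", ((100, 200), (300, 400))),   # exact match
    ("II", "-", ((50, 120),)),              # exact match
    ("III", "+", ((10, 95), (150, 210))),   # one boundary off by 5 bp: no match
]

sens, spec = transcript_level_accuracy(predicted, annotated)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
# prints: sensitivity=0.500 specificity=0.667
```

Under this strict criterion a single shifted exon boundary makes a prediction count as wrong, which is one reason transcript-level values are usually much lower than exon-level or nucleotide-level ones.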

Details

Language(s):
Date: 2011-07
Publication status: Published online
Pages: -
Place, publisher, edition: -
Table of contents: -
Review type: -
Identifiers: -
Degree type: -

Event

Title: 19th Annual International Conference on Intelligent Systems for Molecular Biology and 10th European Conference on Computational Biology (ISMB/ECCB 2011)
Venue: Vienna, Austria
Start/end date: 2011-07-17 - 2011-07-19

Source 1

Title: F1000Posters
Source genre: Proceedings
Creators:
Affiliations:
Place, publisher, edition: -
Pages: - Volume / Issue: 2011 (2) Article number: 1163 Start / end page: - Identifier: -