Accurate RNA-seq based de novo annotation using mGene.ngs

Behr, J; Bohnert, R; Kahles, A; Schweikert, G; Zeller, G; Hartmann, L; Rätsch, G

Behr, J., Bohnert, R., Kahles, A., Schweikert, G., Zeller, G., Hartmann, L., et al. (2011). Accurate RNA-seq based de novo annotation using mGene.ngs. Poster presented at 19th Annual International Conference on Intelligent Systems for Molecular Biology and 10th European Conference on Computational Biology (ISMB/ECCB 2011), Vienna, Austria.

Item is released

Basic data

Item permalink: https://hdl.handle.net/21.11116/0000-0010-52A8-4
Version permalink: https://hdl.handle.net/21.11116/0000-0010-52A9-3

Genre: Poster

External references

External reference:
https://f1000research.com/posters/1908 (abstract), open access status unknown

Description:
-

OA status:
Not specified

Creators

Creators:
Behr, J¹, Author
Bohnert, R¹, Author
Kahles, A¹, Author
Schweikert, G¹, Author
Zeller, G¹, Author
Hartmann, L¹, Author
Rätsch, G¹, Author

Affiliations:
¹ Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society, ou_3378052

Content

Keywords: -

Abstract: The model organism Caenorhabditis elegans is one of the most important systems for studying cell fate and the regulation of apoptosis. To gain a deeper understanding of the regulatory mechanisms in C. elegans, its close evolutionary context was explored and the genomes of five closely related nematodes were sequenced. Currently, the major limitation in analyzing these genomes is the lack of accuracy in their transcriptome annotation. In this project we sequenced the transcriptomes (RNA-Seq) of all five nematodes and of C. elegans using the Illumina sequencing platform (~300M reads, strand-specific, paired-end, 76 bp). Based on the RNA-Seq data, we annotated all six nematodes using the newly developed de novo gene finding system mGene.ngs (Schweikert et al., 2009). mGene.ngs combines features from the RNA-Seq data and the genomic DNA sequence already at the learning stage. The system can be trained on a set of highly expressed protein-coding and non-coding genes whose structure can be inferred directly from the RNA-Seq data. Training was done independently for each of the six organisms. This differs conceptually from standard annotation strategies, which rely on sequence alignments, on classifiers trained on a single representative organism, or on both. Such strategies therefore tend to underestimate proteome variability and are biased towards a single organism and/or the set of known proteins.

While our approach tends to overestimate the variability, it allows us to compare the transcriptomes and proteomes of a set of organisms on an equal footing and can sensitively detect minor changes in gene structure between organisms. Predictions include alternative isoforms supported by spliced reads as well as non-coding genes and transcripts. To evaluate the approach, we take advantage of the highly accurate C. elegans genome annotation. We observe that the prediction accuracy in terms of coding-transcript-level sensitivity (56.1%) and specificity (62.7%) compares very favorably to the well-known de novo transcriptome recognition system Cufflinks (Trapnell et al., 2010; sensitivity 49.9%, specificity 49.5%).
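The transcript-level figures quoted in the abstract can be illustrated with a minimal sketch. The abstract does not define the metrics, so the sketch rests on two assumptions: a predicted transcript counts as correct only if its exon coordinates match an annotated transcript exactly, and "specificity" follows the common gene-finding convention of the fraction of predicted transcripts that are correct. The function names and the toy coordinates are hypothetical and are not taken from mGene.ngs or its evaluation.

```python
from typing import Iterable, Set, Tuple

# A transcript is keyed by chromosome, strand and its exon coordinates.
Exon = Tuple[int, int]
Transcript = Tuple[str, str, Tuple[Exon, ...]]


def normalize(chrom: str, strand: str, exons: Iterable[Exon]) -> Transcript:
    """Return a hashable transcript key with exons sorted by start position."""
    return (chrom, strand, tuple(sorted(exons)))


def transcript_level_accuracy(
    annotated: Set[Transcript], predicted: Set[Transcript]
) -> Tuple[float, float]:
    """Compute (sensitivity, specificity) under exact-match scoring.

    Assumption: sensitivity = correct / annotated,
                specificity = correct / predicted.
    """
    correct = annotated & predicted  # transcripts reproduced exactly
    sensitivity = len(correct) / len(annotated) if annotated else 0.0
    specificity = len(correct) / len(predicted) if predicted else 0.0
    return sensitivity, specificity


# Toy data (hypothetical coordinates): 4 annotated, 3 predicted, 2 identical.
annotated = {
    normalize("I", "+", [(100, 200), (300, 400)]),
    normalize("I", "+", [(100, 200), (350, 400)]),
    normalize("II", "-", [(50, 90)]),
    normalize("III", "+", [(10, 20), (30, 40), (50, 60)]),
}
predicted = {
    normalize("I", "+", [(300, 400), (100, 200)]),  # same exons, other order
    normalize("II", "-", [(50, 90)]),
    normalize("III", "+", [(10, 20), (30, 45)]),    # wrong exon boundary
}

sens, spec = transcript_level_accuracy(annotated, predicted)
print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Exact matching is a strict criterion: a single misplaced exon boundary makes the whole transcript count as wrong, which is one reason transcript-level values are typically much lower than exon-level or nucleotide-level ones.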

Details

Language(s):

Date: Published online: 2011-07

Publication status: Published online

Pages: -

Place, publisher, edition: -

Table of contents: -

Review type: -

Identifiers: -

Degree type: -

Event

Title: 19th Annual International Conference on Intelligent Systems for Molecular Biology and 10th European Conference on Computational Biology (ISMB/ECCB 2011)

Venue: Vienna, Austria

Start/end date: 2011-07-17 - 2011-07-19

Source 1

Title: F1000Posters

Source genre: Proceedings

Creators:

Affiliations:

Place, publisher, edition: -

Pages: -
Volume / issue: 2011 (2)
Article number: 1163
Start / end page: -
Identifier: -