Full-length transcript-based proteogenomics of rice improves its genome and 
proteome annotation

Chen, Mo-Xian; Zhu, Fu-Yuan; Gao, Bei; Ma, Kai-Long; Zhang, YJ; Fernie, A. R.; Chen, Xi; Dai, Lei; Ye, Neng-Hui; Zhang, Xue; Tian, Yuan; Zhang, Di; Xiao, Shi; Zhang, Jianhua; Liu, Ying-Gao

doi:10.1104/pp.19.00430

Journal Article

MPG Authors

Zhang, YJ
Central Metabolism, Department Willmitzer, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;

Fernie, A. R.
Central Metabolism, Department Willmitzer, Max Planck Institute of Molecular Plant Physiology, Max Planck Society;

Citation

Chen, M.-X., Zhu, F.-Y., Gao, B., Ma, K.-L., Zhang, Y., Fernie, A. R., et al. (2020). Full-length transcript-based proteogenomics of rice improves its genome and proteome annotation. Plant Physiology, 182(3), 1510-1526. doi:10.1104/pp.19.00430.

Citation link: https://hdl.handle.net/21.11116/0000-0005-6C41-9

Abstract

Rice (Oryza sativa) molecular breeding has gained considerable attention in recent years, but inaccurate genome annotation hampers its progress and functional studies of the rice genome. In this study, we applied single-molecule long-read RNA sequencing (lrRNA_seq)-based proteogenomics to reveal the complexity of the rice transcriptome and its coding abilities. Surprisingly, approximately 60% of loci identified by lrRNA_seq are associated with natural antisense transcripts (NATs). The high-density genomic arrangement of NAT genes suggests their potential roles in the multifaceted control of gene expression. In addition, a large number of fusion and intergenic transcripts were observed. Furthermore, a total of 906,456 transcript isoforms were identified, and 72.9% of the genes can generate splicing isoforms. Subsequently, 706,075 post-transcriptional events were categorized into ten subtypes, demonstrating the interdependence of post-transcriptional mechanisms that contribute to transcriptome diversity. Parallel short-read RNA sequencing indicated that lrRNA_seq has a superior capacity for the identification of longer transcripts. In addition, over 190,000 unique peptides belonging to 9,706 proteoforms/protein groups were identified, expanding the diversity of the rice proteome. Our findings indicate that the genome organization, transcriptome diversity, and coding potential of the rice transcriptome are far more complex than previously anticipated.