Released

Preprint

TIPP_plastid: A User-Friendly Tool for De Novo Assembly of Organellar Genomes with HiFi Data

MPS-Authors

Xian, W
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Bezrukov, I
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Bao, Z
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Vorbrugg, S
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Gautam, A
IMPRS From Molecules to Organisms, Max Planck Institute for Biology Tübingen, Max Planck Society

Weigel, D
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Citation

Xian, W., Bezrukov, I., Bao, Z., Vorbrugg, S., Gautam, A., & Weigel, D. (submitted). TIPP_plastid: A User-Friendly Tool for De Novo Assembly of Organellar Genomes with HiFi Data.


Cite as: https://hdl.handle.net/21.11116/0000-000F-ACBE-A
Abstract
Plant cells have two major organelles with their own genomes: chloroplasts and mitochondria. While chloroplast genomes tend to be structurally conserved, the mitochondrial genomes of plants, which are much larger than those of animals, are characterized by complex structural variation. We introduce TIPP_plastid, a user-friendly, reference-free assembly tool that uses PacBio high-fidelity (HiFi) long-read data and that does not rely on genomes from related species or nuclear genome information for the assembly of organellar genomes. TIPP_plastid employs a deep learning model for initial read classification and leverages k-mer counting for further refinement, significantly reducing the impact of nuclear insertions of organellar DNA on the assembly process. We used TIPP_plastid to assemble a set of 54 complete chloroplast genomes. No other tool was able to completely assemble this set. TIPP_plastid outperforms PMAT in mitochondrial genome assembly, especially with respect to the completeness of protein-coding genes. We also used the assembled organelle genomes to identify instances of nuclear plastid DNA (NUPT) and nuclear mitochondrial DNA (NUMT) insertions. The cumulative length of NUPTs/NUMTs positively correlates with the size of the nuclear genome, suggesting that insertions occur stochastically. NUPTs/NUMTs show predominantly C:G to T:A changes, with the mutated cytosines typically found in CG and CHG contexts, suggesting that degradation of NUPT and NUMT sequences is driven by the known elevated mutation rate of methylated cytosines. siRNA loci are enriched in NUPTs and NUMTs, consistent with the RdDM pathway mediating DNA methylation in these sequences.
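
The CG and CHG contexts mentioned in the abstract follow the standard plant methylation convention, where H stands for A, C, or T. As a minimal illustrative sketch (not part of TIPP_plastid; the function names cytosine_context and count_contexts are hypothetical, and only the forward strand is considered), the context of each cytosine in a sequence could be classified like this:

```python
from collections import Counter


def cytosine_context(seq, i):
    """Classify the cytosine at index i as 'CG', 'CHG', or 'CHH'.

    Hypothetical helper for illustration only; forward strand only.
    Returns None if seq[i] is not a cytosine, or if the downstream
    bases are missing or ambiguous (for example 'N').
    """
    seq = seq.upper()
    if seq[i] != "C":
        return None
    downstream = seq[i + 1:i + 3]
    # CG context: cytosine immediately followed by guanine.
    if len(downstream) >= 1 and downstream[0] == "G":
        return "CG"
    # CHG and CHH both need two unambiguous downstream bases.
    if len(downstream) < 2 or any(b not in "ACGT" for b in downstream):
        return None
    # The first downstream base is now known to be H (A, C, or T).
    return "CHG" if downstream[1] == "G" else "CHH"


def count_contexts(seq):
    """Count cytosine contexts over a whole sequence (forward strand)."""
    contexts = (cytosine_context(seq, i) for i in range(len(seq)))
    return Counter(c for c in contexts if c is not None)
```

For example, count_contexts("CGCAGCTT") tallies one cytosine in each of the CG, CHG, and CHH contexts. A full analysis would also need to handle the reverse strand, where a G:C to A:T change on the forward strand corresponds to a mutated cytosine on the complementary strand.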