
Item Details


Released

Journal Article

Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes

MPS-Authors
Rabanal, FA
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Gräff, M
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Lanz, C
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Fritschi, K
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Carbonell-Bejerano, P
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Weigel, D
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society

Citation

Rabanal, F., Gräff, M., Lanz, C., Fritschi, K., Llaca, V., Lang, M., Carbonell-Bejerano, P., Henderson, I., & Weigel, D. (2022). Pushing the limits of HiFi assemblies reveals centromere diversity between two Arabidopsis thaliana genomes. Nucleic Acids Research (London), 50(21), pp. 12309-12327. doi:10.1093/nar/gkac1115.


Cite as: https://hdl.handle.net/21.11116/0000-000A-61E6-5
Abstract
Although long-read sequencing can often enable chromosome-level reconstruction of genomes, it is still unclear how one can routinely obtain gapless assemblies. In the model plant Arabidopsis thaliana, other than the reference accession Col-0, all other accessions de novo assembled with long-reads until now have used PacBio continuous long reads (CLR). Although these assemblies sometimes achieved chromosome-arm level contigs, they inevitably broke near the centromeres, excluding megabases of DNA from analysis in pan-genome projects. Since PacBio high-fidelity (HiFi) reads circumvent the high error rate of CLR technologies, albeit at the expense of read length, we compared a CLR assembly of accession Eyach15-2 to HiFi assemblies of the same sample. The use of five different assemblers starting from subsampled data allowed us to evaluate the impact of coverage and read length. We found that centromeres and rDNA clusters are responsible for 71% of contig breaks in the CLR scaffolds, while relatively short stretches of GA/TC repeats are at the core of >85% of the unfilled gaps in our best HiFi assemblies. Since the HiFi technology consistently enabled us to reconstruct gapless centromeres and 5S rDNA clusters, we demonstrate the value of the approach by comparing these previously inaccessible regions of the genome between the Eyach15-2 accession and the reference accession Col-0.