
Released

Preprint

An Algorithm to Build a Multi-genome Reference

MPS-Authors
Rabbani,  L
Department Molecular Biology, Max Planck Institute for Developmental Biology, Max Planck Society;

Müller,  J
Department Molecular Biology, Max Planck Institute for Developmental Biology, Max Planck Society;

Weigel,  D
Department Molecular Biology, Max Planck Institute for Developmental Biology, Max Planck Society;

Citation

Rabbani, L., Müller, J., & Weigel, D. (submitted). An Algorithm to Build a Multi-genome Reference.


Cite as: https://hdl.handle.net/21.11116/0000-000A-8F4D-0
Abstract
Motivation: New DNA sequencing technologies have enabled the rapid analysis of many thousands of genomes from a single species. At the same time, the conventional approach of mapping sequencing reads against a single reference genome sequence is no longer adequate. However, even where multiple high-quality reference genomes are available, the problem remains of how to integrate the results of pairwise analyses.
Result: To overcome the limits imposed by mapping sequence reads against a single reference genome, or serially mapping them against multiple reference genomes, we have developed the MGR method, which allows simultaneous comparison against multiple high-quality reference genomes. The aim is to remove the bias that comes from using only a single-genome reference and to simplify downstream analyses. To this end, we present the MGR algorithm, which creates a graph (the MGR graph) that serves as a multi-genome reference. To reduce the size and complexity of the multi-genome reference, highly similar orthologous and paralogous regions are collapsed, while more substantial differences are retained. To evaluate the performance of our model, we have developed a genome compression tool, which can be used to estimate the amount of shared information between genomes.
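
The idea of using compression to estimate shared information between genomes can be illustrated with a minimal Python sketch. This is not the authors' compression tool, whose design is not described in this record. It substitutes a generic compressor (`zlib`) and assumes that the bytes saved by compressing two sequences together, rather than separately, give a rough proxy for what they share. The function names `compressed_size` and `shared_information` are hypothetical.

```python
import zlib


def compressed_size(seq: str) -> int:
    """Size in bytes of the zlib-compressed sequence (a generic stand-in compressor)."""
    return len(zlib.compress(seq.encode("ascii"), 9))


def shared_information(a: str, b: str) -> int:
    """Rough proxy for shared content between two sequences.

    Assumption: shared content is approximated by the bytes saved when the
    sequences are compressed as one string instead of separately.
    """
    return compressed_size(a) + compressed_size(b) - compressed_size(a + b)
```

Under this proxy, two nearly identical sequences should score much higher than two unrelated ones. Note that zlib only finds repeats within a 32 KiB window, so the sketch is meaningful for short toy sequences only, not for whole genomes.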