Journal Article

Phylonium: fast estimation of evolutionary distances from large samples of similar genomes

MPS-Authors
Klötzl, Fabian
Research Group Bioinformatics, Department Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Max Planck Society;

Haubold, Bernhard
Research Group Bioinformatics, Department Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Max Planck Society;

Citation

Klötzl, F., & Haubold, B. (2020). Phylonium: fast estimation of evolutionary distances from large samples of similar genomes. Bioinformatics, 36(7), 2040-2046. doi:10.1093/bioinformatics/btz903.


Cite as: https://hdl.handle.net/21.11116/0000-0005-67F8-0
Abstract
Tracking disease outbreaks by whole-genome sequencing leads to the collection of large samples of closely related sequences. Five years ago, we published a method to accurately compute all pairwise distances for such samples by indexing each sequence. Since indexing is slow, we now ask whether it is possible to achieve similar accuracy when indexing only a single sequence.

We have implemented this idea in the program phylonium and show that it is as accurate as its predecessor and roughly 100 times faster when applied to all 2678 Escherichia coli genomes contained in ENSEMBL. One of the best published programs for rapidly computing pairwise distances, mash, analyzes the same dataset four times faster but, with default settings, it is less accurate than phylonium.

Phylonium runs under the UNIX command line; its C++ sources and documentation are available from github.com/evolbioinf/phylonium.

Supplementary data are available at Bioinformatics online.
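
The abstract does not spell out how distances are derived once every sequence has been compared against a single indexed one. As a rough illustration only, the Python sketch below assumes each sample has already been aligned to one shared reference, with "-" marking positions that have no aligned homolog. It then counts mismatches between each pair over the positions both samples cover and applies the standard Jukes-Cantor correction. The function names, the input format, and the choice of Jukes-Cantor are assumptions made for this example; this is not phylonium's code or its interface.

```python
import math
from itertools import combinations


def jc_distance(p):
    """Jukes-Cantor corrected distance for a mismatch proportion p.

    Returns NaN when p >= 0.75, where the correction is undefined.
    """
    if p >= 0.75:
        return float("nan")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def pairwise_distances(aligned):
    """Estimate all pairwise distances from reference-anchored sequences.

    `aligned` (hypothetical input format) maps a sample name to a string
    with one character per reference position; "-" means "not aligned".
    """
    names = sorted(aligned)
    dist = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        # Only positions covered by both samples are comparable.
        shared = [
            (x, y)
            for x, y in zip(aligned[a], aligned[b])
            if x != "-" and y != "-"
        ]
        if not shared:
            d = float("nan")
        else:
            mismatches = sum(x != y for x, y in shared)
            d = jc_distance(mismatches / len(shared))
        dist[(a, b)] = dist[(b, a)] = d
    return dist


if __name__ == "__main__":
    toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "AC--ACGT"}
    for pair, d in sorted(pairwise_distances(toy).items()):
        print(pair, round(d, 4))
```

In this sketch only the reference would need an index, and each pairwise comparison reduces to a scan over shared columns. That is the kind of saving the abstract alludes to, although the real program's data structures and filtering steps may differ substantially.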