Mainstreaming Genomics: Protocols for Genome Re-Sequencing at the Population Scale


Murray, KD
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society


Warthmann, N., Murray, K., Morales Zambrana, A., Conde, M., Morales, L., Ali, A., et al. (2023). Mainstreaming Genomics: Protocols for Genome Re-Sequencing at the Population Scale. Poster presented at Plant & Animal Genome Conference (PAG 30), San Diego, CA, USA. doi:10.1101/2022.07.18.496977.

Cite as: https://hdl.handle.net/21.11116/0000-000C-4556-6
Mainstreaming genomics approaches requires lowering the barrier to entry. Current low prices for DNA sequencing enable comprehensive whole genome re-sequencing studies on many individuals and the identification of virtually all segregating genetic variation in entire populations. Such information is valuable for many applications, particularly in breeding and ecology/conservation. Despite the disruptive impact of such large-scale genomics approaches, their adoption has been slow and they are not yet mainstream. While DNA sequencing capacity is readily accessible worldwide, we have identified two obstacles: the ancillary costs of sequencing library production for large numbers of samples, and the difficulty of sequence data analysis. To accelerate adoption, we present
(1) a molecular biology protocol for parallel, cost-effective preparation of highly multiplexed Illumina sequencing libraries for less than $10 per sample, using only standard laboratory equipment and commercially available reagents, and
(2) a scalable software workflow (Snakemake/conda) that reproducibly turns raw sequencing reads (FASTQ) into informative summaries such as alignment-free genetic distance estimates (kWIP and Mash), read alignments to one or several reference genomes (SAM/BAM), variant calls (VCF/BCF) and their functional annotations (VCF/SnpEff). The workflow is open source and available on GitHub (https://github.com/pbgl) with detailed documentation. It can be run on a local machine as well as on a VM in the cloud.
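
To illustrate what an alignment-free genetic distance estimate means in practice, the following is a minimal Python sketch of a Mash-style distance between two sequences. It is not taken from the workflow or from the Mash or kWIP source code: the function names, the small default k, and the clamping of the result to the range 0 to 1 are assumptions made for illustration. It also compares exact k-mer sets, whereas Mash estimates the Jaccard index from MinHash sketches of hashed k-mers, and it ignores reverse complements.

```python
import math


def kmers(seq, k):
    """Return the set of all k-length substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a, b):
    """Jaccard index of two k-mer sets (0.0 if both sets are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mash_style_distance(seq_a, seq_b, k=4):
    """Mash-style distance D = -(1/k) * ln(2j / (1 + j)).

    j is the exact k-mer Jaccard index here (an illustrative
    simplification); the result is clamped to [0, 1].
    """
    j = jaccard(kmers(seq_a, k), kmers(seq_b, k))
    if j == 0.0:
        return 1.0  # no shared k-mers: report the maximum distance
    d = -math.log(2 * j / (1 + j)) / k
    return min(1.0, max(0.0, d))


if __name__ == "__main__":
    # Hypothetical toy sequences; real inputs would be reads or assemblies.
    print(mash_style_distance("ACGTACGTTT", "ACGTACGTAA"))
```

Because the comparison needs only k-mer content, distances of this kind can be computed directly from raw reads, before any alignment to a reference genome.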