Released

Poster

Efficient analysis of allele frequency variation from whole-genome pool-sequencing data

MPS-Authors
Hildebrandt, J
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Fritschi, K
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Schwab, R
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;
Research Group Ecological Genetics, Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Rowan, B
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Weigel, D
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Exposito Alonso, M
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Citation

Czech, L., Peng, Y., Spence, J., Lang, P., Bellagio, T., Hildebrandt, J., et al. (2022). Efficient analysis of allele frequency variation from whole-genome pool-sequencing data. Poster presented at Population, Evolutionary, and Quantitative Genetics Conference (PEQG 2022), Pacific Grove, CA, USA.


Cite as: https://hdl.handle.net/21.11116/0000-000B-7099-A
Abstract
In recent decades, Evolve-and-Resequence (E&R) experiments have become a popular approach for surveying the rapid evolution of populations over multiple generations. These experiments allow us to measure shifts in the allele frequencies of a population in response to new or shifting environmental conditions, such as a changing climate. Pool-sequencing, in which several individuals are sequenced at once, is a cost-effective and efficient way to obtain reliable allele frequencies from a population of thousands to hundreds of thousands of individuals, and is often used in E&R experiments. However, specialized tools that analyze these data efficiently while accounting for the sampling biases introduced by the pool-sequencing approach were lacking. We developed two software tools to overcome the statistical and bioinformatic challenges arising in this context. First, we present grenepipe, a workflow that goes from raw sequencing data of individuals or pooled populations to genotypes (variant calling) and population allele frequencies. The pipeline automates trimming, mapping, variant calling, and quality control, offering a selection of popular software tools for each of these steps, and produces variant calls and frequency tables. While generally applicable to data from individual samples, it offers specialized steps for pool-sequencing. With a single command-line call, our software downloads all dependencies, runs all steps automatically, parallelizes processing for computer cluster environments, and recovers from any failing steps. Second, to enable inference of evolutionary signatures from frequency data, we created grenedalf, a C++ command-line tool for computing population genetic statistics. It computes unbiased estimates of Fst, Pi, and Tajima’s D from pool-sequencing data, far outperforming alternative tools. Furthermore, it offers novel data exploration tools, such as windowed allele frequency spectrum visualizations and PCA and MDS on the allele frequencies, as well as built-in data filters and manipulations. These tools are designed for scalability and ease of use with contemporary file formats, which we showcase using the GrENE-net.org project, a large-scale Evolve-and-Resequence experiment with Arabidopsis thaliana from across the world.
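To illustrate why pool-sequencing needs bias-corrected statistics, here is a minimal Python sketch of a per-site heterozygosity estimate. It is not grenedalf's implementation; the function names and the example counts are hypothetical. It assumes a simple two-stage sampling model: a pool of pool_size chromosomes is drawn from the population, and the reads are then drawn from that pool. Each stage shrinks the naive heterozygosity by a factor of (1 - 1/n), so the sketch multiplies by n / (n - 1) once for the read coverage and once for the pool size.

```python
from typing import List, Sequence


def allele_frequencies(counts: Sequence[int]) -> List[float]:
    """Turn per-allele read counts at one site into allele frequencies."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no reads at this site")
    return [c / total for c in counts]


def corrected_heterozygosity(counts: Sequence[int], pool_size: int) -> float:
    """Heterozygosity at one site, corrected for two sampling stages.

    Assumption (not taken from the abstract): reads are sampled
    binomially from a pool of `pool_size` chromosomes, which is itself
    a random sample from the population.
    """
    coverage = sum(counts)
    if coverage < 2 or pool_size < 2:
        raise ValueError("need at least 2 reads and a pool size of at least 2")
    freqs = allele_frequencies(counts)
    # Naive estimate: 1 minus the sum of squared allele frequencies.
    naive = 1.0 - sum(f * f for f in freqs)
    # Undo the downward bias from read sampling and from pool sampling.
    read_correction = coverage / (coverage - 1)
    pool_correction = pool_size / (pool_size - 1)
    return naive * read_correction * pool_correction


if __name__ == "__main__":
    # Hypothetical site: 30 reference reads, 10 alternative reads,
    # sequenced from a pool of 100 chromosomes.
    print(allele_frequencies([30, 10]))
    print(corrected_heterozygosity([30, 10], pool_size=100))
```

At low coverage or with small pools, the correction factors are noticeably larger than 1, which is the situation where a naive estimate would most understate diversity.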