


Released

Poster

Struo: a pipeline for building custom databases for common metagenome profilers

MPS-Authors
de la Cuesta-Zuluaga, J
Department Microbiome Science, Max Planck Institute for Developmental Biology, Max Planck Society;

Ley, RE
Department Microbiome Science, Max Planck Institute for Developmental Biology, Max Planck Society;

Youngblut, ND
Department Microbiome Science, Max Planck Institute for Developmental Biology, Max Planck Society;

Citation

de la Cuesta-Zuluaga, J., Ley, R., & Youngblut, N. (2019). Struo: a pipeline for building custom databases for common metagenome profilers. Poster presented at German Conference on Bioinformatics (GCB 2019), Heidelberg, Germany.


Cite as: https://hdl.handle.net/21.11116/0000-000A-6F2A-C
Abstract
Background: Metagenome profiling is the most efficient method of obtaining comprehensive taxonomic and functional data from metagenomes, yet the default databases that accompany metagenome profilers are not updated at a pace that reflects the rapid increase in microbial genomics data. Creating updated, comprehensive custom databases is cumbersome because retrieving the genomes and configuring and executing the software is complex and computationally demanding. As a result, many metagenomic analyses fail to include the most up-to-date microbial data and miss critical insights. We address this with Struo, an automated and modular pipeline that assists in the retrieval of genomes and the construction of databases for metagenome profilers.
Methods and results: Struo uses Snakemake and Conda to unify the workflow and build databases in a straightforward, reproducible manner on Unix-based high-performance compute clusters. Currently, Struo supports Kraken2, Bracken and HUMAnN2, and can be extended to include other tools. Publicly available or novel genomes can be used; here, we used Struo with 21,276 representative genomes of the GTDB to generate databases that broadly encompass known microbial diversity. This resulted in 25% more reads being mapped from simulated and real metagenomes than with the default profiler databases.
Discussion: A carefully curated and tailored selection of genomes for the reference databases used in metagenome profiling facilitates the exploration of microbiomes by increasing the fraction of reads mapped to a known reference. Struo empowers researchers to incorporate previously unexplored taxa in the study of hidden microbial diversity. Struo and the custom databases will be made public as open-source resources.
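
Illustrative sketch (not part of the original record): the abstract does not describe Struo's internals, so the following is only a rough idea of the kind of database construction that such a pipeline automates for Kraken2 and Bracken. It is not Struo's code or interface. The function name `kraken2_build_plan`, the file paths, the thread count and the Bracken read length of 100 are all hypothetical; only the `kraken2-build` and `bracken-build` options shown are taken from those tools' documented command lines. The sketch assumes that taxonomy files are already present in the database directory and that each FASTA header carries a taxonomic ID Kraken2 can resolve. It only assembles the commands and does not run them.

```python
from pathlib import Path
from typing import List


def kraken2_build_plan(genomes: List[Path], db_dir: Path, threads: int = 8) -> List[List[str]]:
    """Return the commands for a custom Kraken2/Bracken database without running them.

    Hypothetical helper for illustration only; this is not Struo's implementation.
    """
    db = str(db_dir)
    # One --add-to-library call per genome FASTA file.
    plan = [["kraken2-build", "--add-to-library", str(g), "--db", db] for g in genomes]
    # Build the Kraken2 index from everything added to the library.
    plan.append(["kraken2-build", "--build", "--db", db, "--threads", str(threads)])
    # Generate the Bracken k-mer distribution files (k=35 is the Kraken2 default;
    # the read length of 100 is an assumed value).
    plan.append(["bracken-build", "-d", db, "-t", str(threads), "-k", "35", "-l", "100"])
    return plan


if __name__ == "__main__":
    import shlex

    # Hypothetical input files and output directory.
    demo = kraken2_build_plan(
        [Path("genomes/genome_A.fna"), Path("genomes/genome_B.fna")],
        Path("custom_db"),
    )
    for cmd in demo:
        print(shlex.join(cmd))
```

Returning the commands as a list, instead of executing them, keeps each step inspectable; a workflow manager such as Snakemake would typically wrap each step in its own rule with a dedicated Conda environment.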