English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT
  DREAM-Yara: an exact read mapper for very large databases with short update time

Dadi, T. H., Siragusa, E., Piro, V. C., Andrusch, A., Seiler, E., Renard, B. Y., et al. (2018). DREAM-Yara: an exact read mapper for very large databases with short update time. Bioinformatics, 34(17), i766-1772. doi:10.1093/bioinformatics/bty567.

Item is

Files

show Files
hide Files
:
Bioinformatics_Dadi et al_2018.pdf (Publisher version), 445KB
Name:
Bioinformatics_Dadi et al_2018.pdf
Description:
-
OA-Status:
Not specified
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
© The Author(s) 2018
License:
-

Locators

show

Creators

show
hide
 Creators:
Dadi, Temesgen Hailemariam1, Author                 
Siragusa, Enrico , Author
Piro, Vitor C. , Author
Andrusch, Andreas , Author
Seiler, Enrico1, Author                 
Renard, Bernhard Y. , Author
Reinert, Knut2, Author                 
Affiliations:
1IMPRS for Biology and Computation (Anne-Dominique Gindrat), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_1479666              
2Efficient Algorithms for Omics Data (Knut Reinert), Max Planck Fellow Group, Max Planck Institute for Molecular Genetics, Max Planck Society, ou_2385698              

Content

show
hide
Free keywords: -
 Abstract: Motivation: Mapping-based approaches have become limited in their application to very large sets of references since computing an FM-index for very large databases (e.g. >10 GB) has become a bottleneck. This affects many analyses that need such index as an essential step for approximate matching of the NGS reads to reference databases. For instance, in typical metagenomics analysis, the size of the reference sequences has become prohibitive to compute a single full-text index on standard machines. Even on large memory machines, computing such index takes about 1 day of computing time. As a result, updates of indices are rarely performed. Hence, it is desirable to create an alternative way of indexing while preserving fast search times.

Results: To solve the index construction and update problem we propose the DREAM (Dynamic seaRchablE pArallel coMpressed index) framework and provide an implementation. The main contributions are the introduction of an approximate search distributor via a novel use of Bloom filters. We combine several Bloom filters to form an interleaved Bloom filter and use this new data structure to quickly exclude reads for parts of the databases where they cannot match. This allows us to keep the databases in several indices which can be easily rebuilt if parts are updated while maintaining a fast search time. The second main contribution is an implementation of DREAM-Yara a distributed version of a fully sensitive read mapper under the DREAM framework.

Details

show
hide
Language(s): eng - English
 Dates: 2018-09-08
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1093/bioinformatics/bty567
PMID: 30423080
 Degree: -

Event

show

Legal Case

show

Project information

show

Source 1

show
hide
Title: Bioinformatics
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: Oxford : Oxford University Press
Pages: - Volume / Issue: 34 (17) Sequence Number: - Start / End Page: i766 - 1772 Identifier: ISSN: 1367-4803
CoNE: https://pure.mpg.de/cone/journals/resource/954926969991