  Accurate indel prediction using paired-end short reads

Grimm, D., Hagmann, J., Koenig, D., Weigel, D., & Borgwardt, K. (2013). Accurate indel prediction using paired-end short reads. BMC Genomics, 14: 132. doi:10.1186/1471-2164-14-132.

Creators

 Creators:
Grimm, D. [1], Author
Hagmann, J. [1], Author
Koenig, D. [1], Author
Weigel, D. [1], Author
Borgwardt, K. M. [1], Author
Affiliations:
[1] Department Molecular Biology, Max Planck Institute for Developmental Biology, Max Planck Society

Content

 Abstract:

Background: One of the major open challenges in next-generation sequencing (NGS) is the accurate identification of structural variants such as insertions and deletions (indels). Current methods for indel calling assign scores to different types of evidence or counter-evidence for the presence of an indel, such as the number of split-read alignments spanning the boundaries of a deletion candidate or the number of reads that map within a putative deletion. Candidates with a score above a manually defined threshold are then predicted to be true indels. As a consequence, structural variants detected in this manner contain many false positives.

Results: Here, we present a machine-learning-based method that discovers indel candidates and distinguishes true from false ones in order to reduce the false positive rate. Our method identifies indel candidates using a discriminative classifier based on features of split-read alignment profiles and trained on true and false indel candidates that were validated by Sanger sequencing. We demonstrate the usefulness of our method with paired-end Illumina reads from 80 genomes of the first phase of the 1001 Genomes Project (http://www.1001genomes.org) in Arabidopsis thaliana.

Conclusion: In this work we show that indel classification is a necessary step to reduce the number of false positive candidates. We demonstrate that omitting classification may lead to spurious biological interpretations. The software is available at: http://agkb.is.tuebingen.mpg.de/Forschung/SV-M/.

Details

Language(s): eng - English
 Dates: 2013-02
 Publication Status: Published online
 Identifiers: DOI: 10.1186/1471-2164-14-132
PMID: 2344237

Source 1

Title: BMC Genomics
Source Genre: Journal
Publ. Info: BioMed Central
Pages: 10
Volume / Issue: 14
Sequence Number: 132
Identifier: ISSN: 1471-2164
CoNE: https://pure.mpg.de/cone/journals/resource/111000136905010