




Next-generation sequencing algorithms: from read mapping to variant detection


Emde, Anne-Katrin
Gene Structure and Array Design (Stefan Haas), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;
Freie Universität Berlin, Fachbereich Mathematik und Informatik;


Emde, A.-K. (2013). Next-generation sequencing algorithms: from read mapping to variant detection. PhD Thesis, Berlin, Freie Universität.

Cite as: http://hdl.handle.net/11858/00-001M-0000-0018-D093-A
Next-Generation-Sequencing (NGS) has brought on a revolution in sequence analysis with its broad spectrum of applications ranging from genome resequencing to transcriptomics or metagenomics, and from fundamental research to diagnostics. The tremendous amounts of data necessitate highly efficient computational analysis tools for the wide variety of NGS applications. This thesis addresses a broad range of key computational aspects of resequencing applications, where a reference genome sequence is known and heavily used for interpretation of the newly sequenced sample. It presents tools for read mapping and benchmarking, for partial read mapping of small RNA reads and for structural variant/indel detection, and finally tools for detecting and genotyping SNVs and short indels. Our tools efficiently scale to large NGS data sets and are well-suited for advances in sequencing technology, since their generic algorithm design allows handling of arbitrary read lengths and variable error rates. Furthermore, they are implemented within the robust C++ library SeqAn, making them open-source, easily available, and potentially adaptable for the bioinformatics community. Among other applications, our tools have been integrated into a large-scale analysis pipeline and have been applied to large datasets, leading to interesting discoveries of human retrocopy variants and insights into the genetic causes of X-linked intellectual disabilities.