Next-generation sequencing algorithms : from read mapping to variant detection

Emde, Anne-Katrin

Thesis

MPS-Authors

Emde, Anne-Katrin
Gene Structure and Array Design (Stefan Haas), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;
Freie Universität Berlin, Fachbereich Mathematik und Informatik;

Fulltext (public)

thesis_Emde.pdf
(Any fulltext), 3MB

Citation

Emde, A.-K. (2013). Next-generation sequencing algorithms: from read mapping to variant detection. PhD Thesis, Berlin, Freie Universität.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0018-D093-A

Abstract

Next-Generation-Sequencing (NGS) has brought on a revolution in sequence analysis with its broad spectrum of applications ranging from genome resequencing to transcriptomics or metagenomics, and from fundamental research to diagnostics. The tremendous amounts of data necessitate highly efficient computational analysis tools for the wide variety of NGS applications. This thesis addresses a broad range of key computational aspects of resequencing applications, where a reference genome sequence is known and heavily used for interpretation of the newly sequenced sample. It presents tools for read mapping and benchmarking, for partial read mapping of small RNA reads and for structural variant/indel detection, and finally tools for detecting and genotyping SNVs and short indels. Our tools efficiently scale to large NGS data sets and are well-suited for advances in sequencing technology, since their generic algorithm design allows handling of arbitrary read lengths and variable error rates. Furthermore, they are implemented within the robust C++ library SeqAn, making them open-source, easily available, and potentially adaptable for the bioinformatics community. Among other applications, our tools have been integrated into a large-scale analysis pipeline and have been applied to large datasets, leading to interesting discoveries of human retrocopy variants and insights into the genetic causes of X-linked intellectual disabilities.