




Statistical analysis of high-throughput sequencing count data


Love, Michael I.
IMPRS for Computational Biology and Scientific Computing - IMPRS-CBSC (Kirsten Kelleher), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;
Freie Universität Berlin, Fachbereich Mathematik und Informatik;


Love, M. I. (2014). Statistical analysis of high-throughput sequencing count data. PhD Thesis.

All of the work presented in this thesis grew out of collaborations with other researchers. For each chapter, I briefly summarize my contribution and acknowledge the contributions of others.

Chapter 2 presents a conceptual framework for modeling read counts using various distributions. These ideas grew out of conversations with Ho-Ryun Chung at the Max Planck Institute for Molecular Genetics (MPIMG) in Berlin and Simon Anders at the European Molecular Biology Laboratories (EMBL) in Heidelberg.

Chapter 3 was published in Statistical Applications in Genetics and Molecular Biology [1]. Stefan Haas proposed the idea of detecting copy number variants in exome-enriched sequencing data, and various methods were tested and evaluated together with Alena van Bommel. My contribution was developing the hidden Markov model, implementing the software and testing its performance. I wish to acknowledge the X-linked intellectual disabilities project team at MPIMG, including H.-Hilger Ropers, Vera Kalscheuer, Ruping Sun, Anne-Katrin Emde, Wei Chen, Hao Hu and Tomasz Zemojtel, who provided helpful discussions.

Chapter 4 resulted from a five-month visit to the group of Wolfgang Huber at EMBL in Heidelberg. Simon Anders proposed the idea of incorporating priors for dispersion and log fold change into the DESeq framework. My contribution was to implement these statistical methods as a new package, DESeq2, with closer integration with core Bioconductor packages. I would like to acknowledge all the members of the Huber group for helpful discussions.

Chapter 5 resulted from a collaboration with the Transcriptional Regulation Group of Sebastiaan Meijsing at the MPIMG. I would like to thank Stephan Starick, who initially proposed investigating the interaction between the glucocorticoid receptor and the chromatin landscape. My contribution was the statistical analysis presented in the chapter. Sebastiaan Meijsing provided valuable feedback during the evolution of the project. I also wish to acknowledge the contributions of Morgane Thomas-Chollier, Katja Borzym, Sam Cooper and Ho-Ryun Chung.