Robust Normalization of Next Generation Sequencing Data

Helmuth, Johannes

doi:10.17169/refubium-6942

Helmuth, J. (2017). Robust Normalization of Next Generation Sequencing Data. PhD Thesis. doi:10.17169/refubium-6942.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-0000-82A5-2
Version Permalink: https://hdl.handle.net/21.11116/0000-000F-13E2-C

Genre: Thesis

Creators

Creators:
Helmuth, Johannes¹,², Author
Vingron, Martin³, Referee

Affiliations:
¹ Computational Epigenetics (Ho-Ryun Chung), Independent Junior Research Groups (OWL), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_1479658
² Fachbereich Mathematik und Informatik der Freien Universität Berlin, ou_persistent22
³ Transcriptional Regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_1479639

Content

Free keywords: Normalization of read count data; Enrichment Calling; Difference Calling; ChIP-seq; RNA-seq; ATAC-seq; STARR-seq

Abstract: Molecular biology pertains to the molecular basis of the regulation of biomolecular processes in the cell, e.g. gene expression or the genome-wide localization of DNA-associated proteins. These molecular quantities are routinely measured by Next Generation Sequencing (NGS)-based techniques due to their genome-wide scalability and cost-efficiency. In order to discern background regions from genomic loci that harbor a biologically relevant signal, i.e. difference calling, the NGS measurements need to be corrected for technical biases with the help of a control, i.e. normalization. However, the normalization itself requires the knowledge of background regions and, consequently, difference calling and normalization are inseparable. Here, this problem is solved by the data-driven “normR” framework, which models the interdependency of NGS measurements in background and signal regions as a multinomial sampling trial with a binomial mixture model. The robust normR normalization accounts for the effect of signal on the overall measurement statistic by modeling treatment and control simultaneously. In this thesis, I used normR in three studies concerning the inference of DNA-protein binding from ChIP-seq data. Firstly, the two-component “enrichR” model is shown to achieve a more sensitive enrichment calling (AUC≥0.93) than six competitor methods (AUC≤0.86) in low, e.g. H3K36me3, and high, e.g. H3K4me3, signal-to-noise ratio (S/N) ChIP-seq data. enrichR’s enrichment calls augment the resolution and comprehensiveness of chromatin segmentations by chromHMM, and its normalization improves on present in silico and in vitro ChIP-seq normalization methods. Secondly, the three-component “regimeR” model dissects enrichment into two unprecedented regimes of different signal levels. A regimeR-based analysis identified two distinct facultative and constitutive heterochromatic enrichment regimes in H3K27me3 and H3K9me3 ChIP-seq data, respectively. The identified peak regions (high enrichment) resemble nucleation sites for heterochromatin embedded in regions of broad (low) enrichment. Lastly, the three-component “diffR” model calls conditional differences in ChIP-seq enrichment between two conditions. The diffR calls in low (H3K27me3) and high (H3K4me3) S/N ChIP-seq data are confirmed by a systematic comparison to four difference callers. Overall, normR represents a robust and versatile framework for the comprehensive analysis of ChIP-seq data, yet it can be readily applied to other NGS-based experiments like ATAC-seq, STARR-seq or RNA-seq.
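The abstract describes a two-component binomial mixture fitted jointly to treatment and control counts. As a rough illustration of that idea only, the sketch below runs a plain expectation-maximization (EM) fit in Python. It is not the normR implementation: the function name `fit_binomial_mixture`, its parameters, the initialization, and the assumption that the treatment count in each bin is binomial given the bin's total count are all choices made here for illustration.

```python
import math


def fit_binomial_mixture(treatment, control, n_iter=200, eps=1e-9):
    """Hypothetical EM sketch for a two-component binomial mixture.

    Assumption (not taken from the thesis): for each bin, the treatment
    count s given the total n = s + r follows Binomial(n, theta_j), where
    component j is background (j = 0) or enriched (j = 1).
    Returns (pi, theta, posterior_enriched).
    """
    pairs = list(zip(treatment, control))
    total_s = sum(s for s, _ in pairs)
    total_n = sum(s + r for s, r in pairs)
    if total_n == 0:
        raise ValueError("all bins are empty")

    # Start both components around the pooled treatment fraction.
    pooled = min(max(total_s / total_n, eps), 1 - eps)
    theta = [max(pooled * 0.9, eps), min(pooled * 1.1 + 0.05, 1 - eps)]
    pi = [0.5, 0.5]

    resp = []
    for _ in range(n_iter):
        # E-step: posterior membership per bin. The binomial coefficient
        # is shared by both components and cancels, so it is omitted.
        resp = []
        for s, r in pairs:
            logw = [
                math.log(pi[j])
                + s * math.log(theta[j])
                + r * math.log(1 - theta[j])
                for j in range(2)
            ]
            m = max(logw)
            w = [math.exp(v - m) for v in logw]
            tot = sum(w)
            resp.append([v / tot for v in w])

        # M-step: update mixture proportions and success probabilities.
        for j in range(2):
            zj = sum(z[j] for z in resp)
            pi[j] = min(max(zj / len(pairs), eps), 1 - eps)
            num = sum(z[j] * s for z, (s, _) in zip(resp, pairs))
            den = sum(z[j] * (s + r) for z, (s, r) in zip(resp, pairs))
            if den > 0:
                theta[j] = min(max(num / den, eps), 1 - eps)

    # Report the component with the smaller theta as background.
    if theta[0] > theta[1]:
        theta.reverse()
        pi.reverse()
        resp = [[z[1], z[0]] for z in resp]
    return pi, theta, [z[1] for z in resp]
```

Calling `fit_binomial_mixture(treatment_counts, control_counts)` on two equally long lists of per-bin read counts returns the mixture proportions, the two fitted treatment fractions, and a per-bin posterior probability of belonging to the component with the higher fraction. Because background and signal are fitted together, the background fraction is estimated without first having to decide which bins are background, which is the interdependency the abstract refers to.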

Details

Language(s): eng - English

Dates: Accepted: 2017; Published Online: 2017-06-28

Publication Status: Published online

Pages: xii, 145 pp.

Identifiers: DOI: 10.17169/refubium-6942
URI: https://refubium.fu-berlin.de/handle/fub188/2741

Degree: PhD
