  Statistical models to capture protein-RNA interaction footprints from truncation-based CLIP-seq data

Krakau, S. (2019). Statistical models to capture protein-RNA interaction footprints from truncation-based CLIP-seq data. PhD Thesis. doi:10.17169/refubium-26166.

Creators

Creators:
Krakau, Sabrina (1, 2), Author
Marsico, Annalisa (3), Referee
Affiliations:
1. Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society
2. Fachbereich Mathematik und Informatik der Freien Universität Berlin
3. RNA Bioinformatics (Annalisa Marsico), Independent Junior Research Groups (OWL), Max Planck Institute for Molecular Genetics, Max Planck Society

Content

Free keywords: protein-RNA interaction; iCLIP; eCLIP; CLIP-seq; Hidden Markov Model; HMM; Bioinformatics; Statistical Modelling
Abstract: Protein-RNA interactions play an important role in all post-transcriptional regulatory processes. High-throughput detection of protein-RNA interactions has been facilitated by the emerging CLIP-seq (crosslinking and immunoprecipitation combined with high-throughput sequencing) techniques. Enrichments in mapped reads, as well as base transitions or deletions at crosslink sites, can be used to infer binding regions. Single-nucleotide resolution has been achieved by techniques (iCLIP and eCLIP) that capture high fractions of cDNAs truncated at protein-RNA crosslink sites. Increasing numbers of datasets and derivatives of these protocols have been published in recent years, requiring tailored computational analyses. Existing methods, however, do not explicitly model the specifics of truncation patterns or the possible biases caused by background binding and crosslinking sequence preferences. We present PureCLIP, a hidden Markov model based approach that simultaneously performs peak calling and individual crosslink site detection. It can incorporate external data to correct for non-specific background signals and, for the first time, for crosslinking biases. We devised a comprehensive evaluation based on three strategies. Firstly, we developed a workflow to simulate iCLIP data, which starts from real RNA-seq data and known binding regions and then mimics the experimental steps of the iCLIP protocol, including the generation of background signals. Secondly, we evaluated experimental iCLIP and eCLIP datasets against the proteins' known predominant binding regions. Thirdly, we assessed the agreement of called sites between replicates, assuming that target-specific signals are reproducible between replicates. On both simulated and real data, PureCLIP is consistently more precise in calling crosslink sites than other state-of-the-art methods.
In particular, when incorporating input control data and crosslink-associated motifs (CL-motifs), PureCLIP is up to 13% more precise than other methods, and we show that its agreement across replicates is up to 20% higher. Moreover, our method can optionally merge called crosslink sites into binding regions based on their distance, and we show that the resulting regions reflect the known binding regions at high resolution. Additionally, we demonstrate that our method robustly achieves high precision over a range of different settings and performs well for proteins with different binding characteristics. Lastly, we extended the method to include individual CLIP replicates and show that this can boost the precision even further. PureCLIP and its documentation are publicly available at https://github.com/skrakau/PureCLIP.
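
The optional step of merging called crosslink sites into binding regions based on their distance can be illustrated with a minimal sketch. This is not PureCLIP's implementation: the function name `merge_sites`, the tuple layout, and the `max_gap` value are assumptions made for illustration only. Sites on the same chromosome and strand are joined whenever the gap to the previous site does not exceed `max_gap`.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layouts, chosen for this sketch only.
Site = Tuple[str, str, int]          # (chromosome, strand, position)
Region = Tuple[str, str, int, int]   # (chromosome, strand, start, end), end inclusive


def merge_sites(sites: Iterable[Site], max_gap: int = 8) -> List[Region]:
    """Merge crosslink sites into regions by distance.

    Two consecutive sites on the same chromosome and strand end up in the
    same region if their positions differ by at most `max_gap`.
    The default of 8 is an arbitrary illustrative value.
    """
    regions: List[Region] = []
    # Sorting groups sites by chromosome and strand, then orders by position.
    for chrom, strand, pos in sorted(sites):
        if regions:
            r_chrom, r_strand, r_start, r_end = regions[-1]
            same_track = (r_chrom, r_strand) == (chrom, strand)
            if same_track and pos - r_end <= max_gap:
                # Extend the current region to include this site.
                regions[-1] = (r_chrom, r_strand, r_start, max(r_end, pos))
                continue
        # Otherwise start a new single-site region.
        regions.append((chrom, strand, pos, pos))
    return regions


example_sites = [
    ("chr1", "+", 100),
    ("chr1", "+", 104),
    ("chr1", "+", 120),
    ("chr1", "-", 105),
    ("chr2", "+", 50),
]
example_regions = merge_sites(example_sites, max_gap=8)
```

In this example the first two sites on the plus strand of `chr1` fall into one region, while the site at position 120 and the sites on other strands or chromosomes each form their own region. A smaller `max_gap` keeps regions closer to single-nucleotide resolution, and a larger one yields broader regions.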

Details

Language(s): eng - English
 Dates: 2019; 2020-01-15
 Publication Status: Published online
 Pages: viii, 167 pp.
 Degree: PhD
