
Record


Released

Thesis

Inferring Transcriptional Regulators Using Clustered Multi-Task Regression

MPG Authors
/persons/resource/persons185321

Heinen, Tobias
International Max Planck Research School, MPI for Informatics, Max Planck Society;

/persons/resource/persons127666

Schulz, Marcel Holger
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society;

/persons/resource/persons180775

Marschall, Tobias
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society;

External Resources
No external resources are available
Full Texts (freely accessible)
No freely accessible full texts are available
Supplementary Material (freely accessible)
No freely accessible supplementary materials are available
Citation

Heinen, T. (2018). Inferring Transcriptional Regulators Using Clustered Multi-Task Regression. Master Thesis, Universität des Saarlandes, Saarbrücken.


Citation link: http://hdl.handle.net/21.11116/0000-0002-B37A-B
Abstract
Sparse linear regression is often used to identify key transcriptional regulators by predicting gene expression abundance from regulatory features such as transcription factor (TF) binding or epigenomics data. However, a single linear model explaining the gene expression of thousands of genes is limited in capturing the complexity of cis-regulatory modules and gene co-expression patterns. Indeed, certain TFs are known to act as either activators or repressors depending on associated cofactors and neighbouring DNA-bound proteins. It is therefore desirable to identify clusters or modules of co-regulated genes and model their regulatory profiles separately. Finite mixtures of regression models are a popular tool for modeling heterogeneous data while maintaining a linearity assumption. Unfortunately, they do not take advantage of available data sets containing the molecular profiles of many biological samples. We propose to combine the power of mixture modeling and multi-task learning by using a penalized maximum likelihood framework for inferring gene modules and regulators in multiple samples simultaneously. More specifically, we regularize the likelihood function with a tree-structured L1/L2 penalty to enable knowledge transfer between models of related cells. We optimize the parameters of our models with a generalized EM algorithm. Experimental evaluation of our method on synthetic data suggests that multi-task mixture modeling is better suited to identifying the true underlying cluster structure than a single-task regression mixture model. Finally, we apply the model to a dataset from the BLUEPRINT project consisting of various types of haematopoietic cells and uncover interesting regulatory patterns.
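
To make the baseline mentioned in the abstract concrete, the following is a minimal sketch of a single-task finite mixture of linear regressions fitted with EM. It is not the thesis implementation: it omits the tree-structured L1/L2 penalty and the multi-task coupling across samples, and uses only a tiny ridge term for numerical stability. The function name `fit_regression_mixture`, its parameters, and the synthetic data are all hypothetical and chosen for illustration.

```python
import numpy as np


def fit_regression_mixture(X, y, n_components=2, n_iter=100, ridge=1e-6, seed=0):
    """EM for a mixture of linear regressions (illustrative sketch only).

    Assumptions: Gaussian noise per component, no intercept, no sparsity
    penalty, a single task. Returns coefficients, noise variances,
    mixing weights, and per-gene responsibilities.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Random soft cluster assignments as the starting point.
    resp = rng.dirichlet(np.ones(n_components), size=n)  # shape (n, K)
    coefs = np.zeros((n_components, d))
    sigma2 = np.ones(n_components)
    weights = np.full(n_components, 1.0 / n_components)

    for _ in range(n_iter):
        # M-step: weighted least squares and noise variance per component.
        for k in range(n_components):
            r = resp[:, k]
            A = X.T @ (r[:, None] * X) + ridge * np.eye(d)
            b = X.T @ (r * y)
            coefs[k] = np.linalg.solve(A, b)
            resid_k = y - X @ coefs[k]
            sigma2[k] = max((r @ resid_k**2) / max(r.sum(), 1e-12), 1e-8)
        weights = resp.mean(axis=0)

        # E-step: posterior probability of each component for each gene.
        resid = y[:, None] - X @ coefs.T  # shape (n, K)
        log_p = (
            np.log(weights + 1e-300)
            - 0.5 * np.log(2.0 * np.pi * sigma2)
            - 0.5 * resid**2 / sigma2
        )
        log_p -= log_p.max(axis=1, keepdims=True)  # stabilise before exp
        resp = np.exp(log_p)
        resp /= resp.sum(axis=1, keepdims=True)

    return coefs, sigma2, weights, resp


# Synthetic example: one regulator that activates one gene module
# and represses the other (made-up data, not from the thesis).
demo_rng = np.random.default_rng(42)
X_demo = demo_rng.normal(size=(400, 1))
module = demo_rng.integers(0, 2, size=400)
y_demo = np.where(module == 0, 3.0, -3.0) * X_demo[:, 0]
y_demo = y_demo + 0.1 * demo_rng.normal(size=400)

coefs, sigma2, weights, resp = fit_regression_mixture(X_demo, y_demo)
print("estimated coefficients per module:", coefs.ravel())
```

A single global regression on this synthetic data would estimate a coefficient near zero, because the activating and repressing effects cancel; the mixture fits one coefficient per module instead. In the multi-task setting described in the abstract, the M-step would additionally be regularized so that models of related cell types share information.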