Thesis

Using interpretable machine learning to understand gene silencing dynamics during X-chromosome inactivation

MPS-Authors
Barros de Andrade e Sousa, Lisa
Regulatory Networks in Stem Cells (Edda G. Schulz), Independent Junior Research Groups (OWL), Max Planck Institute for Molecular Genetics, Max Planck Society;
Fachbereich Mathematik und Informatik der Freien Universität Berlin;

Citation

Barros de Andrade e Sousa, L. (2020). Using interpretable machine learning to understand gene silencing dynamics during X-chromosome inactivation. PhD Thesis. doi:10.17169/refubium-28944.


Cite as: https://hdl.handle.net/21.11116/0000-0008-8995-5
Abstract
To equalize gene dosage between the sexes, the long non-coding RNA Xist mediates chromosome-wide gene silencing of one X chromosome in female mammals, a process known as X chromosome inactivation (XCI). The efficiency of gene silencing varies widely across genes, and some genes even escape XCI in somatic cells. A gene’s susceptibility to Xist-mediated silencing appears to be determined by a complex interplay of epigenetic and genomic features, but the underlying rules remain poorly understood. To advance the understanding of Xist-mediated silencing pathways, chromosome-wide gene silencing dynamics were quantified at the level of the nascent transcriptome using allele-specific Precision nuclear Run-On sequencing (PRO-seq). We developed a Random Forest machine learning model that predicts the measured silencing dynamics from a large set of epigenetic and genomic features, and we tested its predictive power experimentally. We then introduced a forest-guided clustering approach to uncover the combinatorial rules that control Xist-mediated gene silencing. The results suggest that genomic distance to the Xist locus, followed by gene density and distance to LINE elements, are the prime determinants of silencing velocity. Moreover, a series of features associated with active transcriptional elongation and chromatin 3D structure are enriched at efficiently silenced genes. In general, silenced genes appear to fall into two distinct groups associated with different silencing pathways. One group requires an AT-rich sequence context and the Xist repeat-A, which is known to activate the SPEN pathway, for silencing. The other group comprises genes that are pre-marked by polycomb complexes and tend to rely on the Xist repeat-B, which is known to recruit polycomb complexes during XCI. Our machine learning approach can thus uncover the complex combinatorial rules underlying gene silencing during X chromosome inactivation.
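The abstract does not spell out how the forest-guided clustering works, so the following is only an illustrative sketch of one common building block for clustering with a trained Random Forest: a proximity matrix, in which two samples (here, genes) count as similar when they land in the same leaf in many trees. The function names and the toy leaf table are hypothetical and are not taken from the thesis. In practice, a leaf table of this shape could be obtained from a fitted scikit-learn forest via its `apply` method.

```python
from typing import List, Sequence


def proximity_matrix(leaf_ids: Sequence[Sequence[int]]) -> List[List[float]]:
    """Fraction of trees in which each pair of samples shares a leaf.

    leaf_ids[i][t] is the index of the leaf that sample i reaches in tree t.
    """
    n_samples = len(leaf_ids)
    n_trees = len(leaf_ids[0])
    prox = [[0.0] * n_samples for _ in range(n_samples)]
    for i in range(n_samples):
        for j in range(i, n_samples):
            # Count the trees where both samples end in the same leaf.
            shared = sum(1 for a, b in zip(leaf_ids[i], leaf_ids[j]) if a == b)
            prox[i][j] = prox[j][i] = shared / n_trees
    return prox


def distance_matrix(prox: Sequence[Sequence[float]]) -> List[List[float]]:
    """Turn proximities into distances (1 - proximity) for a clustering step."""
    return [[1.0 - p for p in row] for row in prox]


# Hypothetical toy input: 4 genes, 3 trees (values are made-up leaf indices).
toy_leaf_ids = [
    [1, 4, 2],
    [1, 4, 3],
    [5, 6, 7],
    [5, 6, 2],
]
toy_prox = proximity_matrix(toy_leaf_ids)
toy_dist = distance_matrix(toy_prox)
```

The resulting distance matrix can be passed to any clustering method that accepts precomputed distances. Which clustering algorithm and which cluster-quality criteria the thesis uses is not stated in the abstract.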