  Integration of multi-omics data with graph convolutional networks to identify cancer-associated genes

Schulte-Sasse, R. (2020). Integration of multi-omics data with graph convolutional networks to identify cancer-associated genes. PhD Thesis. doi:10.17169/refubium-31047.

Creators

 Creators:
Schulte-Sasse, Roman (1, 2), Author
Marsico, Annalisa (3), Referee
Affiliations:
(1) Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_1433547
(2) Fachbereich Mathematik und Informatik der Freien Universität Berlin, ou_persistent22
(3) RNA Bioinformatics (Annalisa Marsico), Independent Junior Research Groups (OWL), Max Planck Institute for Molecular Genetics, Max Planck Society, ou_2117285

Content

Free keywords: cancer; machine learning; graph convolutional networks; bioinformatics; deep learning
 Abstract: Cancer is thought to arise from the accumulation of genetic changes in the DNA of the patient. Mutations can occur during replication of cells or from external factors. Given the current knowledge of gene regulation, it is not yet possible to link cancer phenotypes directly to the genetic alterations. Despite the vast increase of available high-throughput molecular data, the in silico identification of disease genes for multi-factorial diseases such as cancer is still a challenging task. Perturbation of entire modules in cellular networks, as well as genetic and non-genetic gene alterations, contribute to tumorigenesis. This necessitates the development of predictive models able to effectively integrate and process different data modalities. Most approaches cannot combine multi-dimensional molecular data with gene-gene interactions, and the few methods that achieve that are hard to interpret. In this thesis, I introduce EMOGI, an explainable machine learning method based on Graph Convolutional Networks (GCNs) to predict cancer genes by combining multi-omics data, such as mutations, copy number changes, DNA methylation and gene expression profiles across different cancers, together with Protein-Protein Interaction (PPI) networks. By profiting from different data representations, EMOGI was more accurate than previous methods in predicting known cancer genes, with an average increase in area under the precision-recall curve of 3% – 37% across different PPI networks and data sets. We applied the Layer-Wise Relevance Propagation (LRP) technique to learn the molecular features that contributed to the classification of each individual cancer gene. We also identified relevant cancer modules in the PPI network, and stratified genes according to whether their classification was mainly driven by the interactome, mutation rate or alterations in either DNA methylation or gene expression.
We propose a new high-confidence list of 165 putative novel cancer genes which do not harbour recurrent alterations, but rather participate in PPIs with well-known cancer drivers. We functionally validated those novel predictions with publicly available loss-of-function screens. We believe that our results might open new diagnostic and therapeutic avenues in precision oncology, and that our method can be applied to predict biomarkers for other complex diseases.

Details

Language(s): eng - English
 Dates: 2020; 2021-09-30
 Publication Status: Published online
 Pages: ix, 190 p.
 Degree: PhD
