Released

Journal Article

TADA – a Machine Learning Tool for Functional Annotation based Prioritisation of Putative Pathogenic CNVs

MPS-Authors
/persons/resource/persons247268

Hertzberg,  J.
Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

/persons/resource/persons50437

Mundlos,  S.
Research Group Development & Disease (Head: Stefan Mundlos), Max Planck Institute for Molecular Genetics, Max Planck Society;

/persons/resource/persons50613

Vingron,  M.
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

/persons/resource/persons228518

Gallone,  G.
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

External Resource
No external resources are shared
Fulltext (public)

Hertzberg_2020.pdf
(Preprint), 649KB

Supplementary Material (public)
There is no public supplementary material available
Citation

Hertzberg, J., Mundlos, S., Vingron, M., & Gallone, G. (2020). TADA – a Machine Learning Tool for Functional Annotation based Prioritisation of Putative Pathogenic CNVs. bioRxiv - The Preprint Server for Biology, 2020. doi:10.1101/2020.06.30.180711.


Cite as: https://hdl.handle.net/21.11116/0000-0007-B8F4-6
Abstract
The computational prediction of disease-associated genetic variation is of fundamental importance for the genomics, genetics and clinical research communities. Whereas the mechanisms and disease impact underlying coding single nucleotide polymorphisms (SNPs) and small insertions/deletions (InDels) have been the focus of intense study, little is known about the corresponding impact of structural variants (SVs), which are challenging to detect, phase and interpret. Few methods have been developed to prioritise larger chromosomal alterations such as copy number variants (CNVs) based on their pathogenicity. We address this issue with TADA, a method to prioritise pathogenic CNVs through manual filtering and automated classification, based on an extensive catalogue of functional annotation supported by rigorous enrichment analysis. We demonstrate that our machine-learning classifiers for deletions and duplications are able to accurately predict pathogenic CNVs (AUC: 0.8042 and 0.7869, respectively) and produce a well-calibrated pathogenicity score. Together, the enrichment analysis and the classification results suggest that prioritisation of pathogenic CNVs based on functional annotation is a promising approach to support clinical diagnostics and to further the understanding of the mechanisms that control the disease impact of larger genomic alterations.
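
The abstract reports two kinds of evidence for the classifiers: an AUC and a well-calibrated pathogenicity score. For readers unfamiliar with these measures, the following is a minimal sketch of how each can be computed from labels and scores. It is not TADA's implementation: the function names `roc_auc` and `calibration_table`, the bin count, and the toy labels and scores are all assumptions made for illustration, and scores are assumed to lie in [0, 1].

```python
from typing import List, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (pathogenic, non-pathogenic) pairs in which the
    pathogenic variant receives the higher score; ties count as one half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one example of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def calibration_table(
    labels: Sequence[int], scores: Sequence[float], n_bins: int = 5
) -> List[Tuple[float, float, int]]:
    """Group scores (assumed to lie in [0, 1]) into equal-width bins and return
    (mean score, observed pathogenic fraction, count) for each non-empty bin.
    For a well-calibrated score the first two values are close in every bin."""
    bins: List[List[Tuple[int, float]]] = [[] for _ in range(n_bins)]
    for y, s in zip(labels, scores):
        idx = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 goes in the last bin
        bins[idx].append((y, s))
    rows = []
    for members in bins:
        if members:
            mean_score = sum(s for _, s in members) / len(members)
            frac_pos = sum(y for y, _ in members) / len(members)
            rows.append((mean_score, frac_pos, len(members)))
    return rows


# Hypothetical toy data: 1 = pathogenic, 0 = non-pathogenic (not from the paper).
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1]

if __name__ == "__main__":
    print("AUC:", roc_auc(toy_labels, toy_scores))
    for mean_score, frac_pos, count in calibration_table(toy_labels, toy_scores, n_bins=2):
        print(f"mean score {mean_score:.2f}, observed fraction {frac_pos:.2f}, n={count}")
```

The pairwise loop is quadratic in the number of variants, which is acceptable for a sketch; a rank-based computation or an established library routine would be the usual choice on real data.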