  Beware of circularity: A critical assessment of the state of the art in deleteriousness prediction of missense variants

Azencott, C., Grimm, D., Smoller, J., Duncan, L., & Borgwardt, K. (2014). Beware of circularity: A critical assessment of the state of the art in deleteriousness prediction of missense variants. In 64th Annual Meeting of the American Society of Human Genetics (ASHG 2014) (pp. 56).

Basic

Genre: Meeting Abstract

Creators

 Creators:
Azencott, CA, Author
Grimm, D (1), Author
Smoller, JW, Author
Duncan, L, Author
Borgwardt, K (1), Author
Affiliations:
(1) Department Molecular Biology, Max Planck Institute for Developmental Biology, Max Planck Society, ou_3375790

Content

Free keywords: -
Abstract: Discrimination between disease-causing missense mutations and neutral polymorphisms is a key challenge in current sequencing studies. It is therefore critical to be able to evaluate fairly and without bias the performance of the many in silico predictors of deleteriousness. However, current analyses of such tools and their combinations are liable to suffer from the effects of circularity, which occurs when predictors are evaluated on data that are not independent from those that were used to build them, and may lead to overly optimistic results. Circularity can first stem from the overlap between training and evaluation datasets, which may result in the well-studied phenomenon of overfitting: a tool that is too tailored to a given dataset will be more likely than others to perform well on that set, but incurs the risk of failing more heavily at classifying novel variants. Second, we find that circularity may result from an investigation bias in the way mutation databases are populated: in most cases, all the variants of the same protein are annotated with the same (neutral or pathogenic) status. Furthermore, proteins containing only deleterious SNVs comprise many more labeled variants than their counterparts containing only neutral SNVs. Ignoring this, we find that assigning a variant the same status as that of its closest variant on the genomic sequence outperforms all state-of-the-art tools. Given these barriers to valid assessment of the performance of deleteriousness prediction tools, we employ approaches that avoid circularity, and hence provide independent evaluation of ten state-of-the-art tools and their combinations. Our detailed analysis provides scientists with critical insights to guide their choice of tool as well as the future development of new methods for deleteriousness prediction. In particular, we demonstrate that the performance of FatHMM-W relies mostly on the knowledge of the labels of neighboring variants, which may hinder its ability to annotate variants in the less explored regions of the genome. We also find that PolyPhen2 performs as well or better than all other tools at discriminating between cases and controls in a novel autism-relevant dataset. Based on our findings about the mutation databases available for training deleteriousness prediction tools, we predict that retraining PolyPhen2 features on the Varibench dataset will yield even better performance, and we show that this is true for the autism-relevant dataset.
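
The second source of circularity described in the abstract can be illustrated with a small sketch. It is not the authors' code: the toy records, the function names, and the use of a position within a protein as a stand-in for "closest variant on the genomic sequence" are all assumptions made for illustration. The sketch shows a nearest-variant baseline that copies the label of the closest labeled variant, and a split that keeps all variants of a protein on the same side, so that the baseline has no neighboring labels left to copy.

```python
from collections import defaultdict
import random

# Hypothetical toy records: (protein_id, position, label).
# label 1 = pathogenic, 0 = neutral. Values are invented for illustration.
VARIANTS = [
    ("P1", 10, 1), ("P1", 25, 1), ("P1", 40, 1),
    ("P2", 5, 0), ("P2", 90, 0),
    ("P3", 12, 1), ("P3", 30, 0),
]


def nearest_variant_label(query, labeled):
    """Label of the closest labeled variant in the same protein, or None.

    The query itself is skipped so it cannot vote for its own label.
    """
    protein, pos, _ = query
    same = [v for v in labeled if v[0] == protein and v is not query]
    if not same:
        return None
    return min(same, key=lambda v: abs(v[1] - pos))[2]


def leave_one_out_accuracy(variants):
    """Accuracy of the baseline when every other variant's label is visible."""
    hits = total = 0
    for v in variants:
        pred = nearest_variant_label(v, variants)
        if pred is None:
            continue
        total += 1
        hits += int(pred == v[2])
    return hits / total


def split_by_protein(variants, test_fraction=0.3, seed=0):
    """Split so that no protein contributes variants to both train and test."""
    by_protein = defaultdict(list)
    for v in variants:
        by_protein[v[0]].append(v)
    proteins = sorted(by_protein)
    random.Random(seed).shuffle(proteins)
    n_test = max(1, round(len(proteins) * test_fraction))
    test_ids = set(proteins[:n_test])
    train = [v for v in variants if v[0] not in test_ids]
    test = [v for v in variants if v[0] in test_ids]
    return train, test


if __name__ == "__main__":
    print("leave-one-out accuracy:", leave_one_out_accuracy(VARIANTS))
    train, test = split_by_protein(VARIANTS)
    # With a protein-level split the baseline finds no neighbor to copy from.
    print([nearest_variant_label(q, train) for q in test])
```

On data where most proteins carry a single label, the leave-one-out score of such a baseline is high without any biological signal being used, whereas under the protein-level split it cannot make a prediction at all. That contrast is the reason a grouped split gives a more independent estimate in this sketch.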

Details

Language(s):
 Dates: 2014-10
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: -
 Degree: -

Event

Title: 64th Annual Meeting of the American Society of Human Genetics (ASHG 2014)
Place of Event: San Diego, CA, USA
Start-/End Date: 2014-10-18 - 2014-10-24

Source 1

Title: 64th Annual Meeting of the American Society of Human Genetics (ASHG 2014)
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: 167
Start / End Page: 56
Identifier: -