Journal Article

Prediction of protein functional residues from sequence by probability density estimation

MPS-Authors
Fischer,  JD
Department Protein Evolution, Max Planck Institute for Developmental Biology, Max Planck Society;

Mayer,  CE
Department Protein Evolution, Max Planck Institute for Developmental Biology, Max Planck Society;

Söding,  J
Department Protein Evolution, Max Planck Institute for Developmental Biology, Max Planck Society;

Citation

Fischer, J., Mayer, C., & Söding, J. (2008). Prediction of protein functional residues from sequence by probability density estimation. Bioinformatics, 24(5), 613-620. doi:10.1093/bioinformatics/btm626.


Cite as: https://hdl.handle.net/21.11116/0000-000A-F2AA-5
Abstract
Motivation: The prediction of ligand-binding residues or catalytically active residues of a protein may give important hints that can guide further genetic or biochemical studies. Existing sequence-based prediction methods mostly rank residue positions by evolutionary conservation calculated from a multiple sequence alignment of homologs. Wider application of these methods is hampered by their low per-residue precision, which at 20% sensitivity is around 35% for ligand-binding residues and 20% for catalytic residues.
Results: We combine information from the conservation at each site, its amino acid distribution, and its predicted secondary structure (ss) and relative solvent accessibility (rsa). First, we measure conservation by how much the amino acid distribution at each site differs from the distribution expected for the predicted ss and rsa states. Second, we include the conservation of neighboring residues in a weighted linear score by analytically optimizing the signal-to-noise ratio of the total score. Third, we use conditional probability density estimation to calculate the probability that each site is functional given its conservation, the observed amino acid distribution, and the predicted ss and rsa states. We have constructed two large data sets, one based on the Catalytic Site Atlas and the other on PDB SITE records, to benchmark methods for predicting functional residues. The new method, FRcons, predicts ligand-binding and catalytic residues with higher precision than alternative methods over the entire sensitivity range, reaching 50% and 40% precision at 20% sensitivity, respectively.
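
Illustration: the first two steps described in the abstract can be sketched roughly as follows. This is not the FRcons implementation. The abstract does not name the divergence measure, so relative entropy (Kullback-Leibler divergence) is assumed here. The background distribution is simply passed in, where the paper would use one expected for the predicted ss and rsa states, and the window weights are hypothetical placeholders rather than the analytically optimized ones. All function names are invented for this sketch.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1.0):
    """Amino acid frequencies for one alignment column.

    Additive smoothing (an assumption of this sketch) keeps every
    frequency strictly positive; gaps and unknown symbols are skipped.
    """
    counts = {aa: pseudocount for aa in AMINO_ACIDS}
    for residue in column:
        if residue in counts:
            counts[residue] += 1.0
    total = sum(counts.values())
    return {aa: c / total for aa, c in counts.items()}


def divergence_from_background(observed, background):
    """Relative entropy (in bits) of the observed distribution
    with respect to the background distribution."""
    return sum(
        p * math.log2(p / background[aa])
        for aa, p in observed.items()
        if p > 0.0
    )


def windowed_score(scores, weights):
    """Weighted linear combination of each site's score with its neighbors'.

    `weights` has odd length and is centered on the site. Window positions
    that fall outside the sequence are skipped and the remaining weights
    renormalized (assumes positive weights).
    """
    half = len(weights) // 2
    result = []
    for i in range(len(scores)):
        numerator = 0.0
        denominator = 0.0
        for k, w in enumerate(weights):
            j = i + k - half
            if 0 <= j < len(scores):
                numerator += w * scores[j]
                denominator += w
        result.append(numerator / denominator)
    return result


# Usage with a uniform background (hypothetical; a state-specific
# background would be substituted per site).
uniform = {aa: 1.0 / len(AMINO_ACIDS) for aa in AMINO_ACIDS}
columns = ["AAAAAAAA", "ACDEFGHI", "GGGGGGGA"]
per_site = [
    divergence_from_background(column_distribution(col), uniform)
    for col in columns
]
smoothed = windowed_score(per_site, [0.25, 0.5, 0.25])
```

The third step, estimating the probability that a site is functional given these features, would require labeled training data such as the benchmark sets mentioned above and is not sketched here.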