English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT
 
 
DownloadE-Mail
  Confidence measures for protein fold recognition

Sommer, I., Zien, A., von Ohsen, N., Zimmer, R., & Lengauer, T. (2002). Confidence measures for protein fold recognition. Bioinformatics, 18(6), 802-812. doi:10.1093/bioinformatics/18.6.802.

Item is

Files

show Files

Locators

show
hide
Description:
-
OA-Status:

Creators

show
hide
 Creators:
Sommer, I, Author
Zien, A1, Author           
von Ohsen , N, Author
Zimmer, R, Author
Lengauer, T, Author
Affiliations:
1External Organizations, ou_persistent22              

Content

show
hide
Free keywords: -
 Abstract: Motivation: We present an extensive evaluation of different methods and criteria to detect remote homologs of a given protein sequence. We investigate two associated problems: first, to develop a sensitive searching method to identify possible candidates and, second, to assign a confidence to the putative candidates in order to select the best one.

For searching methods where the score distributions are known, p-values are used as confidence measure with great success. For the cases where such theoretical backing is absent, we propose empirical approximations to p-values for searching procedures.

Results: As a baseline, we review the performances of different methods for detecting remote protein folds (sequence alignment and threading, with and without sequence profiles, global and local). The analysis is performed on a large representative set of protein structures.

For fold recognition, we find that methods using sequence profiles generally perform better than methods using plain sequences, and that threading methods perform better than sequence alignment methods.

In order to assess the quality of the predictions made, we establish and compare several confidence measures, including raw scores, z-scores, raw score gaps, z-score gaps, and different methods of p-value estimation. We work our way from the theoretically well backed local scores towards more explorative global and threading scores.

The methods for assessing the statistical significance of predictions are compared using specificity--sensitivity plots. For local alignment techniques we find that p-value methods work best, albeit computationally cheaper methods such as those based on score gaps achieve similar performance. For global methods where no theory is available methods based on score gaps work best.

By using the score gap functions as the measure of confidence we improve the more powerful fold recognition methods for which p-values are unavailable.
Availability: The benchmark set is available upon request.

Details

show
hide
Language(s):
 Dates: 2002-06
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: BibTex Citekey: 2499
DOI: 10.1093/bioinformatics/18.6.802
 Degree: -

Event

show

Legal Case

show

Project information

show

Source 1

show
hide
Title: Bioinformatics
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: -
Pages: - Volume / Issue: 18 (6) Sequence Number: - Start / End Page: 802 - 812 Identifier: -