Help Privacy Policy Disclaimer
  Advanced SearchBrowse




Journal Article

Support vector machines for protein fold class prediction


Markowetz,  Florian
Max Planck Society;


Vingron,  Martin
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;

External Resource
No external resources are shared
Fulltext (public)
There are no public fulltexts stored in PuRe
Supplementary Material (public)
There is no public supplementary material available

Markowetz, F., Edler, L., & Vingron, M. (2003). Support vector machines for protein fold class prediction. Biometrical Journal, 45(3), 377-389. doi:10.1002/bimj.200390019.

Cite as: http://hdl.handle.net/11858/00-001M-0000-0010-8B17-5
Knowledge of the three-dimensional structure of a protein is essential for describing and understanding its function. Today, a large number of known protein sequences faces a small number of identified structures. Thus, the need arises to predict structure from sequence without using time-consuming experimental identification. In this paper the performance of Support Vector Machines (SVMs) is compared to Neural Networks and to standard statistical classification methods as Discriminant Analysis and Nearest Neighbor Classification. We show that SVMs can beat the competing methods on a dataset of 268 protein sequences to be classified into a set of 42 fold classes. We discuss misclassification with respect to biological function and similarity. In a second step we examine the performance of SVMs if the embedding is varied from frequencies of single amino acids to frequencies of tripletts of amino acids. This work shows that SVMs provide a promising alternative to standard statistical classification and prediction methods in functional genomics.