




Support vector machines for protein fold class prediction


Markowetz,  Florian
Max Planck Society;


Vingron,  Martin
Gene regulation (Martin Vingron), Dept. of Computational Molecular Biology (Head: Martin Vingron), Max Planck Institute for Molecular Genetics, Max Planck Society;


Markowetz, F., Edler, L., & Vingron, M. (2003). Support vector machines for protein fold class prediction. Biometrical Journal, 45(3), 377-389. doi:10.1002/bimj.200390019.

Knowledge of the three-dimensional structure of a protein is essential for describing and understanding its function. Today, the number of known protein sequences far exceeds the number of identified structures. Thus, the need arises to predict structure from sequence without time-consuming experimental identification. In this paper the performance of Support Vector Machines (SVMs) is compared to Neural Networks and to standard statistical classification methods such as Discriminant Analysis and Nearest Neighbor Classification. We show that SVMs can outperform the competing methods on a dataset of 268 protein sequences to be classified into a set of 42 fold classes. We discuss misclassification with respect to biological function and similarity. In a second step we examine the performance of SVMs when the embedding is varied from frequencies of single amino acids to frequencies of triplets of amino acids. This work shows that SVMs provide a promising alternative to standard statistical classification and prediction methods in functional genomics.
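
The embedding mentioned in the abstract, from single amino acid frequencies up to triplet frequencies, can be illustrated with a short sketch. This is not the authors' code: the function name `kmer_frequencies`, the 20-letter alphabet ordering, the use of overlapping windows, and the choice to skip windows containing non-standard letters are all assumptions made for illustration. The resulting vectors (length 20 for k=1, 400 for k=2, 8000 for k=3) are the kind of fixed-length input a classifier such as an SVM would need.

```python
from itertools import product

# Assumption: the 20 standard amino acids in one-letter code, alphabetical order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def kmer_frequencies(sequence, k=1):
    """Return relative frequencies of all 20**k k-mers as a fixed-order list.

    Hypothetical helper: k=1 gives single amino acid frequencies,
    k=2 pairs, k=3 triplets. Windows overlap and are counted with stride 1.
    """
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    index = {kmer: i for i, kmer in enumerate(kmers)}
    counts = [0] * len(kmers)
    total = 0
    for start in range(len(sequence) - k + 1):
        window = sequence[start:start + k]
        position = index.get(window)
        if position is not None:  # skip windows with non-standard letters
            counts[position] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid window: return an all-zero vector.
        return [0.0] * len(kmers)
    return [count / total for count in counts]


if __name__ == "__main__":
    toy_sequence = "ACAC"  # toy input, not taken from the paper's dataset
    single = kmer_frequencies(toy_sequence, k=1)
    pairs = kmer_frequencies(toy_sequence, k=2)
    print(len(single), len(pairs))
```

The dimension grows as 20^k, so with only 268 sequences the triplet embedding has far more features than training examples, which is one reason a margin-based method is a natural candidate for this setting.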