Support Vector Machines and Kernels for Computational Biology

Ben-Hur, A; Ong, CS; Sonnenburg, S; Schölkopf, B; Rätsch, G

doi:10.1371/journal.pcbi.1000173

Journal Article

MPG Authors

Ong, CS
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Friedrich Miescher Laboratory, Max Planck Society;

Sonnenburg, S
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Schölkopf, B
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;

Rätsch, G
Friedrich Miescher Laboratory, Max Planck Society;

External Resources

https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1000173&type=printable
(Publisher version)

Citation

Ben-Hur, A., Ong, C., Sonnenburg, S., Schölkopf, B., & Rätsch, G. (2008). Support Vector Machines and Kernels for Computational Biology. PLoS Computational Biology, 4(10), 1-10. doi:10.1371/journal.pcbi.1000173.

Citation link: https://hdl.handle.net/11858/00-001M-0000-0013-C6B3-5

Abstract

The increasing wealth of biological data coming from a large variety of platforms, together with the continued development of new high-throughput methods for probing biological systems, requires increasingly sophisticated computational approaches. Putting all these data in simple-to-use databases is a first step, but realizing the full potential of the data requires algorithms that automatically extract regularities from the data, which can then lead to biological insight.

Many problems in computational biology take the form of prediction: predicting a gene's structure and, from there, its function, interactions, and role in disease. Support vector machines (SVMs) and related kernel methods are extremely good at solving such problems [1]–[3]. SVMs are widely used in computational biology due to their high accuracy, their ability to deal with high-dimensional and large datasets, and their flexibility in modeling diverse sources of data [2], [4]–[6].

The simplest form of a prediction problem is binary classification: trying to discriminate between objects that belong to one of two categories, positive (+1) or negative (−1). SVMs use two key concepts to solve this problem: large margin separation and kernel functions. The idea of large margin separation can be motivated by classification of points in two dimensions (see Figure 1). A simple way to classify the points is to draw a straight line and call points lying on one side positive and points on the other side negative. If the two sets are well separated, one would intuitively draw the separating line such that it is as far away as possible from the points in both sets (see Figures 2 and 3). This intuitive choice captures the idea of large margin separation, which is mathematically formulated in the section Classification with Large Margin.
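
The intuition of drawing the line as far as possible from both sets can be illustrated with a small sketch. The following is not taken from the article: the toy points, the two candidate lines, and the helper name `margin` are all assumptions made up for illustration. It does not train an SVM; it only measures, for each candidate line, the distance to the closest point and prefers the line where that distance is larger.

```python
import math

def margin(w, b, points, labels):
    """Smallest signed distance from any point to the line w.x + b = 0.

    The value is positive only if every point lies on the side of the
    line that matches its label (+1 or -1).
    """
    norm = math.hypot(w[0], w[1])
    return min(
        y * (w[0] * x[0] + w[1] * x[1] + b) / norm
        for x, y in zip(points, labels)
    )

# Toy data (hypothetical): two well-separated groups in two dimensions.
points = [(1.0, 1.0), (2.0, 1.0), (1.0, 2.0),   # negative class
          (4.0, 4.0), (5.0, 4.0), (4.0, 5.0)]   # positive class
labels = [-1, -1, -1, +1, +1, +1]

# Two hypothetical candidate lines, each given as (w, b) for w.x + b = 0.
# Both classify every point correctly, but they differ in margin.
candidates = {
    "x1 + x2 = 5.5": ((1.0, 1.0), -5.5),
    "x1 = 3": ((1.0, 0.0), -3.0),
}

margins = {name: margin(w, b, points, labels)
           for name, (w, b) in candidates.items()}

# Large margin separation prefers the line whose closest point is farthest away.
best = max(margins, key=margins.get)

for name, m in margins.items():
    print(f"{name}: margin = {m:.3f}")
print("preferred line:", best)
```

Both candidate lines separate the toy data without error, so classification accuracy alone cannot choose between them; the margin can. An SVM searches over all possible lines for the one with the largest margin rather than comparing a fixed list of candidates as this sketch does.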