Learning interpretable SVMs for biological sequence classification

Rätsch, G; Sonnenburg, S; Schäfer, C

doi:10.1186/1471-2105-7-S1-S9

Rätsch, G., Sonnenburg, S., & Schäfer, C. (2006). Learning interpretable SVMs for biological sequence classification. BMC Bioinformatics, 7(Supplement 1): S9.

Item is Released

Basic

Item Permalink: https://hdl.handle.net/21.11116/0000-000A-B044-2

Version Permalink: https://hdl.handle.net/21.11116/0000-000A-B045-1

Genre: Conference Paper

Creators

Creators:
Rätsch, G¹, Author
Sonnenburg, S, Author
Schäfer, C, Author

Affiliations:
¹ Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society, ou_3378052

Content

Free keywords: -

Abstract: Background: Support Vector Machines (SVMs), using a variety of string kernels, have been successfully applied to biological sequence classification problems. While SVMs achieve high classification accuracy, they lack interpretability. In many applications, it does not suffice that an algorithm just detects a biological signal in the sequence; it should also provide means to interpret its solution in order to gain biological insight.

Results: We propose novel and efficient algorithms for solving the so-called Support Vector Multiple Kernel Learning problem. The developed techniques can be used to understand the obtained support vector decision function in order to extract biologically relevant knowledge about the sequence analysis problem at hand. We apply the proposed methods to the task of acceptor splice site prediction and to the problem of recognizing alternatively spliced exons. Our algorithms compute sparse weightings of substring locations, highlighting which parts of the sequence are important for discrimination.

Conclusion: The proposed method is able to deal with thousands of examples while combining hundreds of kernels within reasonable time, and reliably identifies a few statistically significant

Details

Language(s):

Dates: Date issued: 2006-03

Publication Status: Issued

Pages: -

Publishing info: -

Table of Contents: -

Rev. Type: -

Identifiers: DOI: 10.1186/1471-2105-7-S1-S9
PMID: 16723012

Degree: -

Event

Title: NIPS Workshop on New Problems and Methods in Computational Biology

Place of Event: Whistler, Canada

Start-/End Date: 2004-12-08

Source 1

Title: BMC Bioinformatics

Source Genre: Journal

Creator(s):

Affiliations:

Publ. Info: BioMed Central

Pages: 14

Volume / Issue: 7 (Supplement 1)

Sequence Number: S9

Start / End Page: -

Identifier: ISSN: 1471-2105
CoNE: https://pure.mpg.de/cone/journals/resource/111000136905000