Learning interpretable SVMs for biological sequence classification

Sonnenburg, S; Rätsch, G; Schäfer, C

doi:10.1007/11415770_30

Sonnenburg, S., Rätsch, G., & Schäfer, C. (2005). Learning interpretable SVMs for biological sequence classification. In S. Miyano, J. Mesirov, S. Kasif, S. Istrail, P. Pevzner, & M. Waterman (Eds.), Research in Computational Molecular Biology: 9th Annual International Conference, RECOMB 2005, Cambridge, MA, USA, May 14-18, 2005 (pp. 389-407). Berlin, Germany: Springer.

Item is Released

Basic information

Item permalink: https://hdl.handle.net/21.11116/0000-000B-2354-F
Version permalink: https://hdl.handle.net/21.11116/0000-000B-2356-D

Genre: Conference Paper

Files

Creators

Creators:
Sonnenburg, S, Author
Rätsch, G¹, Author
Schäfer, C, Author

Affiliations:
¹ Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society, ou_3378052

Content

Keywords: -

Abstract: We propose novel algorithms for solving the so-called Support Vector Multiple Kernel Learning problem and show how they can be used to understand the resulting support vector decision function. While classical kernel-based algorithms (such as SVMs) are based on a single kernel, in Multiple Kernel Learning a quadratically constrained quadratic program is solved in order to find a sparse convex combination of a set of support vector kernels. We show how this problem can be cast into a semi-infinite linear optimization problem which can in turn be solved efficiently using a boosting-like iterative method in combination with standard SVM optimization algorithms. The proposed method is able to deal with thousands of examples while combining hundreds of kernels within reasonable time.

In the second part we show how this technique can be used to understand the obtained decision function in order to extract biologically relevant knowledge about the sequence analysis problem at hand. We consider the problem of splice site identification and combine string kernels at different sequence positions and with various substring (oligomer) lengths. The proposed algorithm computes a sparse weighting over the length and the substring, highlighting which substrings are important for discrimination. Finally, we propose a bootstrap scheme in order to reliably identify a few statistically significant positions, which can then be used for further analysis such as consensus finding.
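As a rough illustration of what "a convex combination of string kernels with various substring lengths" means, the sketch below combines simple position-wise k-mer match kernels using hand-picked weights. It is not the algorithm from the paper: the paper learns the weights by optimization, whereas here they are fixed by assumption, and the function names `kmer_match_kernel` and `combined_kernel` are hypothetical.

```python
from typing import Sequence


def kmer_match_kernel(x: str, y: str, k: int) -> int:
    """Count positions at which x and y contain the same length-k substring."""
    n_positions = min(len(x), len(y)) - k + 1
    return sum(1 for i in range(max(n_positions, 0)) if x[i:i + k] == y[i:i + k])


def combined_kernel(x: str, y: str, ks: Sequence[int], betas: Sequence[float]) -> float:
    """Convex combination sum_j betas[j] * K_{ks[j]}(x, y) of per-length kernels.

    The weights are assumed given; they must be non-negative and sum to 1.
    """
    if len(ks) != len(betas):
        raise ValueError("ks and betas must have the same length")
    if any(b < 0 for b in betas) or abs(sum(betas) - 1.0) > 1e-9:
        raise ValueError("betas must be non-negative and sum to 1")
    return sum(b * kmer_match_kernel(x, y, k) for k, b in zip(ks, betas))


if __name__ == "__main__":
    # A zero weight switches a substring length off entirely, which is the
    # sense in which a sparse weighting shows which lengths matter.
    print(combined_kernel("ACGT", "ACGA", ks=[1, 2, 3], betas=[0.5, 0.5, 0.0]))
```

In this toy setting, a weight of zero removes the corresponding substring length from the decision function, so inspecting the non-zero weights indicates which lengths contribute to discrimination.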

Details

Language:

Date: Published: 2005

Publication status: Published

Pages: -

Publishing info: -

Table of contents: -

Peer review: -

Identifiers (DOI, ISBN, etc.): DOI: 10.1007/11415770_30

Degree: -

Legal case

Project information

Source 1

Source title: Research in Computational Molecular Biology: 9th Annual International Conference, RECOMB 2005, Cambridge, MA, USA, May 14-18, 2005

Genre: Proceedings

Creators/editors:
Miyano, S, Editor
Mesirov, J, Editor
Kasif, S, Editor
Istrail, S, Editor
Pevzner, PA, Editor
Waterman, M, Editor

Affiliations:
-

Publisher, place: Berlin, Germany: Springer

Pages: 632
Volume/issue: -
Sequence number: -
Start/end page: 389 - 407
Identifiers (ISBN, ISSN, DOI, etc.): ISBN: 978-3-540-25866-7
DOI: 10.1007/b135594

Source 2

Source title: Lecture Notes in Computer Science

Genre: Series

Creators/editors:

Affiliations:

Publisher, place: -

Pages: -
Volume/issue: 3500
Sequence number: -
Start/end page: -
Identifiers (ISBN, ISSN, DOI, etc.): -
