English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT

Released

Journal Article

Optimized Cutting Plane Algorithm for Large-Scale Risk Minimization

MPS-Authors
/persons/resource/persons84960

Sonnenburg,  S
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society;

Fulltext (restricted access)
There are currently no full texts shared for your IP range.
Fulltext (public)
There are no public fulltexts stored in PuRe
Supplementary Material (public)
There is no public supplementary material available
Citation

Franc, V., & Sonnenburg, S. (2009). Optimized Cutting Plane Algorithm for Large-Scale Risk Minimization. The Journal of Machine Learning Research, 10, 2157-2192.


Cite as: https://hdl.handle.net/21.11116/0000-0010-4FD2-9
Abstract
We have developed an optimized cutting plane algorithm (OCA) for solving large-scale risk mini- mization problems. We prove that the number of iterations OCA requires to converge to a ε precise solution is approximately linear in the sample size. We also derive OCAS, an OCA-based linear bi- nary Support Vector Machine (SVM) solver, and OCAM, a linear multi-class SVM solver. In an ex- tensive empirical evaluation we show that OCAS outperforms current state-of-the-art SVM solvers like SVMlight , SVMperf and BMRM, achieving speedup factor more than 1,200 over SVMlight on some data sets and speedup factor of 29 over SVMperf , while obtaining the same precise sup- port vector solution. OCAS, even in the early optimization steps, often shows faster convergence than the currently prevailing approximative methods in this domain, SGD and Pegasos. In addi- tion, our proposed linear multi-class SVM solver, OCAM, achieves speedups of factor of up to 10 compared to SVMmulti−class . Finally, we use OCAS and OCAM in two real-world applications, the problem of human acceptor splice site detection and malware detection. Effectively paral- lelizing OCAS, we achieve state-of-the-art results on an acceptor splice site recognition problem only by being able to learn from all the available 50 million examples in a 12-million-dimensional feature space. Source code, data sets and scripts to reproduce the experiments are available at http://cmp.felk.cvut.cz/˜xfrancv/ocas/html/.