
Report

An Automated Combination of Sequence Motif Kernels for Predicting Protein Subcellular Localization

MPS-Authors

Zien, A
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;


Ong, CS
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society;
Max Planck Institute for Biological Cybernetics, Max Planck Society;
Friedrich Miescher Laboratory, Max Planck Society;

Fulltext (public)

MPIK-TR-146.pdf
(Publisher version), 212KB

Citation

Zien, A., & Ong, C. S. (2006). An Automated Combination of Sequence Motif Kernels for Predicting Protein Subcellular Localization (146). Tübingen, Germany: Max Planck Institute for Biological Cybernetics.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0013-D23D-1
Abstract
Protein subcellular localization is a crucial ingredient in many
important inferences about cellular processes, including prediction of
protein function and protein interactions. While many predictive
computational tools have been proposed, they tend to have complicated
architectures and require many design decisions from the developer.
We propose an elegant and fully automated approach to building a
prediction system for protein subcellular localization. We introduce a
new class of protein sequence kernels that considers all motifs,
including motifs with gaps. This class of kernels allows
pairwise amino acid distances to be included in the kernel
computation. We further propose a multiclass support vector machine method
that directly solves the protein subcellular localization problem without
resorting to the common approach of splitting it into several
binary classification problems. To automatically search over families
of possible amino acid motifs, we generalize our method to optimize over
multiple kernels at the same time. We compare our automated approach
to four other predictors on three different datasets.
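
To make the idea of combining gapped motif kernels concrete, the following is a minimal sketch and not the kernel class or optimization defined in the report. It assumes a motif is the tuple of residues found at a fixed set of offsets from a start position (skipped offsets act as gaps), and that a kernel value is the dot product of motif counts. The function names, the offset families, and the fixed weights are hypothetical; the report learns the combination automatically, whereas the weights here are simply supplied by hand.

```python
from collections import Counter


def motif_counts(seq, offsets):
    """Count motifs made of the residues at `offsets` from each start position.

    Offsets that are skipped (e.g. offset 1 in (0, 2)) act as gap positions.
    This representation is an illustrative assumption, not the report's definition.
    """
    span = max(offsets)
    counts = Counter()
    for start in range(len(seq) - span):
        counts[tuple(seq[start + o] for o in offsets)] += 1
    return counts


def motif_kernel(a, b, offsets):
    """Dot product of the motif count vectors of sequences a and b."""
    counts_a = motif_counts(a, offsets)
    counts_b = motif_counts(b, offsets)
    return sum(n * counts_b[motif] for motif, n in counts_a.items())


def combined_kernel(a, b, offset_families, weights):
    """Weighted sum of motif kernels, one per offset family.

    The weights are fixed inputs in this sketch; a multiple kernel learning
    method would optimize them together with the classifier.
    """
    return sum(w * motif_kernel(a, b, offs)
               for w, offs in zip(weights, offset_families))


# Hypothetical toy sequences and motif families.
families = [(0, 1), (0, 2)]   # adjacent pairs, and pairs with one gap
weights = [0.5, 0.5]
value = combined_kernel("MKVLA", "AKGLA", families, weights)
```

A Gram matrix built from such a kernel could be passed to any kernel classifier that accepts precomputed kernels; a nonnegative weighted sum of valid kernels is itself a valid kernel, which is what makes the combination well defined.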