Meeting Abstract

KIRMES: kernel-based identification of regulatory modules in euchromatic sequences

MPS-Authors

Schultheiss, SJ
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society

Rätsch, G
Rätsch Group, Friedrich Miescher Laboratory, Max Planck Society

Citation

Schultheiss, S., Busch, W., Lohmann, J., Kohlbacher, O., & Rätsch, G. (2009). KIRMES: kernel-based identification of regulatory modules in euchromatic sequences. BMC Bioinformatics, 10(Supplement 13): O1.


Cite as: https://hdl.handle.net/21.11116/0000-000B-9B46-8
Abstract
Background: We predict transcription factor (TF) target genes based on their regulatory sequence. A TF binding site is a short segment (~10 bp) near a gene's regulatory region that is recognized by the corresponding TF. Overrepresented motifs can be identified in the regulatory sequences of a gene set that is enriched for targets of a specific TF. Gibbs-sampling methods, which try to identify position weight matrices that characterize binding sites, have been successful for small genomes but are problematic in higher eukaryotes, where motifs are degenerate and form cis-regulatory modules [1].
Methods: Our method classifies genes as TF targets. We use de novo motif finding and subsequently apply a Support Vector Machine with a kernel that captures information about the motifs, their relative location, and sequence conservation (see Figure 1). The weighted degree kernel with shifts (WDS) computes the similarity of fixed-length sequences. We extend this kernel with conservation information and motif co-occurrence information to obtain the Regulatory Modules kernel [2]. KIRMES is available on our Galaxy server at http://galaxy.tuebingen.mpg.de. Using positional oligomer importance matrices [3], we make the output of the kernel interpretable by displaying a sequence logo of the oligomers that contributed most to the correct classification.
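To illustrate how a weighted degree kernel with shifts compares two fixed-length sequences, the following is a minimal Python sketch. It is not the KIRMES implementation. It assumes the commonly used weights beta_d = 2(K - d + 1) / (K(K + 1)) for oligomer length d up to K and delta_s = 1 / (2(s + 1)) for shift s. The function name wds_kernel and the optional position_weights argument are hypothetical; the latter only hints at how per-position conservation scores might enter as a weighting, and the abstract does not specify how this is actually done.

```python
def wds_kernel(x, y, max_degree=3, max_shift=2, position_weights=None):
    """Sketch of a weighted degree kernel with shifts (WDS).

    x, y: sequences of equal length (e.g. DNA strings).
    max_degree: largest oligomer length K that is compared.
    max_shift: largest positional shift S that is tolerated.
    position_weights: hypothetical per-position weights (for instance
        conservation scores); this is an assumption for illustration only.
    """
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    n = len(x)
    if position_weights is None:
        position_weights = [1.0] * n

    total = 0.0
    for d in range(1, max_degree + 1):
        # Assumed degree weighting: shorter oligomers get larger weights.
        beta = 2.0 * (max_degree - d + 1) / (max_degree * (max_degree + 1))
        for i in range(n - d + 1):
            for s in range(max_shift + 1):
                if i + s + d > n:
                    break  # the shifted oligomer would run past the end
                # Assumed shift weighting: larger shifts contribute less.
                delta = 1.0 / (2.0 * (s + 1))
                # Count matches with the shift applied to either sequence,
                # which keeps the kernel symmetric in x and y.
                matches = int(x[i + s:i + s + d] == y[i:i + d])
                matches += int(x[i:i + d] == y[i + s:i + s + d])
                total += beta * delta * position_weights[i] * matches
    return total
```

With max_degree=1 and max_shift=0 the function simply counts positions at which the two sequences agree. Increasing max_shift lets a motif that is displaced by a few bases still contribute, at a reduced weight. A kernel of this kind could be evaluated on all sequence pairs and passed to an SVM as a precomputed kernel matrix.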
Results: We compared our method to a state-of-the-art Gibbs sampler, PRIORITY [4], on its own dataset with the published settings with respect to successful classification. We achieve correct predictions on 74% of their sets vs. 63% for PRIORITY. We let KIRMES classify gene sets obtained from microarrays of Arabidopsis thaliana. Using conservation as weighting for the WDS kernel improves performance. These results illustrate the power of our approach in exploiting the relationship between motifs as well as conservation to improve the recognition of TF targets. Interpretable results and an easy-to-use web service make this a valuable tool for any researcher interested in gene regulation.