  Bayesian Markov models improve the prediction of binding motifs beyond first order

Ge, W., Meier, M., Roth, C., & Söding, J. (2021). Bayesian Markov models improve the prediction of binding motifs beyond first order. NAR: Genomics and Bioinformatics, 3(2): lqab026. doi:10.1093/nargab/lqab026.


Basic

Genre: Journal Article

Files

3346732.pdf (Publisher version), 3MB
Name: 3346732.pdf
Description: -
Visibility: Public
MIME-Type / Checksum: application/pdf / [MD5]
Technical Metadata:
Copyright Date: -
Copyright Info: -

Creators

Creators:
Ge, W. (1), Author
Meier, M. (1), Author
Roth, C. (1), Author
Söding, J. (1), Author
Affiliations:
(1) Research Group of Computational Biology, MPI for Biophysical Chemistry, Max Planck Society, ou_1933286

Content

Free keywords: -
Abstract: Transcription factors (TFs) regulate gene expression by binding to specific DNA motifs. Accurate models for predicting binding affinities are crucial for a quantitative understanding of transcriptional regulation. Motifs are commonly described by position weight matrices, which assume that each position contributes independently to the binding energy. Models that can learn dependencies between positions, for instance those induced by DNA structure preferences, have yielded markedly improved predictions for most TFs on in vivo data. However, they are more prone to overfit the data and to learn patterns that are merely correlated with TF binding rather than directly involved in it. We present an improved, faster version of our Bayesian Markov model software, BaMMmotif2. We benchmarked it against state-of-the-art motif discovery tools on a large collection of ChIP-seq and HT-SELEX datasets. Fifth-order BaMMmotif2 models achieved a median false-discovery-rate-averaged recall 13.6% and 12.2% higher than the next-best tool on 427 ChIP-seq datasets and 164 HT-SELEX datasets, respectively, while being 8 to 1000 times faster. BaMMmotif2 models showed no signs of overtraining in cross-cell-line and cross-platform tests, with similar improvements over the next-best tool. These results demonstrate that dependencies beyond first order clearly improve binding models for most TFs.
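
The independence assumption mentioned in the abstract can be illustrated with a minimal sketch. This is not BaMMmotif2's implementation or API: the function names and all probabilities below are hypothetical toy values chosen only to contrast a position weight matrix, where each position is scored on its own, with a first-order Markov model, where each position is scored given the preceding base.

```python
import math

# Toy values (assumptions for illustration only, not taken from the paper).
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}

# PWM for a 2-bp motif: one independent base distribution per position.
PWM = [
    {"A": 0.4, "C": 0.4, "G": 0.1, "T": 0.1},
    {"A": 0.4, "C": 0.4, "G": 0.1, "T": 0.1},
]

# First-order model: distribution of position 2 conditioned on position 1.
# Here "A" is usually followed by "C", and "C" by "A".
FIRST = PWM[0]
COND = [{
    "A": {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    "C": {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    "G": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "T": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
}]


def pwm_log_odds(seq, pwm, background):
    """Sum of per-position log-odds; positions contribute independently."""
    return sum(math.log2(pwm[i][b] / background[b]) for i, b in enumerate(seq))


def first_order_log_odds(seq, first, cond, background):
    """Log-odds where each position after the first depends on the previous base."""
    score = math.log2(first[seq[0]] / background[seq[0]])
    for i in range(1, len(seq)):
        prev, cur = seq[i - 1], seq[i]
        score += math.log2(cond[i - 1][prev][cur] / background[cur])
    return score


if __name__ == "__main__":
    for site in ("AA", "AC"):
        print(site,
              round(pwm_log_odds(site, PWM, BACKGROUND), 3),
              round(first_order_log_odds(site, FIRST, COND, BACKGROUND), 3))
```

With these toy numbers the PWM cannot distinguish "AA" from "AC", while the first-order model scores "AC" higher because it captures the dependency between the two positions. Higher-order models extend the same idea by conditioning on more preceding bases.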

Details

Language(s): eng - English
 Dates: 2021-04-20
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: Peer
 Identifiers: DOI: 10.1093/nargab/lqab026
 Degree: -

Source 1

Title: NAR: Genomics and Bioinformatics
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: 3 (2)
Sequence Number: lqab026
Start / End Page: -
Identifier: -