Journal Article

Sensitive inference of alignment-safe intervals from biodiverse protein sequence clusters using EMERALD

MPS-Authors
Buchfink, B
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;
Computational Biology Group, Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Drost, H-G
Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;
Computational Biology Group, Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society;

Citation

Grigorjew, A., Gynter, A., Dias, F., Buchfink, B., Drost, H.-G., & Tomescu, A. (2023). Sensitive inference of alignment-safe intervals from biodiverse protein sequence clusters using EMERALD. Genome Biology: Biology for the Post-Genomic Era, 24(1): 168. doi:10.1186/s13059-023-03008-6.


Cite as: https://hdl.handle.net/21.11116/0000-000C-3EE7-B
Abstract
Sequence alignments are the foundations of life science research, but most innovation so far focuses on optimal alignments, while information derived from suboptimal solutions is ignored. We argue that one optimal alignment per pairwise sequence comparison is a reasonable approximation when dealing with very similar sequences but is insufficient when exploring the biodiversity of the protein universe at tree-of-life scale. To overcome this limitation, we introduce pairwise alignment-safety to uncover the amino acid positions robustly shared across all suboptimal solutions. We implement EMERALD, a software library for alignment-safety inference, and apply it to 400k sequences from the SwissProt database.
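
As a rough illustration of what "positions shared across all solutions" can mean, the sketch below marks a residue pair as safe when every co-optimal global alignment of two sequences aligns that pair. This is not EMERALD's algorithm or API: it looks only at optimal alignments, which is a simplification of the suboptimal alignment space the abstract refers to, and the function name `safe_aligned_pairs`, the linear gap model, and the scoring values are assumptions made for the example.

```python
def safe_aligned_pairs(a, b, match=1, mismatch=-1, gap=-1):
    """Return 0-based (i, j) pairs aligned in EVERY optimal global alignment.

    Hypothetical helper for illustration only; toy scoring with linear gaps.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        return match if a[i] == b[j] else mismatch

    # Forward pass: F[i][j] is the best score for a[:i] vs b[:j],
    # NF[i][j] is the number of distinct alignments reaching that score.
    F = [[0] * (m + 1) for _ in range(n + 1)]
    NF = [[0] * (m + 1) for _ in range(n + 1)]
    NF[0][0] = 1
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            cands = []
            if i and j:
                cands.append((F[i - 1][j - 1] + sub(i - 1, j - 1), NF[i - 1][j - 1]))
            if i:
                cands.append((F[i - 1][j] + gap, NF[i - 1][j]))
            if j:
                cands.append((F[i][j - 1] + gap, NF[i][j - 1]))
            F[i][j] = max(s for s, _ in cands)
            NF[i][j] = sum(c for s, c in cands if s == F[i][j])

    # Backward pass: B[i][j] is the best score for a[i:] vs b[j:],
    # NB[i][j] is the number of distinct alignments reaching that score.
    B = [[0] * (m + 1) for _ in range(n + 1)]
    NB = [[0] * (m + 1) for _ in range(n + 1)]
    NB[n][m] = 1
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            cands = []
            if i < n and j < m:
                cands.append((B[i + 1][j + 1] + sub(i, j), NB[i + 1][j + 1]))
            if i < n:
                cands.append((B[i + 1][j] + gap, NB[i + 1][j]))
            if j < m:
                cands.append((B[i][j + 1] + gap, NB[i][j + 1]))
            B[i][j] = max(s for s, _ in cands)
            NB[i][j] = sum(c for s, c in cands if s == B[i][j])

    opt, total = F[n][m], NF[n][m]
    safe = []
    for i in range(n):
        for j in range(m):
            # The pair (i, j) lies on some optimal alignment if the best prefix,
            # the pair itself, and the best suffix add up to the optimum.
            on_optimal = F[i][j] + sub(i, j) + B[i + 1][j + 1] == opt
            # It is safe if all optimal alignments pass through it.
            if on_optimal and NF[i][j] * NB[i + 1][j + 1] == total:
                safe.append((i, j))
    return safe


print(safe_aligned_pairs("AAT", "AT"))  # [(2, 1)]
```

For "AAT" against "AT" there are two optimal alignments, which differ in which "A" of the first sequence pairs with the "A" of the second, so only the final "T"/"T" pair is reported as safe. Counting alignments with forward and backward dynamic programming avoids enumerating them explicitly, which matters because the number of co-optimal alignments can grow exponentially with sequence length.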