  Sensitive clustering of 20 billion protein sequences at tree-of-life scale using DIAMOND2 DeepClust

Drost, H.-G. (2023). Sensitive clustering of 20 billion protein sequences at tree-of-life scale using DIAMOND2 DeepClust. Talk presented at Max-Planck-Campus Tübingen: Distinguished Speaker Seminar Series (DSSS). Tübingen, Germany. 2023-06-02.

Creators

Creators:
Drost, H.-G. [1, 2], Author
Affiliations:
[1] Computational Biology Group, Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society, ou_3496867
[2] Department Molecular Biology, Max Planck Institute for Biology Tübingen, Max Planck Society, Max-Planck-Ring 5, 72076 Tübingen, DE, ou_3371687

Content

Free keywords: -
 Abstract: Our understanding of the origin and natural variation of the global biosphere is largely derived from morphological insights, with data collections reaching back to the time of Aristotle. Sequencing the genomes and annotating the protein sequences across the tree of life will transform our access to evolutionary information and may provide a roadmap to characterizing the molecular principles underlying biodiversification. The key to accessing this reservoir of genomic information for molecular exploration and functional annotation is the comparative method, usually enabled by sequence similarity assessments. We introduce DIAMOND2 DeepClust, an ultra-fast and sensitive sequence clustering method optimized to perform protein sequence similarity clustering at low identity levels (e.g. down to 20% identity). Using DIAMOND2 DeepClust, we present an experimental study based on clustering the protein universe, which currently comprises ~20 billion protein sequences, and show how to overcome computational bottlenecks in the biosphere genomics era.

Details

Language(s):
 Dates: 2023-06
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: -
 Degree: -

Event

Title: Max-Planck-Campus Tübingen: Distinguished Speaker Seminar Series (DSSS)
Place of Event: Tübingen, Germany
Start-/End Date: 2023-06-02
Invited: Yes
