Item

  De novo discovery of conserved gene clusters in microbial genomes with Spacedust

Zhang, R., Mirdita, M., & Söding, J. (2024). De novo discovery of conserved gene clusters in microbial genomes with Spacedust. bioRxiv. doi:10.1101/2024.10.02.616292.

Files

2024.10.02.616292v1.full.pdf (Preprint), 2MB
Name:
Preprint
Description:
-
OA-Status:
Green
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-

Creators

 Creators:
Zhang, Ruoshi [1], Author
Mirdita, Milot, Author
Söding, Johannes [1], Author
Affiliations:
[1] Research Group of Quantitative and Computational Biology, Max Planck Institute for Multidisciplinary Sciences, Max Planck Society, ou_3350226

Content

Free keywords: -
 Abstract: Metagenomics has revolutionized environmental and human-associated microbiome studies. However, the limited fraction of proteins with known biological process and molecular functions presents a major bottleneck. In prokaryotes and viruses, evolution favors keeping genes participating in the same biological processes co-localized as conserved gene clusters. Conversely, conservation of gene neighborhood indicates functional association. Spacedust is a tool for systematic, de novo discovery of conserved gene clusters. To find homologous protein matches it uses fast and sensitive structure comparison with Foldseek. Partially conserved clusters are detected using novel clustering and order conservation P-values. We demonstrate Spacedust’s sensitivity with an all-vs-all analysis of 1 308 bacterial genomes, identifying 72 843 conserved gene clusters containing 58% of the 4.2 million genes. It recovered 95% of antiviral defense system clusters annotated by a specialized tool. Spacedust’s high sensitivity and speed will facilitate the large-scale annotation of the huge numbers of sequenced bacterial, archaeal and viral genomes.

Details

Language(s): eng - English
 Dates: 2024-10-03
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: No review
 Identifiers: DOI: 10.1101/2024.10.02.616292
 Degree: -

Source 1

Title: bioRxiv
Source Genre: Web Page
 Creator(s):
Affiliations:
Publ. Info: -
Pages: -
Volume / Issue: -
Sequence Number: -
Start / End Page: -
Identifier: -