  Unifying the known and unknown microbial coding sequence space

Vanni, C., Schechter, M. S., Acinas, S. G., Barberan, A., Buttigieg, P. L., Casamayor, E. O., et al. (2022). Unifying the known and unknown microbial coding sequence space. ELIFE, 11: e67667. doi:10.7554/eLife.67667.

Files

elife-67667-v2.pdf (Publisher version), 8MB
Name:
elife-67667-v2.pdf
Description:
-
OA-Status:
Visibility:
Public
MIME-Type / Checksum:
application/pdf / [MD5]
Technical Metadata:
Copyright Date:
-
Copyright Info:
-
License:
-

Creators

 Creators:
Vanni, Chiara (1), Author
Schechter, Matthew S. (1), Author
Acinas, Silvia G. (2), Author
Barberan, Albert (2), Author
Buttigieg, Pier Luigi (2), Author
Casamayor, Emilio O. (2), Author
Delmont, Tom O. (2), Author
Duarte, Carlos M. (2), Author
Eren, A. Murat (2), Author
Finn, Robert D. (2), Author
Kottmann, Renzo (1), Author
Mitchell, Alex (2), Author
Sanchez, Pablo (2), Author
Siren, Kimmo (2), Author
Steinegger, Martin (2), Author
Gloeckner, Frank Oliver (2), Author
Fernandez-Guerra, Antonio (1), Author
Affiliations:
(1) Microbial Genomics Group, Department of Molecular Ecology, Max Planck Institute for Marine Microbiology, Max Planck Society, ou_2481697
(2) external, ou_persistent22

Content

Free keywords: PHYLOGENETIC ANALYSIS; DARK-MATTER; PROTEIN; METAGENOMICS; PHOTOTROPHY; COMMUNITY; FAMILIES; CLUSTERS; ARCHAEA; COMPLEX; Life Sciences & Biomedicine - Other Topics; microbial genomics; bioinformatics; gene clusters; functional metagenomics; phylogenomics; unknown function; Other
 Abstract: Genes of unknown function are among the biggest challenges in molecular biology, especially in microbial systems, where 40-60% of the predicted genes are unknown. Despite previous attempts, systematic approaches to include the unknown fraction into analytical workflows are still lacking. Here, we present a conceptual framework, its translation into the computational workflow AGNOSTOS and a demonstration on how we can bridge the known-unknown gap in genomes and metagenomes. By analyzing 415,971,742 genes predicted from 1749 metagenomes and 28,941 bacterial and archaeal genomes, we quantify the extent of the unknown fraction, its diversity, and its relevance across multiple organisms and environments. The unknown sequence space is exceptionally diverse, phylogenetically more conserved than the known fraction and predominantly taxonomically restricted at the species level. From the 71 M genes identified to be of unknown function, we compiled a collection of 283,874 lineage-specific genes of unknown function for Cand. Patescibacteria (also known as Candidate Phyla Radiation, CPR), which provides a significant resource to expand our understanding of their unusual biology. Finally, by identifying a target gene of unknown function for antibiotic resistance, we demonstrate how we can enable the generation of hypotheses that can be used to augment experimental data.

Details

Language(s): eng - English
 Dates: 2022-03-31
 Publication Status: Published online
 Pages: 60
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: ISI: 000804464600001
DOI: 10.7554/eLife.67667
 Degree: -

Source 1

Title: ELIFE
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: SHERATON HOUSE, CASTLE PARK, CAMBRIDGE, CB3 0AX, ENGLAND : eLIFE SCIENCES PUBL LTD
Pages: -
Volume / Issue: 11
Sequence Number: e67667
Start / End Page: -
Identifier: ISSN: 2050-084X