  ResMiCo: increasing the quality of metagenome-assembled genomes with deep learning

Mineeva, O., Danciu, D., Schölkopf, B., Ley, R., Rätsch, G., & Youngblut, N. (2023). ResMiCo: increasing the quality of metagenome-assembled genomes with deep learning. PLoS Computational Biology, 19(5): e1011001. doi:10.1371/journal.pcbi.1011001.

Creators

 Creators:
Mineeva, O, Author
Danciu, D, Author
Schölkopf, B, Author
Ley, RE (1), Author
Rätsch, G, Author
Youngblut, ND (1), Author
Affiliations:
(1) Department Microbiome Science, Max Planck Institute for Biology Tübingen, Max Planck Society, ou_3371684

Content

Free keywords: -
 Abstract: The number of published metagenome assemblies is rapidly growing due to advances in sequencing technologies. However, sequencing errors, variable coverage, repetitive genomic regions, and other factors can produce misassemblies, which are challenging to detect for taxonomically novel genomic data. Assembly errors can affect all downstream analyses of the assemblies. Accuracy for the state of the art in reference-free misassembly prediction does not exceed an AUPRC of 0.57, and it is not clear how well these models generalize to real-world data. Here, we present the Residual neural network for Misassembled Contig identification (ResMiCo), a deep learning approach for reference-free identification of misassembled contigs. To develop ResMiCo, we first generated a training dataset of unprecedented size and complexity that can be used for further benchmarking and developments in the field. Through rigorous validation, we show that ResMiCo is substantially more accurate than the state of the art, and the model is robust to novel taxonomic diversity and varying assembly methods. ResMiCo estimated 7% misassembled contigs per metagenome across multiple real-world datasets. We demonstrate how ResMiCo can be used to optimize metagenome assembly hyperparameters to improve accuracy, instead of optimizing solely for contiguity. The accuracy, robustness, and ease-of-use of ResMiCo make the tool suitable for general quality control of metagenome assemblies and assembly methodology optimization.

Details

Language(s):
 Dates: 2023-05
 Publication Status: Published online
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: DOI: 10.1371/journal.pcbi.1011001
PMID: 37126495
 Degree: -

Source 1

Title: PLoS Computational Biology
Source Genre: Journal
 Creator(s):
Affiliations:
Publ. Info: San Francisco, CA : Public Library of Science
Pages: 20
Volume / Issue: 19 (5)
Sequence Number: e1011001
Start / End Page: -
Identifier: ISSN: 1553-734X
CoNE: https://pure.mpg.de/cone/journals/resource/1000000000017180_1