Released

Journal Article

Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning

MPS-Authors
Mann, Matthias / Proteomics and Signal Transduction, Max Planck Institute of Biochemistry, Max Planck Society;

Citation

Webel, H., Niu, L., Nielsen, A. B., Locard-Paulet, M., Mann, M., Jensen, L. J., et al. (2024). Imputation of label-free quantitative mass spectrometry-based proteomics data using self-supervised deep learning. Nature Communications, 15(1): 5405. doi:10.1038/s41467-024-48711-5.


Cite as: https://hdl.handle.net/21.11116/0000-000F-8D00-2
Abstract
Imputation techniques provide a means to replace missing measurements with a value and are used in almost all downstream analyses of mass spectrometry (MS) based proteomics data using label-free quantification (LFQ). Here we demonstrate how collaborative filtering, denoising autoencoders, and variational autoencoders can impute missing values in the context of LFQ at different levels. We applied our method, proteomics imputation modeling mass spectrometry (PIMMS), to an alcohol-related liver disease (ALD) cohort with blood plasma proteomics data available for 358 individuals. After removing 20 percent of the intensities, we were able to recover 15 out of 17 significantly abundant protein groups using PIMMS-VAE imputations. When analyzing the full dataset, we identified 30 additional proteins (+13.2%) that were significantly differentially abundant across disease stages compared to no imputation, and found that some of these were predictive of ALD progression in machine learning models. We therefore suggest the use of deep learning approaches for imputing missing values in MS-based proteomics on larger datasets and provide workflows for these.
Imputation is a recurrent and important step for downstream analysis in mass spectrometry-based proteomics. Here, the authors offer an extensive workflow that compares 27 established methods with three new scalable, fast and performant deep learning methods for large and high-dimensional data.
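
The abstract describes an evaluation in which 20 percent of the measured intensities are removed and then recovered by imputation. The sketch below illustrates that general idea only, under loud assumptions: the data are synthetic, a per-feature median stands in for the collaborative filtering and autoencoder models, and the function names (`mask_observed`, `impute_feature_median`, `heldout_mae`) are hypothetical and not part of the PIMMS workflows.

```python
import numpy as np


def mask_observed(X, frac=0.2, seed=0):
    """Hide a random fraction of the observed (non-NaN) entries of X."""
    rng = np.random.default_rng(seed)
    observed = np.argwhere(~np.isnan(X))          # (row, col) of measured values
    n_hide = int(round(frac * len(observed)))
    picked = observed[rng.choice(len(observed), size=n_hide, replace=False)]
    rows, cols = picked[:, 0], picked[:, 1]
    X_masked = X.copy()
    X_masked[rows, cols] = np.nan                 # simulate additional missing values
    return X_masked, rows, cols


def impute_feature_median(X):
    """Baseline imputation (assumption): fill NaNs with each column's median."""
    medians = np.nanmedian(X, axis=0)
    X_imp = X.copy()
    nan_rows, nan_cols = np.where(np.isnan(X_imp))
    X_imp[nan_rows, nan_cols] = medians[nan_cols]
    return X_imp


def heldout_mae(X_true, X_imp, rows, cols):
    """Mean absolute error on the entries that were hidden on purpose."""
    return float(np.mean(np.abs(X_true[rows, cols] - X_imp[rows, cols])))


# Synthetic log-intensity matrix: samples in rows, protein groups in columns.
rng = np.random.default_rng(42)
X = rng.normal(loc=25.0, scale=2.0, size=(50, 20))
X[rng.random(X.shape) < 0.1] = np.nan             # pre-existing missing values

X_masked, rows, cols = mask_observed(X, frac=0.2, seed=1)
X_imp = impute_feature_median(X_masked)
mae = heldout_mae(X, X_imp, rows, cols)
print(f"held-out MAE: {mae:.3f}")
```

Because the hidden values are known, any imputation method can be dropped in place of `impute_feature_median` and compared on the same held-out entries.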