Keywords:
-
Abstract:
Introduction: Cross-linking mass spectrometry is a high-throughput technique for characterizing protein structures and interactions. Cross-linking data usually consist of pairs of cross-linked residues, within a protein or between proximate protein subunits, that reflect known and novel information about protein structure and interactions. Such cross-links often hold untapped information about multimeric protein complexes. Cross-links in homo-oligomers, for example, pose a challenge: the linked proteins have identical sequences, which makes it difficult to distinguish intra-protein links from links between multiple identical subunits. Here we introduce CLAUDIO, an open-source pipeline for the structural analysis and validation of protein cross-linking data and for the detection of homo-oligomerization signals.
Methods: CLAUDIO follows two main approaches. The first is a structure analysis, in which cross-links are validated by mapping them onto their corresponding structures, where these exist, and comparing their topological distances to the linker range. The second is a peptide sequence overlap analysis of intra-protein links, which points to homo-oligomers.
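The overlap analysis can be illustrated with a minimal Python sketch. It assumes that an intra-protein link whose two peptides cover overlapping residue ranges cannot come from a single protein copy, and therefore hints at a homo-oligomer. The `Peptide` class and the `peptides_overlap` function are hypothetical names chosen for illustration, not CLAUDIO's actual interface.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peptide:
    """A cross-linked peptide placed on its parent protein (hypothetical type)."""
    sequence: str  # peptide sequence as reported in the cross-link data
    start: int     # 1-based position of the peptide's first residue in the protein

    @property
    def end(self) -> int:
        # 1-based position of the peptide's last residue in the protein
        return self.start + len(self.sequence) - 1


def peptides_overlap(a: Peptide, b: Peptide) -> bool:
    """Return True if the residue ranges of the two peptides intersect.

    Assumption: an overlap in an intra-protein link is treated as a
    potential homo-oligomer signal, because one protein copy cannot
    supply the same residues to both peptides.
    """
    return a.start <= b.end and b.start <= a.end


# Two peptides covering residues 10-14 and 12-16 share residues 12-14.
first = Peptide("AKLVE", start=10)
second = Peptide("LVEKR", start=12)
distant = Peptide("GHKTT", start=40)

overlap_signal = peptides_overlap(first, second)   # ranges intersect
no_signal = peptides_overlap(first, distant)       # ranges are disjoint
```

A link flagged this way is only a candidate; in the workflow described here such candidates are subsequently checked by homology against known homo-oligomers.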
Preliminary data: CLAUDIO’s input includes the UniProt IDs of the proteins, the sequences of the cross-linked peptides, and the positions of the cross-linked residues. CLAUDIO implements a workflow that takes these data and evaluates them against structures from the Protein Data Bank and the AlphaFold Protein Structure Database using TopoLink. For intra-protein links, CLAUDIO can extract signals of overlapping peptide sequences and map them by homology to homo-oligomers from SWISS-MODEL. CLAUDIO’s output consists of the initial data extended by the results of the distance evaluation and the cross-link type re-assignments. We applied CLAUDIO to a cross-linking dataset of ~4186 cross-links from murine mitochondria, derived from two studies. Within ~100 minutes, it structurally validated 82.7% of the intra-links and 71.7% of the inter-links for which structures could be found. It reassigned about 37.3% of the intra-protein links as potential signals of homo-oligomers, of which it validated 65.9% by homology. The remaining 34.1% may be interesting leads for homo-oligomers currently unidentified by SWISS-MODEL.
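The distance evaluation can be sketched in Python as a comparison between a residue-pair distance and the range a linker can span. This is a simplified illustration under stated assumptions: it uses the straight-line Euclidean distance between two coordinates, whereas TopoLink computes topological distances, and the `evaluate_distance` function and its length parameters are hypothetical placeholders rather than CLAUDIO's or TopoLink's settings.

```python
import math
from typing import Tuple

Coordinate = Tuple[float, float, float]


def evaluate_distance(
    pos_a: Coordinate,
    pos_b: Coordinate,
    max_length: float,
    min_length: float = 0.0,
) -> Tuple[float, bool]:
    """Return the distance between two residues and whether it fits the linker range.

    Assumption: coordinates are in angstroms and the distance is Euclidean,
    a stand-in for the topological distance used in the actual workflow.
    """
    distance = math.dist(pos_a, pos_b)
    return distance, min_length <= distance <= max_length


# Illustrative coordinates and linker lengths (placeholder values).
distance, is_valid = evaluate_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), max_length=10.0)
_, too_long = evaluate_distance((0.0, 0.0, 0.0), (3.0, 4.0, 0.0), max_length=4.0)
```

An output record would then carry the original cross-link fields together with the computed distance and the validity flag.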
Novel aspect: CLAUDIO can automatically validate protein cross-links from large datasets and find new signals of homo-oligomers.