Journal Article

ProteinShake: Building datasets and benchmarks for deep learning on protein structures

MPS-Authors
Kucera,  Tim
Borgwardt, Karsten / Machine Learning and Systems Biology, Max Planck Institute of Biochemistry, Max Planck Society;

Oliver,  Carlos
Borgwardt, Karsten / Machine Learning and Systems Biology, Max Planck Institute of Biochemistry, Max Planck Society;

Chen,  Dexiong
Borgwardt, Karsten / Machine Learning and Systems Biology, Max Planck Institute of Biochemistry, Max Planck Society;

Borgwardt,  Karsten
Borgwardt, Karsten / Machine Learning and Systems Biology, Max Planck Institute of Biochemistry, Max Planck Society;

Citation

Kucera, T., Oliver, C., Chen, D., & Borgwardt, K. (2023). ProteinShake: Building datasets and benchmarks for deep learning on protein structures. Advances in Neural Information Processing Systems 36 (NeurIPS 2023).


Cite as: https://hdl.handle.net/21.11116/0000-000F-D62C-F
Abstract
We present ProteinShake, a Python software package that simplifies dataset creation and model evaluation for deep learning on protein structures. Users can create custom datasets or load an extensive set of pre-processed datasets from biological data repositories such as the Protein Data Bank (PDB) and AlphaFoldDB. Each dataset is associated with prediction tasks and evaluation functions covering a broad array of biological challenges. A benchmark on these tasks shows that pre-training almost always improves performance, the optimal data modality (graphs, voxel grids, or point clouds) is task-dependent, and models struggle to generalize to new structures. ProteinShake makes protein structure data easily accessible and comparison among models straightforward, providing challenging benchmark settings with real-world implications.
ProteinShake is available at https://proteinshake.ai.
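
The three data modalities named in the abstract (graphs, voxel grids, and point clouds) can be illustrated with a minimal, standard-library-only sketch. This is not the ProteinShake API: the function names `to_point_cloud`, `to_graph`, and `to_voxel_grid`, the toy coordinates, and the 5 Å cutoff and cell size are all assumptions chosen for illustration.

```python
import math

# Toy C-alpha coordinates (in angstroms) for a hypothetical 4-residue fragment.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]


def to_point_cloud(coords):
    """Point cloud: the residue coordinates themselves, one point per residue."""
    return [tuple(c) for c in coords]


def to_graph(coords, eps):
    """Graph: connect every pair of residues closer than eps angstroms."""
    edges = []
    for i in range(len(coords)):
        for j in range(i + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < eps:
                edges.append((i, j))
    return edges


def to_voxel_grid(coords, voxel_size):
    """Voxel grid: count residues falling into each cubic cell of side voxel_size."""
    grid = {}
    for x, y, z in coords:
        cell = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        grid[cell] = grid.get(cell, 0) + 1
    return grid


points = to_point_cloud(coords)
edges = to_graph(coords, eps=5.0)
voxels = to_voxel_grid(coords, voxel_size=5.0)
```

The same coordinates give three different inputs: an unordered set of points, a set of neighbor edges, and a sparse occupancy grid. A model architecture typically consumes only one of these, which is why the choice of modality can matter per task.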