Accelerating Biomolecular Modeling with AtomWorks and RF3

  • Nathaniel Corley
  • Simon V. Mathis
  • Rohith Krishna
  • Magnus Bauer
  • Tuscan R. Thompson
  • Woody Ahern
  • Maxwell W. Kazman
  • Rafael I Brent
  • Kieran Didi
  • Andrew Kubaney
  • Lilian McHugh
  • Arnav Nagle
  • Andrew Favor
  • Pascal Sturmfels
  • Yanjing Li
  • J. Butcher
  • Bo Qiang
  • Lars L. Schaaf
  • Raktim Mitra
  • Katelyn V. Campbell
  • Odin Zhang
  • Roni Weissman
  • Ian R. Humphreys
  • Qian Cong
  • Jonathan Funk
  • Shreyash Sonthalia
  • Pietro Lio
  • David Baker
  • F. DiMaio

Preprint

Deep learning methods trained on protein structure databases have revolutionized biomolecular structure prediction, but developing and training new models remains a considerable challenge. To facilitate the development of new models, we present AtomWorks: a broadly applicable data framework for developing state-of-the-art biomolecular foundation models spanning diverse tasks, including structure prediction, generative protein design, and fixed backbone sequence design. We use AtomWorks to train RosettaFold-3 (RF3), a structure prediction network capable of predicting arbitrary biomolecular complexes with an improved treatment of chirality that narrows the performance gap between closed-source AlphaFold3 (AF3) and existing open-source implementations. We expect that AtomWorks will accelerate the next generation of open-source biomolecular machine learning models and that RF3 will be broadly useful as a structure prediction tool. To this end, we release the AtomWorks framework (https://github.com/RosettaCommons/atomworks), together with curated training data, code and model weights for RF3 (https://github.com/RosettaCommons/modelforge) under a permissive BSD license.

Related tools

RosettaFold3

December 3, 2025

RosettaFold3 (RF3) is a unified biomolecular modeling system that predicts 3D structures of proteins, nucleic acids, and small molecules within a single framework. Combining multimodal transformers and generative diffusion models, RF3 enables precise modeling of complex molecular assemblies such as protein–ligand, protein–DNA, and protein–RNA interactions.