Mission
Our mission is to enable predictive modeling of laboratory experiments by achieving chemically accurate electronic structure predictions with deep-learning-powered DFT, targeting errors below 1 kcal/mol while retaining the computational efficiency of scalable semi-local DFT.
Skala functional
At the heart of our efforts is Skala, a deep learning-based exchange-correlation (XC) functional that breaks the traditional trade-off between accuracy and efficiency. Unlike traditional XC functionals, Skala bypasses the expensive hand-designed input features that are commonly used and instead learns complex non-local representations, which it uses to make energy predictions in a data-driven manner. This is enabled by training the model on an unprecedented amount of high-accuracy data, which we generate in-house and in collaboration with world-leading experts in highly accurate but more expensive electronic structure methods.
Key Features:
- Learned non-local representations: Skala leverages a modern neural network to learn the non-local representations required to reach chemical accuracy. The model is trained on an unprecedented volume of high-accuracy reference data generated with wavefunction-based methods.
- Chemical Accuracy: Skala achieves chemical accuracy for atomization energies of small molecules.
- Scalable generalization: With just a modest amount of additional training data, Skala achieves accuracy competitive with top-performing hybrid functionals across general main-group chemistry, all at the cost of semi-local DFT.
- Systematic improvement with data: Skala systematically improves with more training data, expanding its predictive power across diverse chemical domains.
- Naturally supports GPU acceleration: Skala's architecture is designed to take maximum advantage of GPU acceleration. The computational cost of Skala is the same as that of semi-local functionals.
- Emerging physical constraints: While we impose only a minimal set of exact constraints through Skala’s model design, we find that adherence to additional exact physical constraints emerges as more data is added to the training set.
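To make the "chemical accuracy" target concrete, the following minimal Python sketch shows how an atomization energy and its error against a reference could be checked against the 1 kcal/mol threshold named in the mission statement. The function names, the use of mean absolute error as the metric, and the choice of Hartree as the input unit are assumptions of this sketch and not a description of how Skala is evaluated.

```python
# Hartree to kcal/mol conversion factor (rounded).
HARTREE_TO_KCAL_PER_MOL = 627.5095

# Threshold commonly referred to as "chemical accuracy".
CHEMICAL_ACCURACY_KCAL_PER_MOL = 1.0


def atomization_energy(molecule_energy, atom_energies):
    """Total atomization energy in Hartree.

    Defined as the sum of the free-atom energies minus the energy
    of the molecule (positive for a bound molecule).
    """
    return sum(atom_energies) - molecule_energy


def mean_absolute_error_kcal(predicted, reference):
    """Mean absolute error in kcal/mol for energies given in Hartree."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [abs(p - r) for p, r in zip(predicted, reference)]
    return HARTREE_TO_KCAL_PER_MOL * sum(errors) / len(errors)


def is_chemically_accurate(predicted, reference):
    """True if the mean absolute error is below 1 kcal/mol."""
    return mean_absolute_error_kcal(predicted, reference) < CHEMICAL_ACCURACY_KCAL_PER_MOL
```

For scale, 1 kcal/mol corresponds to roughly 0.0016 Hartree, so a functional has to reproduce reference atomization energies to about the millihartree level to meet this target.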
Accurate Chemistry Collection
Accurate electronic structure data with sub-chemical accuracy are essential for advancing computational chemistry methods with deep learning. However, existing datasets that reach this level of accuracy remain limited in size or scope. The Microsoft Research Accurate Chemistry Collection (MSR-ACC) aims to overcome this limitation. Its first release, MSR-ACC/TAE25, comprises 76,879 total atomization energies at the CCSD(T)/CBS level obtained with the W1-F12 thermochemical protocol. The dataset is constructed to exhaustively cover the chemical space of closed-shell, charge-neutral, covalently bound equilibrium molecular structures containing up to 5 non-hydrogen atoms drawn from elements up to argon and lacking significant multireference character. The dataset and its canonical train and validation splits are openly available on Zenodo in the QCSchema format under the CDLA Permissive 2.0 license. This first release of MSR-ACC enables data-driven approaches for developing predictive computational chemistry methods with unprecedented accuracy and scope.
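The scope described above can be partly expressed as a simple filter. The sketch below is hypothetical: it assumes each record exposes a QCSchema-style molecule dictionary with `symbols`, `molecular_charge`, and `molecular_multiplicity` fields, treats "closed-shell" as singlet multiplicity, and falls back to a neutral singlet when the optional fields are absent. It does not check covalent bonding, equilibrium geometry, or multireference character, and it is not the procedure used to build MSR-ACC/TAE25.

```python
# Element symbols from hydrogen to argon (atomic numbers 1 to 18).
ELEMENTS_UP_TO_ARGON = {
    "H", "He", "Li", "Be", "B", "C", "N", "O", "F", "Ne",
    "Na", "Mg", "Al", "Si", "P", "S", "Cl", "Ar",
}

# Maximum number of non-hydrogen atoms stated for the dataset.
MAX_HEAVY_ATOMS = 5


def count_heavy_atoms(symbols):
    """Number of non-hydrogen atoms in a list of element symbols."""
    return sum(1 for symbol in symbols if symbol != "H")


def matches_stated_scope(molecule):
    """Check the composition, charge, and spin criteria only.

    `molecule` is assumed to be a QCSchema-style dict. Missing charge or
    multiplicity fields are treated as neutral singlet (sketch assumption).
    """
    symbols = molecule["symbols"]
    charge = molecule.get("molecular_charge", 0)
    multiplicity = molecule.get("molecular_multiplicity", 1)
    return (
        all(symbol in ELEMENTS_UP_TO_ARGON for symbol in symbols)
        and count_heavy_atoms(symbols) <= MAX_HEAVY_ATOMS
        and charge == 0
        and multiplicity == 1
    )


# Usage with hand-written molecules (geometry omitted; the filter ignores it).
water = {"symbols": ["O", "H", "H"]}
benzene = {"symbols": ["C"] * 6 + ["H"] * 6}
print(matches_stated_scope(water), matches_stated_scope(benzene))
```

Water passes because it has one heavy atom, while benzene is rejected because its six carbon atoms exceed the five-heavy-atom limit.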
Early access program
We invite organizations of all sizes to join the DFT Research Early Access Program (REAP) to explore the potential of our new Skala functional and accelerate innovation across industries through faster and more accurate density functional theory.
Work with us
Check out our open roles
