Discover an index of datasets, SDKs, APIs and open-source tools developed by Microsoft researchers and shared with the global academic community. These experimental technologies, available through Azure AI Foundry Labs, offer a glimpse into the future of AI innovation.
BAPO – Bounded Attention Prefix Oracle
This repository contains all scripts for reproducing the results of our paper “Lost in Transmission: When and Why LLMs Fail to Reason Globally”.
Microsoft Research Accurate Chemistry Collection (MSR-ACC)
The Skala functional will enable more accurate, scalable predictions in computational chemistry. It starts with the largest high-accuracy dataset ever built for training deep-learning-based density functional theory (DFT) models. This dataset underpins Skala—coming soon to…
Science Foundation Model
We develop the Science Foundation Model to empower natural scientists and accelerate breakthroughs in scientific discovery. As part of this effort, we introduce the sequence-based model, Nature Language Model (NatureLM). NatureLM is designed to span…
EfficientXLang
This codebase is the official implementation of “EfficientXLang: Towards Improving Token Efficiency Through Cross-Lingual Reasoning.”
EvoDiff
EvoDiff is a general-purpose diffusion framework that combines evolutionary-scale data with the distinct conditioning capabilities of diffusion models for controllable protein generation in sequence space. EvoDiff generates high-fidelity, diverse, and structurally plausible proteins that cover natural…