PadChest-GR: A bilingual grounded radiology reporting benchmark for chest X-rays
The world’s first multimodal, bilingual radiology dataset could reshape the way radiologists and AI systems make sense of X-rays. PadChest-GR, developed by the University of Alicante with Microsoft Research, has the potential to advance research…
Microsoft Research Accurate Chemistry Collection (MSR-ACC)
The Skala functional will enable more accurate, scalable predictions in computational chemistry. It starts with the largest high-accuracy dataset ever built for training deep-learning-based density functional theory (DFT) models. This dataset underpins Skala—coming soon to…
AI Testing and Evaluation: Learnings from Science and Industry
In the introductory episode of this new series, host Kathleen Sullivan and Senior Director Amanda Craig Deckard explore Microsoft’s efforts to draw on the experience of other domains to help advance the role of AI…
Learning from other domains to advance AI evaluation and testing
As generative AI becomes more capable and widely deployed, familiar questions from the governance of other transformative technologies have resurfaced. Which opportunities, capabilities, risks, and impacts should be evaluated? Who should conduct evaluations, and at…