Microsoft Research Blog

Learning from other domains to advance AI evaluation and testing

June 23, 2025 | Amanda Craig Deckard and Chad Atalla
As generative AI becomes more capable and widely deployed, familiar questions from the governance of other transformative technologies have resurfaced. Which opportunities, capabilities, risks, and impacts should be evaluated? Who should conduct evaluations, and at what stages of the technology lifecycle? What tests or measurements…

Recent Posts

  1. Research Focus: Week of May 7, 2025

    May 7, 2025

    In this issue: new research on compound AI systems and causal verification of the Confidential Consortium Framework; the release of Phi-4-reasoning; enriching tabular data with semantic structure; and more.

  2. Research Focus: Week of April 21, 2025

    April 23, 2025

    In this issue: our CHI 2025 and ICLR 2025 contributions, plus research on causal reasoning and LLMs; countering LLM jailbreak attacks; and how people use AI vs. AI alone. Also, Jim Weinstein, SVP of Microsoft Health, talks about rural healthcare innovation.
