| Kathleen Sullivan and Amanda Craig Deckard
In the introductory episode of this new series, host Kathleen Sullivan and Senior Director Amanda Craig Deckard explore Microsoft’s efforts to draw on the experience of other domains to help advance the role of AI testing and evaluation as a…
| Amanda Craig Deckard and Chad Atalla
As generative AI becomes more capable and widely deployed, familiar questions from the governance of other transformative technologies have resurfaced. Which opportunities, capabilities, risks, and impacts should be evaluated? Who should conduct evaluations, and at what stages of the technology…
Awards | HCI INTERNATIONAL
The HCII2025 Conference is proud to announce Susan Dumais as the 2025 recipient of the 'HCI MEDAL FOR SOCIETAL IMPACT'. Susan is a pioneering researcher whose work has reshaped information retrieval, search engines, and human-computer interaction, while her innovations have…
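Editor's note: Large language models have shown remarkable capabilities in code generation, but whether they can handle the task of implementing new features in real-world software development remains a key open question. To address this, Microsoft Research Asia and Peking University have jointly released FEA-Bench, the first benchmark focused on repository-level new feature implementation, filling an important gap in the evaluation landscape. The benchmark is built from pull requests in real open-source projects, covers more than 1,400 high-quality tasks, and systematically evaluates how mainstream large models perform on complex engineering tasks. F...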
| Rianne van den Berg, Jan Hermann, Christopher Bishop, and Paola Gori Giorgi
Microsoft researchers achieved a breakthrough in the accuracy of DFT, a method for predicting the properties of molecules and materials, by using deep learning. This work can lead to better batteries, green fertilizers, precision drug discovery, and more.
Scientific research is a continuous journey fueled by curiosity and collaboration, a conversation between scientists that often crosses continents and spans decades, with each new discovery inspired by and expanding on the work of others. The story of density functional…
| Li Lyna Zhang, Xian Zhang, Xueting Han, and Dongdong Zhang
New techniques are reimagining how LLMs reason. By combining symbolic logic, mathematical rigor, and adaptive planning, these methods enable models to tackle complex, real-world problems across a variety of fields.
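Editor's note: In traditional mechanical design and manufacturing workflows, parametric CAD files have long been the key bridge between concept and manufacturing. Yet engineers have long struggled with complex CAD feature trees and tedious modeling workflows. In recent years, with the rapid development of large language models (LLMs), AI has demonstrated powerful capabilities across many fields. This article introduces three recent studies from Microsoft Research Asia: FlexCAD, CADFusion, and CAD-Editor. Starting respectively from a unified modeling framework, a visual feedback mechanism, and natural language editing...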
World models are a key concept in AI, used to simulate how agents behave in virtual environments and enable immersive, interactive experiences. They’re not only transforming game and media generation but also opening new frontiers for using AI in complex,…