Microsoft Research Blog

SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests

May 11, 2026 | Tyler Payne, Will Epperson, Safoora Yousefi, Zachary Huang, Gagan Bansal, Wenyue Hua, Maya Murad, Asli Celikyilmaz, and Saleema Amershi

Using SocialReasoning-Bench, we observed a stable pattern across models: agents execute competently but fail to consistently improve the user’s position, even with explicit instructions to optimize for user interest.

Microsoft Research Blog

Building realistic electric transmission grid dataset at scale: a pipeline from open dataset

May 8, 2026 | Andrea Britto Mattos Lima, Thiago Vallin Spina, Weiwei Yang, Spencer Fowers, Ruslan Nagimov, and Baosen Zhang

Microsoft Research is excited to release an open dataset of approximate transmission topology of the U.S. power grid derived from publicly available data. The ability to study transmission-level power grid behavior is essential for modern power systems research. Analyses of…

Stories

When MRI images come into focus: How Tyger scales image reconstruction

May 7, 2026

Tyger moves the most demanding MRI processing to the cloud, helping researchers turn raw signals into readable images – meaning results in hours rather than days or weeks.

Articles

Whimsical Strategies Break AI Agents: Generating Out-of-Distribution Adversarial Strategies at Scale

May 6, 2026

By Zachary Huang, Tyler Payne, Gagan Bansal, Will Epperson, Wenyue Hua, Adam Fourney, Amanda Swearngin, Maya Murad, Ece Kamar, and Saleema Amershi. As AI agents are increasingly deployed to handle real transactions and negotiations, they can exhibit vulnerabilities that traditional safety testing struggles to fully capture. Our prior work on Magentic…

Microsoft Research Blog

Microsoft at NSDI 2026: Advances in large-scale networked systems

May 5, 2026 | Sujata Banerjee

Microsoft researchers share advances in building and operating large-scale distributed systems, spanning datacenters, networking, and the growing intersection with AI during NSDI ’26.

Articles

Webwright: A Terminal Is All You Need For Web Agents

May 4, 2026

By Yadong Lu (Microsoft Research), Lingrui Xu (The University of Hong Kong), Chao Huang (The University of Hong Kong), and Ahmed Awadallah (Microsoft Research). Instead of solving web tasks by predicting where to click one at a time, we only give the model a terminal where it has the…

Microsoft Research Blog

Red-teaming a network of agents: Understanding what breaks when AI agents interact at scale

April 30, 2026 | Gagan Bansal, Shujaat Mirza, Keegan Hines, Will Epperson, Zachary Huang, Whitney Maxwell, Pete Bryan, Tyler Payne, Adam Fourney, Amanda Swearngin, Wenyue Hua, Tori Westerhoff, Amanda Minnich, Maya Murad, Ece Kamar, Ram Shankar Siva Kumar, and Saleema Amershi

Safe agents don’t guarantee a safe ecosystem of interconnected agents. Microsoft Research examines what breaks when AI agents interact and why network-level risks require new approaches.

Articles

How can generative AI understand you better? IAI, a new interaction model, reshapes the human-AI collaboration paradigm

April 30, 2026

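As generative AI gradually enters workflows such as design, data analysis, and software development, and even everyday scenarios such as ordering food and shopping, people have begun to “converse” with AI frequently. Yet designers revise their prompts again and again and still cannot get close to the picture in their minds; data analysts struggle to refer precisely to a part of a chart in words; and programmers find it hard to make AI accurately understand a specific code structure through text alone. “Not being able to say it clearly” is becoming a widespread interaction bottleneck in the era of generative AI. Text prompts are flexible but inherently ambiguous, while GUI interaction is precise but limited in expressiveness; between the two, there has always been a missing bridge connecting user intent and...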

Articles

New at CHI | From tool to partner: Human-AI collaboration enters an era of “deep integration”

April 30, 2026

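Editor’s note: Welcome to the “New in Research” column! “New in Research” gathers the latest innovations and research news from Microsoft Research Asia. Here you can quickly browse the lab’s highlights and keep a keen sense of what is happening at the frontier. CHI, one of the most influential top international conferences in human-computer interaction, is being held this week in Barcelona, Spain. This issue of “New in Research” features six papers from Microsoft Research Asia accepted to the conference, showcasing cutting-edge explorations of generative AI in creative content creation, accessible interaction, and information visualization. In this issue: 1. Duo...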