Abstract teal-toned image of aluminum cans viewed from above, overlaid with a network of connected flowchart shapes (rectangles, rounded nodes, and diamonds) linked by thin lines.

Stories

OptiMind: When the system meets the floor

June 3, 2026

A three‑month pilot in a Midwestern bottling plant shows what happens when AI moves beyond chat and into decision-making, where constraints shift, stakes are real, and answers must hold.

Microsoft Research at BUILD 2026 | abstract pattern on a purple background

Stories

MSR at BUILD 2026

June 2, 2026

Microsoft Research is at BUILD 2026 this week, giving developers a hands-on look at some of the many AI-based models and tools they can use to accelerate innovation, enhance their capabilities, and quickly transform ideas into prototypes.

In the news | Microsoft Azure Blog

Announcing Microsoft Discovery general availability and Microsoft Discovery app preview

June 2, 2026

Today at Microsoft Build, we are announcing that Microsoft Discovery is now generally available for all organizations, providing a comprehensive platform for building and governing agentic AI workflows across scientific and engineering disciplines. We are also introducing the Microsoft Discovery app in…

Articles

VITRA Redefines VLA Pre-training Paradigms via Human Video Reconstruction

May 29, 2026

When you see robots participating in running races or performing folk dances on stage, you might envision a future where a simple natural language command is all it takes for a robot to tidy up a desk, clean a room,…

Articles

No architecture changes, no 3D data: How can reinforcement learning make video models truly "understand" the 3D world?

May 28, 2026

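With the rapid development of AI technology, many video foundation models can now generate visually polished short clips in a wide range of styles, but one fundamental problem remains unsolved: although these models are good at producing realistic-looking frames, they do not truly understand the three-dimensional world. When the camera rotates, pushes in, or orbits, buildings in the generated videos warp and deform, objects vanish into thin air, and spatial proportions are often inconsistent from one moment to the next. In other words, these models have learned the statistical regularities of 2D pixels but have not yet built a stable understanding of 3D space. To address this problem, Microsoft Research Asia has introduced a reinforcement-learning-based...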

Articles

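Is blind guessing on medical images unreliable? Two medical agent frameworks teach AI to "find evidence" and hold "multidisciplinary consultations"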

May 28, 2026

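"Doctor, is there actually anything wrong with my scan?" This is probably a question heard often in hospital consulting rooms. Faced with a complex medical image, a doctor must not only give a yes-or-no answer but also explain the basis for the diagnosis to the patient: What is this shadow? Why is a tumor suspected? Where exactly is the medical evidence? And for difficult or complex cases, specialists from several departments need to consult jointly to reach a more rigorous and accurate diagnosis. In recent years, vision-language models (VLMs) with image understanding capabilities have begun to show potential in medical diagnosis. But existing AI...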

Articles

RPG and RPG-Encoder: An intermediate representation tailor-made for repository-level AI engineering

May 28, 2026

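In recent years, many large models have become able to reliably write a single function or a single file from a natural language description. But extending this capability to "generating a complete repository from a high-level specification" or "forming a durable, usable global understanding of a real repository" is still at an early stage. These two seemingly independent directions in fact share the same underlying predicament: the lack of an intermediate representation suited to code repositories. Today's mainstream AI agent frameworks generally rely on three kinds of compensatory representations: It follows that each of these three representations is usable along some dimension, yet none of them constitutes one that has both semantic density and...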

Three minimalist white line icons on a textured blue‑green gradient background: a rising bar chart on the left, a central hub‑and‑spoke network diagram in the middle, and a checkmark inside a circle on the right.

Microsoft Research Blog

Data Formulator 0.7: AI-powered data analytics for enterprise data

May 28, 2026 | Chenglong Wang, Scott Tsukamaki, Michel Galley, and Jianfeng Gao

Data Formulator introduces AI-powered analytics for enterprise data workflows. Data teams can easily bring enterprise data into an AI-ready workspace where users can explore, analyze, and visualize data with AI agents to turn raw data into actionable insights.

Articles

AVGen-Bench: A systematic evaluation framework for next-generation text-to-audio-video models

May 28, 2026

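From text-to-image and text-to-video to text-to-audio-video (T2AV) generation, generative models are rapidly moving toward stronger multimodal expressiveness. At the same time, a key question is becoming more pressing: how exactly should we evaluate these models? Existing evaluations tend to focus on generation quality in a single modality and struggle to measure visuals, sound, synchronization, semantic control, and the ability to carry out complex tasks all at once. A model may generate "good-looking" video yet fail to keep audio and visuals consistent; it may generate "natural" sound yet fail to follow text instructions accurately...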