Microsoft Research Blog

Research Intern – Applied Speech Research

February 3, 2026

The Health and Life Sciences (HLS) group at Microsoft is dedicated to advancing healthcare and life sciences through innovative technology solutions, fostering collaboration, and driving impactful research to improve patient outcomes and streamline healthcare operations. We leverage various speech technologies to improve patient outcomes, reduce…
When Minutes Matter: Advancing Wildfire Early Detection with ALERTCalifornia

February 3, 2026 | Juan M. Lavista Ferres

Strengthening wildfire response takes more than any single institution, any single technology, or any single moment of heroism. It takes sustained collaboration between the people building new tools and the first responders relying on them under the harshest conditions.
Research Intern – Agentic Programming

February 2, 2026

Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides.…
Agentic Media

February 2, 2026 | Yun Wang

Communication-centered computing: Communication shapes how societies create, share, and preserve knowledge. Yet today’s digital tools remain organized around static formats that enforce rigid separations between creation and consumption, author and reader, expression and interpretation. These structural constraints fragment collaboration and limit how knowledge evolves over…
AgentRx: Diagnosing AI Agent Failures from Execution Trajectories

February 2, 2026

AI agents often fail in ways that are difficult to localize because executions are probabilistic, long-horizon, multi-agent, and mediated by noisy tool outputs. We address this gap by manually annotating failed agent runs and releasing a novel benchmark of 115 failed trajectories spanning structured API…
One Model, All Roles: Multi-Turn, Multi-Agent Self-Play Reinforcement Learning for Conversational Social Intelligence

February 2, 2026

This paper introduces OMAR: One Model, All Roles, a reinforcement learning framework that enables AI to develop social intelligence through multi-turn, multi-agent conversational self-play. Unlike traditional paradigms that rely on static, single-turn optimizations, OMAR allows a single model to role-play all participants in a conversation…
VibeVoice: Expressive Podcast Generation with Next-Token Diffusion

February 1, 2026

Generating long-form, multi-speaker conversational audio like podcasts poses significant challenges for traditional Text-to-Speech (TTS) systems, particularly in scalability, speaker consistency, and natural turn-taking. We present VibeVoice, a novel model designed to synthesize expressive, long-form speech with multiple speakers in a zero-shot manner. A core…
Training Large Reasoning Models Efficiently via Progressive Thought Encoding

February 1, 2026

Large reasoning models (LRMs) excel on complex problems but face a critical barrier to efficiency: reinforcement learning (RL) training requires long rollouts for outcome-based rewards, where autoregressive decoding dominates time and memory usage. While sliding-window cache strategies can bound memory, they disrupt long-context reasoning and…
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning

February 1, 2026

We introduce SeerAttention-R, a sparse attention framework specifically tailored for the long decoding of reasoning models. Extended from SeerAttention, SeerAttention-R retains the design of learning attention sparsity through a self-distilled gating mechanism, while removing query pooling to accommodate auto-regressive decoding. With a lightweight plug-in gating,…
Materials are Mission Critical to the Sustainability Transition: Accelerating Discovery of Materials for Sustainability

February 1, 2026 | Bichlien Nguyen

Artificial intelligence (AI) is rapidly transforming the field of materials science, offering a new paradigm to accelerate the discovery and design of sustainable materials. This article explores how AI-driven innovations are enabling breakthroughs across the material spectrum, from recyclable polymers and coolants to low-carbon cement and…
GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration

February 1, 2026

Text-to-video generation models have shown significant progress in recent years. However, they still struggle with generating complex dynamic scenes based on compositional text prompts, such as attribute binding for multiple objects, temporal dynamics associated with different objects, and interactions between objects. Our key motivation…
Routing Channel-Patch Dependencies in Time Series Forecasting with Graph Spectral Decomposition

February 1, 2026

Time series forecasting has attracted significant attention in the field of AI. Previous works have revealed that the Channel-Independent (CI) strategy improves forecasting performance by modeling each channel individually, but it often suffers from poor generalization and overlooks meaningful inter-channel interactions. Conversely, Channel-Dependent (CD) strategies…
