Microsoft Research Blog

  1. Research Intern – Applied Speech Research 

    February 3, 2026

    The Health and Life Sciences (HLS) group at Microsoft is dedicated to advancing healthcare and life sciences through innovative technology solutions, fostering collaboration, and driving impactful research to improve patient outcomes and streamline healthcare operations. We leverage various speech technologies to improve patient outcomes, reduce…

  2. Research Intern – Agentic Programming 

    February 2, 2026

    Research Interns put inquiry and theory into practice. Alongside fellow doctoral candidates and some of the world’s best researchers, Research Interns learn, collaborate, and network for life. Research Interns not only advance their own careers, but they also contribute to exciting research and development strides.…

  3. Agentic Media

    February 2, 2026 | Yun Wang

    Communication-centered computing: Communication shapes how societies create, share, and preserve knowledge. Yet today’s digital tools remain organized around static formats that enforce rigid separations between creation and consumption, author and reader, expression and interpretation. These structural constraints fragment collaboration and limit how knowledge evolves over…

  4. AgentRx: Diagnosing AI Agent Failures from Execution Trajectories 

    February 2, 2026

    AI agents often fail in ways that are difficult to localize because executions are probabilistic, long-horizon, multi-agent, and mediated by noisy tool outputs. We address this gap by manually annotating failed agent runs and releasing a novel benchmark of 115 failed trajectories spanning structured API…

  5. VibeVoice: Expressive Podcast Generation with Next-Token Diffusion 

    February 1, 2026

    Generating long-form, multi-speaker conversational audio like podcasts poses significant challenges for traditional Text-to-Speech (TTS) systems, particularly in scalability, speaker consistency, and natural turn-taking. We present VibeVoice, a novel model designed to synthesize expressive, long-form speech with multiple speakers in a zero-shot manner. A core…

  6. Training Large Reasoning Models Efficiently via Progressive Thought Encoding 

    February 1, 2026

    Large reasoning models (LRMs) excel on complex problems but face a critical barrier to efficiency: reinforcement learning (RL) training requires long rollouts for outcome-based rewards, where autoregressive decoding dominates time and memory usage. While sliding-window cache strategies can bound memory, they disrupt long-context reasoning and…

  7. SeerAttention-R: Sparse Attention Adaptation for Long Reasoning 

    February 1, 2026

    We introduce SeerAttention-R, a sparse attention framework specifically tailored for the long decoding of reasoning models. Extended from SeerAttention, SeerAttention-R retains the design of learning attention sparsity through a self-distilled gating mechanism, while removing query pooling to accommodate auto-regressive decoding. With a lightweight plug-in gating,…

  8. Materials are Mission Critical to the Sustainability Transition: Accelerating Discovery of Materials for Sustainability 

    February 1, 2026 | Bichlien Nguyen

    Artificial intelligence (AI) is rapidly transforming the field of materials science, offering a new paradigm to accelerate the discovery and design of sustainable materials. This article explores how AI-driven innovations are enabling breakthroughs across the material spectrum, from recyclable polymers and coolants to low-carbon cement and…

  9. GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration 

    February 1, 2026

    Text-to-video generation models have shown significant progress in recent years. However, they still struggle with generating complex dynamic scenes based on compositional text prompts, such as attribute binding for multiple objects, temporal dynamics associated with different objects, and interactions between objects. Our key motivation…

  10. Routing Channel-Patch Dependencies in Time Series Forecasting with Graph Spectral Decomposition 

    February 1, 2026

    Time series forecasting has attracted significant attention in the field of AI. Previous works have revealed that the Channel-Independent (CI) strategy improves forecasting performance by modeling each channel individually, but it often suffers from poor generalization and overlooks meaningful inter-channel interactions. Conversely, Channel-Dependent (CD) strategies…