Microsoft Research Blog

Training Large Reasoning Models Efficiently via Progressive Thought Encoding

February 1, 2026

Large reasoning models (LRMs) excel on complex problems but face a critical barrier to efficiency: reinforcement learning (RL) training requires long rollouts for outcome-based rewards, where autoregressive decoding dominates time and memory usage. While sliding-window cache strategies can bound memory, they disrupt long-context reasoning and…
SeerAttention-R: Sparse Attention Adaptation for Long Reasoning

February 1, 2026

We introduce SeerAttention-R, a sparse attention framework specifically tailored for the long decoding of reasoning models. Extended from SeerAttention, SeerAttention-R retains the design of learning attention sparsity through a self-distilled gating mechanism, while removing query pooling to accommodate auto-regressive decoding. With a lightweight plug-in gating,…
GenMAC: Compositional Text-to-Video Generation with Multi-Agent Collaboration

February 1, 2026

Text-to-video generation models have shown significant progress in recent years. However, they still struggle with generating complex dynamic scenes based on compositional text prompts, such as attribute binding for multiple objects, temporal dynamics associated with different objects, and interactions between objects. Our key motivation…
Routing Channel-Patch Dependencies in Time Series Forecasting with Graph Spectral Decomposition

February 1, 2026

Time series forecasting has attracted significant attention in the field of AI. Previous works have revealed that the Channel-Independent (CI) strategy improves forecasting performance by modeling each channel individually, but it often suffers from poor generalization and overlooks meaningful inter-channel interactions. Conversely, Channel-Dependent (CD) strategies…
[Meet-up] How Could AI Supply Chain Research Shape HCI Inquiries And Vice-Versa?

February 1, 2026

HCI research on AI has largely focused on end-user interactions or the practices of developers working to improve system performance. This meetup proposes to broaden that scope of interaction to encompass the material, political, and economic dynamics that shape AI across its lifecycle. Drawing on…
Materials are Mission Critical to the Sustainability Transition: Accelerating Discovery of Materials for Sustainability

February 1, 2026 | Bichlien Nguyen

Artificial intelligence (AI) is rapidly transforming the field of materials science, offering a new paradigm to accelerate the discovery and design of sustainable materials. This article explores how AI-driven innovations are enabling breakthroughs across the material spectrum, from recyclable polymers and coolants to low-carbon cement and…
interwhen: A Generalizable Framework for Verifiable Reasoning with Test-time Monitors

February 1, 2026

We present a test-time verification framework, interwhen, that ensures the output of a reasoning model is valid with respect to a given set of verifiers. Verified reasoning is an important goal in high-stakes scenarios such as deploying agents in the physical world or in domains such…
Text2Arch: A Dataset for Generating Scientific Architecture Diagrams from Natural Language Descriptions

February 1, 2026 | Shivank Garg, Sankalp Mittal, and Manish Gupta

Communicating complex system designs or scientific processes through text alone is inefficient and prone to ambiguity. A system that automatically generates scientific architecture diagrams from text with high semantic fidelity can be useful in multiple applications like enterprise architecture visualization, AI-driven software design, and educational…
TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models

February 1, 2026 | TrustGen Team and Jianfeng Gao

Generative foundation models (GenFMs), such as large language models and text-to-image systems, have demonstrated remarkable capabilities in various downstream applications. As they are increasingly deployed in high-stakes applications, assessing their trustworthiness has become both a critical necessity and a substantial challenge. Existing evaluation efforts are…
Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking

February 1, 2026

Unified Vision–Language Models (UVLMs) aim to advance multimodal learning by supporting both understanding and generation within a single framework. However, existing approaches largely focus on architectural unification while overlooking the need for explicit interaction between the two capabilities during task solving. As a result, current…
SUTRADHARA: An Intelligent Orchestrator-Engine Co-design for Tool-based Agentic Inference

February 1, 2026

Agentic applications are LLMs that iteratively invoke external tools to accomplish complex tasks. Such tool-based agents are rapidly becoming the dominant paradigm for deploying language models in production. Unlike traditional single-turn inference, agentic workloads chain together multiple LLM calls and tool executions before producing a…
RedCodeAgent: Automatic Red-teaming Agent against Diverse Code Agents

February 1, 2026

Code agents have gained widespread adoption due to their strong code generation capabilities and integration with code interpreters, enabling dynamic execution, debugging, and interactive programming capabilities. While these advancements have streamlined complex workflows, they have also introduced critical safety and security risks. Current static safety…
