Microsoft Research Blog

From User Surveys to Telemetry-Driven Agents: Exploring the Potential of Personalized Productivity Solutions

October 18, 2025

We present a comprehensive, user-centric approach to understand preferences in AI-based productivity agents and develop personalized solutions tailored to users' needs. Utilizing a two-phase method, we first conducted a survey with 363 participants, exploring various aspects of productivity, communication style, agent approach, personality traits, personalization,…
VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents

October 18, 2025

A key challenge in training Vision-Language Model (VLM) agents, compared to Language Model (LLM) agents, lies in the shift from textual states to complex visual observations. This transition introduces partial observability and demands robust world modeling. We ask: Can VLM agents construct internal world models…
Distant conversational speech recognition: Challenges and Opportunities

October 17, 2025 | Dr. Samuele Cornell and Sunit Sivasankaran

State-of-the-art ASR systems excel on close-talk benchmarks but struggle with far-field conversational speech, where error rates remain above 20%. Current benchmark datasets inadequately assess generalization across domains and real-world conditions, often relying on oracle segmentation that yields overly optimistic results. Distant ASR (DASR) faces unique…
Ultra Ethernet for next-generation AI and HPC workloads

October 17, 2025 | Torsten Hoefler, Abdul Kabbani, and Sujata Banerjee

The Ultra Ethernet Consortium set out to redefine Ethernet-based interconnects for AI and high-performance computing (HPC), culminating in the recent release of its first specification (version 1.0). This talk will highlight key innovations that distinguish Ultra Ethernet from existing solutions, ranging from lossy operation—both with…
BRAIN SIGNALS TO ACTION: Monitoring and Explaining User Cognitive Load with Foundation Models

October 17, 2025 | Deeksha Moodasarige Shama and Dimitra Emmanouilidou

Passive monitoring of cognitive load can enable personalized user experiences and even accelerate human learning by leveraging closed-loop adaptive training systems. Electroencephalography (EEG) provides a cost-effective, non-invasive window into brain activity, yet conventional methods struggle with cross-subject variability. Leveraging the power of large pretrained brain…
IronDict: Transparent Dictionaries from Polynomial Commitments

October 17, 2025 | Hossein Hafezi and Melissa Chase

We present IronDict, a transparent dictionary construction based on polynomial commitment schemes. Transparent dictionaries enable an untrusted server to maintain a mutable dictionary and provably serve clients lookup queries. A major open challenge is supporting efficient auditing by lightweight clients. Previous solutions either incurred high…
Lattice-Based Accumulator and Application to Anonymous Credential Revocation

October 17, 2025 | Victor Youdom Kemmoe and Betül Durak

An accumulator is a cryptographic system for compactly representing a set of elements such that every element in the set has a short membership witness. A dynamic accumulator, furthermore, allows elements to be added to and deleted from the accumulator. Camenisch and Lysyanskaya (CRYPTO’02) constructed…
FOA Tokenizer: Learning Discrete Representations of Spatial Audio with Multichannel VQ-GAN

October 17, 2025 | Parthasaarathy Sudarsanam and Hannes Gamper

Spatial audio captures the directional and environmental characteristics of sound, enabling immersive listening experiences. First-Order Ambisonics (FOA) provides a compact representation of spatial audio by encoding the sound field’s directional components across four channels, allowing full-scene coverage independent of microphone array geometry. A key advantage…
Efficient Secure Aggregation for Federated Learning

October 17, 2025 | Varun Madathil and Melissa Chase

Federated Learning (FL) trains a global model by having each selected device push only its model update to a central server, keeping raw data local. However, those updates can still leak sensitive information unless the server learns only their sum. A naïve approach is to run…
Cost-Aware Retrieval-Augmentation Reasoning Models with Adaptive Retrieval Depth

October 17, 2025 | Helia Hashemi, Victor Ruehle, and Saravan Rajmohan

Reasoning models have gained significant attention due to their strong performance, particularly when enhanced with retrieval augmentation. However, these models often incur high computational costs, as both retrieval and reasoning tokens contribute substantially to the overall resource usage. In this work, we make the following…
Microsoft Research Asia — StarLeap Program

October 16, 2025

The StarLeap Program, launched by Microsoft Research Asia (MSRA), is designed to provide exceptional students with the opportunity to collaborate with multiple research teams at MSRA and to address real-world, frontier research challenges. Since its establishment in January 2021, the program has received enthusiastic responses…
Towards a Responsible AI Organizational Maturity Model

October 16, 2025

Artificial intelligence (AI) holds tremendous potential but also poses consequential risks. Regulation frameworks like the EU AI Act aim to mitigate these risks, yet organizations struggle to understand and operationalize Responsible AI (RAI). We introduce the RAI Organizational Maturity (RAI-OM) framework as an initial step…
