Microsoft Research Blog

Artificial intelligence

  1. Improving Language Agents Through BREW 

    September 29, 2025

Large Language Model (LLM)-based agents are increasingly applied to tasks requiring structured reasoning, tool use, and environmental adaptation, such as data manipulation, multi-step planning, and computer-use automation. However, despite their versatility, current training paradigms built on model weight optimization methods, such as PPO and GRPO, remain relatively…
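The GRPO mentioned above refers to group-relative policy optimization, in which advantages are computed relative to a group of sampled rollouts rather than from a learned value function. A minimal sketch of that advantage computation (illustrating GRPO's group baseline only, not the BREW method, which the teaser does not detail):

```python
# Sketch: group-relative advantage computation in the style of GRPO.
# For each prompt, several rollouts are sampled; each rollout's advantage
# is its reward standardized against the group's mean and std.

def group_relative_advantages(rewards, eps=1e-8):
    """rewards: scalar rewards for the sampled rollouts of one prompt."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Rollouts with higher-than-average reward get positive advantages.
advs = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

These advantages then weight the policy-gradient update in place of a critic's value estimates.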

  2. STACKFEED: Structured Textual Actor-Critic Knowledge Base Editing with Feedback 

    September 22, 2025

Large Language Models (LLMs) are increasingly used for complex software engineering tasks but often generate incorrect or outdated code. Retrieval-Augmented Generation systems attempt to solve this by using external knowledge bases (KBs) such as API documentation, but in the fast-paced world of software development, this documentation…
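Retrieval-Augmented Generation, as described here, grounds a model's answer in passages retrieved from the KB. A minimal sketch of that retrieval step over a toy API-documentation KB, using keyword overlap as a stand-in relevance score (the KB entries and scoring are illustrative, not STACKFEED's actual pipeline, which edits the KB itself):

```python
# Sketch: keyword-overlap retrieval over a toy API-documentation KB.
# Real RAG systems typically score relevance with dense embeddings;
# word overlap stands in here to keep the example self-contained.

KB = [
    "requests.get(url, params=None) sends an HTTP GET request",
    "json.loads(s) parses a JSON string into Python objects",
    "pathlib.Path.read_text() reads a file's contents as a string",
]

def retrieve(query, kb, k=1):
    # Score each doc by how many query words it shares, keep the top k.
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(doc.lower().split())), doc) for doc in kb]
    scored.sort(key=lambda t: t[0], reverse=True)
    return [doc for score, doc in scored[:k] if score > 0]

def build_prompt(query, kb):
    # Prepend the retrieved passages so the model answers from the KB.
    context = "\n".join(retrieve(query, kb))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

If the KB entry is outdated, the generated answer inherits that staleness, which is the failure mode the post goes on to address.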

  3. Learning from other domains to advance AI evaluation and testing 

    August 11, 2025 | Office of Responsible AI

Drawing on our analysis of eight case studies prepared by independent academic and industry experts, this white paper proposes next steps for addressing AI evaluation and testing challenges and opportunities by: synthesizing insights from the eight case studies, also published separately, and extracting lessons relevant…

  4. Closed-loop optimization using machine learning for the accelerated design of sustainable cements incorporating algal biomatter 

    July 7, 2025 | Meng-Yen Lin, Kristen Severson, Paul Grandgeorge, and Eleftheria Roumeli

The substantial embodied carbon of cement, coupled with the ever-increasing demand for construction materials, motivates the development of more sustainable cementitious materials. An emerging strategy for mitigating CO2 emissions involves incorporating carbon-negative biomatter; however, this introduces new challenges due to complex hydration-strength relationships and the combinatorial…

  5. Scaling Textual Gradients via Sampling-Based Momentum 

    June 1, 2025

As prompts play an increasingly critical role in large language models (LLMs), optimizing textual prompts has become a central challenge. The Textual Gradient Descent (TGD) framework has emerged as a promising data-driven approach that iteratively refines textual prompts using LLM-suggested updates (or textual…
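A TGD-style loop treats an LLM critique as the "gradient": the critic reads the current prompt and its failures, proposes a revision, and the revision is kept only if it scores better. A minimal sketch of that loop structure, with the scorer and critic replaced by stand-ins so it runs self-contained (this illustrates plain TGD, not the sampling-based momentum the post adds):

```python
# Sketch: a Textual-Gradient-Descent-style refinement loop. In a real
# system `critic` would be an LLM call that reads failure cases and
# rewrites the prompt; both scorer and critic are stand-ins here.

def score(prompt, requirements):
    # Stand-in metric: fraction of required instructions the prompt covers.
    return sum(req in prompt for req in requirements) / len(requirements)

def critic(prompt, requirements):
    # Stand-in "textual gradient": add one missing requirement, if any.
    missing = [req for req in requirements if req not in prompt]
    return prompt + " " + missing[0] if missing else prompt

def tgd(prompt, requirements, steps=10):
    best, best_score = prompt, score(prompt, requirements)
    for _ in range(steps):
        candidate = critic(best, requirements)
        s = score(candidate, requirements)
        if s > best_score:           # accept only improving updates
            best, best_score = candidate, s
        if best_score == 1.0:
            break
    return best, best_score
```

The accept-if-better step is what makes the procedure a descent on the evaluation metric rather than an open-ended rewrite.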