Microsoft Research Blog

Evaluating Proactive AI Mediators in Multi-Party Conversation with ProMediate 

By Ziyi Liu, Bahar Sarrafzadeh, Pei Zhou, Longqi Yang, Ashish Sharma

Imagine you are in a high-stakes group discussion, stuck in a circular argument with no consensus in sight. Now, imagine an AI agent sitting at that table. Unlike traditional tools that wait for a prompt, this agent proactively intervenes at the perfect moment with a breakthrough suggestion. This scenario represents the emerging shift from passive AI assistants to proactive team collaborators. 

As LLMs evolve toward handling multi-party teamwork, they are increasingly expected to navigate complex group dynamics. However, current research often overlooks the nuance of these interactions. While agents are being designed for multi-party settings, we still lack the benchmarks to evaluate the real-time effectiveness of their interventions or their broader socio-cognitive intelligence when working with groups of people. 

In real-world dynamics—like a high-stakes budget negotiation—success is not just about the final outcome; it is about navigating hidden interests, managing “negotiation fatigue,” and knowing exactly when to speak up to break a deadlock. To address these gaps, we introduce ProMediate: a new framework from Microsoft Office of Applied Research designed to evaluate the next generation of proactive agents in multi-party negotiations.

ProMediate moves beyond static benchmarks by introducing a dynamic architecture that evaluates how agents handle the social nuances of human interaction. The framework consists of two integrated parts:

  1. Real Case Scenarios: The framework utilizes a collection of high-stakes, multi-issue negotiation cases. These scenarios are built upon asymmetric information and conflicting interests, requiring participants to navigate a complex bargaining space.  
  2. Proactive Conversation Simulation: ProMediate features a simulation environment with a plug-and-play proactive mediator role. The environment uses LLM-based agents to mimic human negotiators, complete with distinct personas and mental state trajectories. The mediator monitors the dialogue and determines the optimal intervention tempo, deciding not only what to say but exactly when to intervene to steer the group toward consensus.
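To make the "plug-and-play" idea concrete, here is a minimal Python sketch of a simulation loop in which any object with a `maybe_intervene` method can fill the mediator role. This is an illustration under stated assumptions, not ProMediate's actual code: the names `Turn`, `Mediator`, `EveryNTurnsMediator`, and `simulate` are hypothetical, the negotiators are plain callables standing in for LLM-based agents, and the fixed "every n turns" timing rule is a toy placeholder for a real intervention policy.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Protocol


@dataclass
class Turn:
    speaker: str
    text: str


class Mediator(Protocol):
    """Hypothetical interface: anything with this method can be plugged in."""

    def maybe_intervene(self, history: List[Turn]) -> Optional[str]:
        """Return a message to inject, or None to stay silent."""
        ...


class EveryNTurnsMediator:
    """Toy timing policy (an assumption): speak after n negotiator turns."""

    def __init__(self, n: int) -> None:
        self.n = n

    def maybe_intervene(self, history: List[Turn]) -> Optional[str]:
        # Count negotiator turns since the mediator last spoke.
        since_last = 0
        for turn in reversed(history):
            if turn.speaker == "mediator":
                break
            since_last += 1
        if since_last >= self.n:
            return "Let's summarize where each side stands."
        return None


def simulate(
    negotiators: Dict[str, Callable[[List[Turn]], str]],
    mediator: Mediator,
    rounds: int,
) -> List[Turn]:
    """Run the dialogue; after every negotiator turn the mediator decides
    whether to step in (the 'when') and with what message (the 'what')."""
    history: List[Turn] = []
    for _ in range(rounds):
        for name, respond in negotiators.items():
            history.append(Turn(name, respond(history)))
            message = mediator.maybe_intervene(history)
            if message is not None:
                history.append(Turn("mediator", message))
    return history
```

Because the loop only depends on the `maybe_intervene` method, a generic mediator and a socially intelligent one could be swapped in without touching the rest of the environment, which is the property the plug-and-play design is meant to provide.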

How does our evaluation framework capture the dynamics of group consensus change?

Instead of focusing only on the final outcome, ProMediate tracks group decision-making as it unfolds. It extracts each participant's attitude on each topic and calculates an agreement score across all parties at every step. The result is a consensus trend with richer signal than a single end-of-conversation measurement: researchers can directly observe where consensus rises or falls, and whether a mediator's intervention improves it.
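The per-step agreement score can be sketched as follows. This is a minimal illustration, not the metric defined in the paper: it assumes each attitude has already been extracted as a number in [-1, 1] (strongly against to strongly in favor), and it scores agreement as one minus the normalized mean pairwise gap, averaged over topics. The function names `agreement_score` and `consensus_trend` are hypothetical.

```python
from itertools import combinations
from statistics import mean
from typing import Dict, List

# Assumed representation: {participant: {topic: stance in [-1, 1]}}
Attitudes = Dict[str, Dict[str, float]]


def agreement_score(attitudes: Attitudes) -> float:
    """Return a score in [0, 1]; 1.0 means every pair agrees on every topic."""
    topics = sorted({t for stances in attitudes.values() for t in stances})
    per_topic = []
    for topic in topics:
        values = [s[topic] for s in attitudes.values() if topic in s]
        pairs = list(combinations(values, 2))
        if not pairs:
            continue  # fewer than two participants have a stance on this topic
        mean_gap = mean(abs(a - b) for a, b in pairs)
        # The largest possible gap on a [-1, 1] scale is 2.
        per_topic.append(1.0 - mean_gap / 2.0)
    return mean(per_topic) if per_topic else 0.0


def consensus_trend(snapshots: List[Attitudes]) -> List[float]:
    """Agreement score at each step of the conversation."""
    return [agreement_score(s) for s in snapshots]
```

Plotting the output of `consensus_trend` against the turn index, with mediator turns marked, gives the kind of view described above: where consensus goes up or down, and what happens right after an intervention.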

Socio-cognitive level evaluation 

Consensus tracking is a conversation-level evaluation. We also evaluate the mediator's own behavior in terms of socio-cognitive intelligence. Relying solely on consensus change might not reveal the mediator's full capability, because a mediator may try to facilitate the negotiation while the negotiators simply do not follow. We therefore assess the mediator's behavior along four dimensions: perceptual differences, negative emotions, cognitive challenges, and communication breakdown.

Socially intelligent agent vs. generic agent

A generic mediator is essentially a general chat-room agent: it joins the conversation and responds, but without any specialized playbook. A socially intelligent mediator, on the other hand, is equipped with mediation-specific skills in thinking and strategic planning. We compare the two mediator agents across different difficulty settings.

The difference shows up clearly in ProMediate’s hardest setting, where participants are actively competing. The socially intelligent mediator produced meaningfully larger gains in consensus than the generic baseline, helping the group close more ground toward agreement. It was also noticeably faster to step in, catching friction points before they hardened into deadlock. In short, it didn’t just talk better — it acted sooner and hit the right moments. Social intelligence, it turns out, is what separates a chat-room bystander from a mediator that actually moves the needle. 

Looking ahead 

As LLMs continue to expand in capability, the advantages of proactive, socio-cognitive mediation are likely to compound. Our results indicate that while advanced reasoning models show promise, the ability to successfully navigate a multi-party negotiation depends on more than just raw scale; it requires a specialized interaction strategy that can decode the social “pulse” of a room. 

For teams building intelligent features on collaborative platforms, from AI meeting assistants to automated conflict resolution tools, this work offers a clear message: the architecture of a mediator's proactivity matters as much as the model's size. By using the ProMediate metrics as high-signal reward labels, we can transition from simple prompting to sophisticated training. This allows us to develop agents that learn not just what to say but, critically, when to say it within the complex rhythm of multi-turn interactions.
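One way to picture consensus metrics serving as reward labels is to score each mediator intervention by the change in consensus around it. The sketch below is an assumption for illustration only, not a training recipe from the paper: `intervention_rewards` and its `horizon` parameter are hypothetical, and the input is taken to be a list of per-turn consensus scores plus the indices of the mediator's turns.

```python
from typing import List, Tuple


def intervention_rewards(
    consensus: List[float],
    mediator_turns: List[int],
    horizon: int = 1,
) -> List[Tuple[int, float]]:
    """Label each mediator turn with (turn index, consensus change).

    The change is measured from the turn just before the intervention to
    `horizon` turns after it, clipped to the end of the conversation.
    """
    rewards = []
    for t in mediator_turns:
        if t < 1 or t >= len(consensus):
            continue  # no "before" score, or index out of range
        before = consensus[t - 1]
        after = consensus[min(t + horizon, len(consensus) - 1)]
        rewards.append((t, after - before))
    return rewards
```

A positive label marks an intervention that was followed by rising consensus and a negative label marks one followed by a drop, so the signal depends on when the mediator spoke as well as on what it said. A longer `horizon` gives the group more turns to react before the intervention is scored.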

As the digital environments where we collaborate grow more complex, the need for principled, socially aware intervention will only become more vital. The core insight of our work is that effective mediation is not just about the final deal: it is about the socio-cognitive intelligence required to guide the journey from conflict to consensus. 

Read the full paper: [2510.25224] ProMediate: A Socio-cognitive framework for evaluating proactive agents in multi-party negotiation
