This project investigates critical safety challenges in large-scale deployments of AI agents, focusing on privacy leakage and collusion risks in multi-agent environments. As agents collaborate and negotiate across complex tasks, they may unintentionally expose sensitive information or coordinate in ways that misalign with human values. The research develops a simulation testbed to analyze these behaviors, introduces dynamic privacy protocols, and explores how scaling agent interactions amplifies risk. Outcomes include a taxonomy of collusion patterns, mitigation strategies, and design principles for safer, more transparent, and trustworthy multi-agent systems, informing future AI safety standards and governance.
This research is conducted through the Agentic AI Research and Innovation (AARI) Initiative, which explores the next frontier of agentic systems via Grand Challenges pursued jointly by Microsoft Research and the academic community.
People
Jianxun Lian
Principal Researcher
Beibei Shi
Principal Research PM
Yule Wen
Undergraduate Student
Tsinghua University
Xing Xie
Assistant Managing Director
Diyi Yang
Assistant Professor
Stanford University
Xiaoyuan Yi
Researcher
Yanzhe Zhang
PhD Student
Stanford University