SocialReasoning-Bench: Measuring whether AI agents act in users’ best interests
Using SocialReasoning-Bench, we observed a stable pattern across models: agents execute competently but fail to consistently improve the user’s position, even when explicitly instructed to optimize for the user’s interest.
Building realistic electric transmission grid dataset at scale: a pipeline from open dataset
Microsoft Research is excited to release an open dataset of the approximate transmission topology of the U.S. power grid, derived from publicly available data. The ability to study transmission-level power grid behavior is essential for modern…
Principal Applied Scientist
As a Principal Applied Scientist, you’ll lead the science behind Discover’s ranking, user understanding, and content understanding stack, combining LLMs, multimodal models, and large‑scale recommender systems to drive measurable gains in engagement, satisfaction, and trust.…
Language & Voice AI for Africa: From Data to Deployment and Impact
This seminar explores how language and voice AI systems can be built and scaled for African contexts—from community-driven data collection and multilingual foundation models to robust deployment and real-world applications across sectors such as agriculture,…
Research Intern – AI Agents & Efficiency
Join us to push the frontier of AI efficiency and agentic systems. At M365 Research, we work at the intersection of cutting-edge research and product impact at global scale. We partner with research and product…
Research Intern – AI Frontiers
The AI Frontier Lab at Microsoft Research is seeking candidates to advance the state of the art in agentic model capabilities — creating models and agents that can reliably perform tasks across digital systems on…