News & features
By Yadong Lu, Senior Researcher; Jianwei Yang, Principal Researcher; Yelong Shen, Principal Research Manager; Ahmed Awadallah, Partner Research Manager
Recent advancements in large vision-language models (VLMs), such as GPT-4V and GPT-4o, have demonstrated considerable promise in driving intelligent agent systems…
Microsoft Research Forum Episode 4: The future of multimodal models, a new “small” language model, and other AI updates
Explore multimodal & small language models, plus advanced benchmarks for AI evaluation. Microsoft researchers are working on breakthroughs in weather prediction, materials design, even a new kind of computer for AI inference and hard optimization problems.
Eureka: Evaluating and understanding progress in AI
Vidhisha Balachandran, Jingya Chen, Neel Joshi, Besmira Nushi, Hamid Palangi, Eduardo Salinas, Vibhav Vineet, James Woffinden-Luey, and Safoora Yousefi
How can we rigorously evaluate and understand state-of-the-art progress in AI? Eureka is an open-source framework for standardizing evaluations of large foundation models, beyond single-score reporting and rankings. Learn more about the extended findings.
Direct Nash Optimization: Teaching language models to self-improve with general preferences
This talk discusses teaching language models to self-improve using a preference oracle like GPT-4, framing it as a two-player game to find an optimal policy at a Nash equilibrium, and achieving state-of-the-art win rates against GPT-4 Turbo on benchmarks such…
In this episode, learn about the latest multimodal AI models, advanced benchmarks for AI evaluation and model self-improvement, and an entirely new kind of computer for AI inference and hard optimization. Discover how these research breakthroughs and more can help…
Tracing the path to self-adapting AI agents
Ching-An Cheng, Adith Swaminathan, and Allen Nie
Introducing Trace, Microsoft and Stanford University’s novel AI optimization framework, now available as a Python library. Trace adapts dynamically and optimizes a wide range of applications from language models to robot control.
Abstracts: July 18, 2024
Gretchen Huizinga and Arindam Mitra
Senior Researcher Arindam Mitra introduces AgentInstruct. Using raw data sources, the automated multi-agent framework can create diverse, high-quality synthetic data at scale for the post-training of small and large language models.
In the news | Microsoft News Center
Why AI sometimes gets it wrong — and big strides to address it
Around the time GPT-4 was making headlines for acing standardized tests, Microsoft researchers and collaborators were putting other AI models through a different type of test — one designed to make the models fabricate information.
Introducing AutoGen Studio: A low-code interface for building multi-agent workflows
Victor Dibia, Gagan Bansal, Jingya Chen, Suff Syed, Adam Fourney, Erkang (Eric) Zhu, Chi Wang, and Saleema Amershi
AutoGen Studio, built on Microsoft’s flexible open-source AutoGen framework for orchestrating AI agents, provides an intuitive, user-friendly interface that enables developers to rapidly build, test, customize, and share multi-agent AI solutions with little or no coding.