CalcLM: Agent Grid
A prototype for experimenting with agents in a grid UI: a simple, open experiment that illustrates how agent workflows might live in a grid-like surface.
Magentic Marketplace
Magentic Marketplace is an open-source simulation environment for exploring the numerous possibilities of agentic markets and their societal implications at scale. It provides a foundation for studying these markets and guiding them toward outcomes that benefit everyone.
MMCTAgent
MMCTAgent (Multi-modal Critical Thinking Agent) is a state-of-the-art multi-modal AI framework that brings human-like critical thinking to visual reasoning tasks. It combines advanced planning, self-critique, and tool-based reasoning to deliver superior performance in complex image…
MMCTAgent: Enabling multimodal reasoning over large video and image collections
MMCTAgent enables dynamic multimodal reasoning with iterative planning and reflection. Built on Microsoft’s AutoGen framework, it integrates language, vision, and temporal understanding for complex tasks like long video and image analysis.
BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI
BlueCodeAgent is an end-to-end blue-teaming framework built to boost code security using automated red-teaming processes, data, and safety rules to guide LLMs’ defensive decisions. Dynamic testing reduces false positives in vulnerability detection.
ODSP Applied Science
Shaping the future of content and collaboration in OneDrive and SharePoint. The ODSP Applied Science team is shaping the future of content and collaboration by transforming OneDrive and SharePoint into a planet-scale, AI-native knowledge substrate,…
Adapting Web Agents with Synthetic Supervision
Introducing Magentic Marketplace, an open-source simulation environment for studying agentic markets
This video gives a brief introduction to Magentic Marketplace, an open-source platform for simulating agent-based markets. You’ll learn about its main features, how to start a simulation using the command line with built-in datasets, and…