News & features
Rethinking imitation learning with Predictive Inverse Dynamics Models
| Pallavi Choudhury, Lukas Schäfer, Chris Lovett, Katja Hofmann, and Sergio Valcarcel Macua
This research looks at why Predictive Inverse Dynamics Models often outperform standard Behavior Cloning in imitation learning. By using simple predictions of what happens next, PIDMs reduce ambiguity and learn from far fewer demonstrations.
UniRG: Scaling medical imaging report generation with multimodal reinforcement learning
| Sheng Zhang, Flora Liu, Guanghui Qin, Mu Wei, and Hoifung Poon
AI can help generate medical image reports, but today’s models struggle with varying reporting schemes. Learn how UniRG uses reinforcement learning to boost the performance of medical vision-language models.
In the news | Association for Computing Machinery
Madanlal Musuvathi named ACM Fellow
Madanlal was selected by his peers for the development of methods in concurrency verification and testing, and machine learning systems design.
Multimodal reinforcement learning with agentic verifier for AI agents
| Reuben Tan, Baolin Peng, Zhengyuan Yang, Oier Mees, and Jianfeng Gao
Argos improves multimodal RL by evaluating whether an agent’s reasoning aligns with what it observes over time. The approach reduces visual hallucinations and produces more reliable, data-efficient agents for real-world applications.
OptiMind: A small language model with optimization expertise
| Xinzhi Zhang, Zeyi Chen, Humishka Hope, Hugo Barbalho, Konstantina Mellou, Marco Molinaro, Janardhan (Jana) Kulkarni, Ishai Menache, and Sirui Li
OptiMind is a small language model that converts business operation challenges, described in natural language, into mathematical formulations that optimization software can solve. It reduces formulation time and errors, and it enables fast, privacy-preserving local use.
In the news | Microsoft Research Blog
3D Telecommunications Goes Open Source
Microsoft Research is open-sourcing its cutting‑edge 3D Holoportation technology that was used to transform healthcare by bringing doctors and patients together across vast distances. The work highlights real‑time, life‑size 3D telepresence that allows clinicians to meet, consult, and plan care…
BlueCodeAgent: A blue teaming agent enabled by automated red teaming for CodeGen AI
| Chengquan Guo, Yuzhou Nie, Chulin Xie, Zinan Lin, Wenbo Guo, and Bo Li
BlueCodeAgent is an end-to-end blue-teaming framework built to boost code security using automated red-teaming processes, data, and safety rules to guide LLMs’ defensive decisions. Dynamic testing reduces false positives in vulnerability detection.
Awards | ACM SIGMICRO
Esha Choukse receives 2025 SIGMICRO Early Career Award
Choukse was recognized for her foundational contributions to hardware memory compression and to sustainable and efficient datacenter systems.
RedCodeAgent: Automatic red-teaming agent against diverse code agents
| Chengquan Guo, Chulin Xie, Yu Yang, Zhaorun Chen, Zinan Lin, Xander Davies, Yarin Gal, Dawn Song, and Bo Li
Code agents help streamline software development workflows, but may also introduce critical security risks. Learn how RedCodeAgent automates and improves “red-teaming” attack simulations to help uncover real-world threats that other methods overlook.