Research Focus: Week of May 7, 2025
In this issue: New research on compound AI systems and causal verification of the Confidential Consortium Framework; release of Phi-4-reasoning; enriching tabular data with semantic structure, and more.
Developers spend a lot of time debugging code. Learn how debug-gym can equip AI agents to help, enabling them to set breakpoints, navigate the codebase, and print runtime variable values on demand, so they better understand the code and its execution flow.
In this issue, we examine a new conversation segmentation method that delivers more coherent and personalized agent conversation, and we review efforts to improve MLLMs' understanding of geologic maps. Check out the latest research and other updates.
In this issue: A new approach to multimodal pretraining for remote sensing; Managed-retention memory for the AI era; Improving detection of macular telangiectasia type 2; Generalizing symbolic automata.
In this edition: Privacy enhancements for multiparty deep learning; using smaller, open-source models to provide relevance judgments; new tool uses AI, data to automate innovation and development; Yasuyuki Matsushita named IEEE 2025 Computer Society Fellow.
AIOpsLab is an open-source framework designed to evaluate and improve AI agents for cloud operations, offering standardized, scalable benchmarks for real-world testing, enhancing cloud system reliability.
Holistic motion-capture calibration technique without calibration, manual intervention or custom hardware; Research on AI agents for autonomous clouds; Automating proof-oriented program construction; One-to-many testing for natural language code generation.
New Research | FLASH: Workflow automation agent for diagnosing recurring incidents; METAREFLECTION: Learning instructions for language agents using past reflections; Boosting LLM training efficiency through faster communication between GPUs; and more.
Simplifying secure decision tree training; Improving accuracy of audio content detection; A novel neurosymbolic system for converting text to tables; New video series: AI for Business Transformation; TEE security protections for container workloads.
Investigating vulnerabilities in LLMs; A novel total-duration-aware (TDA) duration model for text-to-speech (TTS); Generative expert metric system through iterative prompt priming; Integrity protection in 5G fronthaul networks.
RUBICON evaluates AI-driven conversations and improves their quality by learning detailed domain-specific rubrics from minimal data. It gathers insights on AI assistant performance while maintaining user privacy and data security.
MicroCode offers an affordable way to program the BBC micro:bit without needing an internet connection, fostering exploratory learning.
Meet our community of researchers, learn about exciting research topics, and grow your network
Ongoing conversations at the cutting edge of research
Join us for a continuous exchange of ideas about research in the era of general AI