MedFuzz: Exploring the robustness of LLMs on medical challenge problems
MedFuzz tests LLMs by breaking benchmark assumptions, exposing vulnerabilities to bolster real-world accuracy.
GraphRAG uses LLM-generated knowledge graphs to substantially improve complex Q&A compared with standard retrieval-augmented generation (RAG). Discover automatic tuning of GraphRAG for new datasets, making it more accurate and relevant.
Researchers and their collaborators are drawing inspiration from the brain to develop more sustainable AI models. Projects like CircuitNet and CPG-PE improve performance and energy efficiency by mimicking the brain's neural patterns.
Learn what’s next for AI at Research Forum on Sept. 3; WizardArena simulates human-annotated chatbot games; MInference speeds pre-filling for long-context LLMs via dynamic sparse attention; Reef: Fast succinct non-interactive zero-knowledge regex proofs.
In this issue: Research Forum Ep. 4 explores multimodal AI. Registration is now open; Surveying developers’ AI needs; SuperBench improves cloud AI infrastructure reliability; Virtual Voices: Exploring factors influencing participation in virtual meetings.
Microsoft researchers collaborated to release new pathology foundation models. Their report shows models benefit from diverse data, increased model size, and specialized algorithms to enhance the accuracy and applicability of cancer diagnosis and treatment.
Designed for interactive storytelling in games, GENEVA lets users explore narrative paths and adapt stories to diverse contexts. It uses LLMs to generate and visualize branching narratives from high-level descriptions, representing them as graphs.
Integrating LLMs in video game development can create dynamic and interactive narratives. By involving players in the narrative design process, LLMs can generate unique player-driven strategies and provide valuable feedback for game designers.
In this issue: Skeleton Posterior-guided OpTimization (SPOT) exhibits potential in various causal discovery tasks; Using visual imagery for an EEG-based brain–computer interface; Developing human-centered AI systems to assist creative professionals.
Introducing Trace, Microsoft and Stanford University's novel AI optimization framework, now available as a Python library. Trace adapts dynamically and optimizes a wide range of applications from language models to robot control.
The competitive dynamics of AI agents and a method for learning and applying temporal action abstractions represent just some of Microsoft’s contributions to ICML 2024.
Advancing time series analysis with multi-granularity guided diffusion model; An algorithm-system co-design for fast, scalable MoE inference; What makes a search metric successful in large-scale settings; Learning to solve PDEs without simulated data.
Meet our community of researchers, learn about exciting research topics, and grow your network
Ongoing conversations at the cutting edge of research
Join us for a continuous exchange of ideas about research in the era of general AI