Microsoft Research Blog

AutoAdapt: An Automated Domain Adaptation Framework for LLMs

March 1, 2026

Large language models (LLMs) excel in open domains but struggle in specialized settings with limited data and evolving knowledge. Existing domain adaptation practices rely heavily on manual trial-and-error processes, incur significant hyperparameter complexity, and are highly sensitive to data and user preferences, all under the…
StreamWise: Serving Multi-Modal Generation in Real-Time at Scale

March 1, 2026

Advances in multi-modal generative models are enabling new applications, from storytelling to automated media synthesis. Most current workloads generate simple outputs (e.g., image generation from a prompt) in batch mode, often requiring several seconds even for basic results. Serving real-time multi-modal workflows at scale is…
Can Vision Language Models Assess Graphic Design Aesthetics? A Benchmark, Evaluation, and Dataset Perspective

March 1, 2026

Assessing the aesthetic quality of graphic design is central to visual communication, yet remains underexplored in vision language models (VLMs). We investigate whether VLMs can evaluate design aesthetics in ways comparable to humans. Prior work faces three key limitations: benchmarks restricted to narrow principles and…
MSCCL++: Rethinking GPU Communication Abstractions for AI Inference

March 1, 2026

AI applications increasingly run on fast-evolving, heterogeneous hardware to maximize performance, but general-purpose libraries lag in supporting these features. Performance-minded programmers often build custom communication stacks that are fast but error-prone and non-portable. This paper introduces MSCCL++, a design methodology for developing high-performance, portable communication…
Memora: A Harmonic Memory Representation Balancing Abstraction and Specificity

March 1, 2026

Agent memory systems must accommodate continuously growing information while supporting efficient, context-aware retrieval for downstream tasks. Abstraction is essential for scaling agent memory, yet it often comes at the cost of specificity, obscuring the fine-grained details required for effective reasoning. We introduce Memora, a harmonic…
Texterial: A Text-as-Material Interaction Paradigm for LLM-Mediated Writing

February 28, 2026

What if text could be sculpted and refined like clay -- or cultivated and pruned like a plant? Texterial reimagines text as a material that users can grow, sculpt, and transform. Current generative-AI models enable rich text operations, yet rigid, linear interfaces often mask such…
KEEP: A KV-Cache-Centric Memory Management System for Efficient Embodied Planning

February 27, 2026

Memory-augmented Large Language Models (LLMs) have demonstrated remarkable capability for complex and long-horizon embodied planning. By keeping track of past experiences and environmental states, memory enables LLMs to maintain a global view, thereby avoiding repetitive exploration. However, existing approaches often store the memory as raw…
Reasoning-Driven Multimodal LLM for Domain Generalization

February 27, 2026

This paper addresses the domain generalization (DG) problem in deep learning. While most DG methods focus on enforcing visual feature invariance, we leverage the reasoning capability of multimodal large language models (MLLMs) and explore the potential of constructing reasoning chains that derive image categories to…
Multimodal Alignment Improves Generalizability of Genomic Biomarker Prediction in Computational Pathology

February 27, 2026

Computational pathology models that use digitized histopathology whole-slide images have the potential to become a cost-effective and scalable alternative to molecular assays for the prediction of genomic biomarkers, a key task in precision oncology. However, as new genomic biomarkers are discovered or quantified, large, labeled…
Hardware Realization and Implementation Security Evaluation of HQC, A NIST PQC Standard

February 26, 2026 | Sanjay Deshpande

Quantum computing is no longer a distant dream; its rapid progress is poised to revolutionize various fields from drug discovery to optimization. But this leap forward comes with a critical caveat: the pre-quantum public-key cryptographic algorithms that secure our digital infrastructure today, such as RSA…
GeoMind: A Multi-Agent Framework for Geospatial Decision Support

February 26, 2026 | Muhammad Sohail Danish

Rapid access to actionable geospatial insights is essential during disasters such as floods, wildfires, or earthquakes, where timely decisions can save lives and resources. In many scenarios, especially in low-resource settings or when GIS experts are not immediately available, policymakers, humanitarian responders, and other non-technical…
CORPGEN advances AI agents for real work

February 26, 2026

By mid-morning, a typical knowledge worker is already juggling a client report, a budget spreadsheet, a slide deck, and an email backlog, all interdependent and all demanding attention at once. For AI agents to be genuinely useful in that environment, they will need to operate…
