Microsoft Research Blog

  1. Text2Arch: A Dataset for Generating Scientific Architecture Diagrams from Natural Language Descriptions 

    February 1, 2026 | Shivank Garg, Sankalp Mittal, and Manish Gupta

    Communicating complex system designs or scientific processes through text alone is inefficient and prone to ambiguity. A system that automatically generates scientific architecture diagrams from text with high semantic fidelity can be useful in multiple applications like enterprise architecture visualization, AI-driven software design, and educational…

  2. TrustGen: A Platform of Dynamic Benchmarking on the Trustworthiness of Generative Foundation Models 

    February 1, 2026 | TrustGen Team and Jianfeng Gao

    Generative foundation models (GenFMs), such as large language models and text-to-image systems, have demonstrated remarkable capabilities in various downstream applications. As they are increasingly deployed in high-stakes applications, assessing their trustworthiness has become both a critical necessity and a substantial challenge. Existing evaluation efforts are…

  3. Synergizing Understanding and Generation with Interleaved Analyzing-Drafting Thinking 

    February 1, 2026

    Unified Vision–Language Models (UVLMs) aim to advance multimodal learning by supporting both understanding and generation within a single framework. However, existing approaches largely focus on architectural unification while overlooking the need for explicit interaction between the two capabilities during task solving. As a result, current…

  4. Aurelius: Relation Aware Text-to-Audio Generation At Scale 

    February 1, 2026

    We present Aurelius, a new framework that enables relation aware text-to-audio (TTA) generation research at scale. Given the lack of essential audio event and relation corpora, Aurelius contributes a large-scale audio event corpus AudioEventSet and another large-scale relation corpus AudioRelSet. Comprising 110 event categories, AudioEventSet…

  5. RedCodeAgent: Automatic Red-teaming Agent against Diverse Code Agents 

    February 1, 2026

    Code agents have gained widespread adoption due to their strong code generation capabilities and integration with code interpreters, enabling dynamic execution, debugging, and interactive programming capabilities. While these advancements have streamlined complex workflows, they have also introduced critical safety and security risks. Current static safety…

  6. Score Distillation Beyond Acceleration: Generative Modeling from Corrupted Data 

    February 1, 2026

    Learning generative models directly from corrupted observations is a long-standing challenge across natural and scientific domains. We introduce *Distillation from Corrupted Data (DCD)*, a unified framework for learning high-fidelity, one-step generative models using **only** degraded data of the form where the mapping may be the…

  7. OrbitalBrain: A Distributed Framework For Training ML Models in Space 

    February 1, 2026

    Earth observation nanosatellites capture high-resolution photos of the Earth in near real-time. These images increasingly support ML applications that are critical for safety and response, such as forest fire and flood detection. However, the downlink bandwidth is limited, resulting in days or weeks of delay…

  8. Core-Set Selection for Data-efficient Land Cover Segmentation 

    January 30, 2026

    The increasing accessibility of remotely sensed data and the potential of such data to inform large-scale decision-making has driven the development of deep learning models for many Earth Observation tasks. Traditionally, such models must be trained on large datasets. However, the common assumption that broadly…