Microsoft Research Blog

  1. Copilot Tuning Research

    October 1, 2025

    The mission of the Copilot Tuning Research and Applied Science teams is to ensure that AI can be easily adapted and tuned to enable critical, high-value applications, including entirely new productivity tools. We are creating algorithms, architectures, data, and eval pipelines to adapt LLMs, reasoning…

  2. Detecting Compromise of Passkey Storage on the Cloud 

    October 1, 2025 | Mazharul Islam

    FIDO synced passkeys address account recovery challenges by enabling users to back up their FIDO2 private signing keys to the cloud storage of passkey management services (PMS). However, this introduces a serious security risk: attackers can steal users' passkeys through breaches of PMS's cloud…

  3. Leveraging Large Language Models to Generate Multiple-Choice Questions for Ophthalmology Education 

    October 1, 2025

    Importance: Multiple choice questions (MCQs) are an important and integral component of ophthalmology residency training evaluation and board certification; however, high-quality questions are difficult and time-consuming to draft. Objective: To evaluate whether general-domain large language models (LLMs), particularly OpenAI’s Generative Pre-trained Transformer 4 (GPT-4), can reliably generate…

  4. Value Gradient Guidance for Flow Matching Alignment 

    October 1, 2025

    While methods exist for aligning flow matching models, a popular and effective class of generative models, with human preferences, existing approaches fail to achieve both adaptation efficiency and probabilistically sound prior preservation. In this work, we leverage the theory of optimal control and…

  5. Rationalized All-Atom Protein Design with Unified Multi-modal Bayesian Flow 

    October 1, 2025

    Designing functional proteins is a critical yet challenging problem due to the intricate interplay between backbone structures, sequences, and side-chains. Current approaches often decompose protein design into separate tasks, which can lead to accumulated errors, while recent efforts increasingly focus on all-atom protein design. However,…

  6. On the Hardness of Conditional Independence Testing In Practice 

    October 1, 2025

    Tests of conditional independence (CI) underpin a number of important problems in machine learning and statistics, from causal discovery to evaluation of predictor fairness and out-of-distribution robustness. Shah and Peters (2020) showed that, contrary to the unconditional case, no universally finite-sample valid test can ever…

  7. Disarming Strategic Text: Span-Aware Counterfactuals for Robust Content Moderation 

    October 1, 2025

    Machine learning systems deployed in the wild must operate reliably despite unreliable inputs, whether arising from distribution shifts, adversarial manipulation, or strategic behavior by users. Content moderation is a prime example: violators deliberately exploit euphemisms, obfuscations, or benign co-occurrence patterns to evade detection, creating unreliable…