Microsoft Research Blog

Artificial intelligence

  1. UFO: A UI-Focused Agent for Windows OS Interaction 

    April 1, 2025

    We introduce UFO, an innovative UI-Focused agent to fulfill user requests tailored to applications on Windows OS, harnessing the capabilities of GPT-Vision. UFO employs a dual-agent framework to meticulously observe and analyze the graphical user interface (GUI) and control information of Windows applications. This enables…

  2. TeCoFeS: Text Column Featurization using Semantic Analysis 

    April 1, 2025

    Extracting insights from text columns can be challenging and time-intensive. Existing methods for topic modeling and feature extraction are based on syntactic features and often overlook the semantics. We introduce the semantic text column featurization problem, and present a scalable approach for automatically solving it.…

  3. Are We On Track? AI-Assisted Active and Passive Goal Reflection During Meetings

    April 1, 2025

    Meetings often suffer from a lack of intentionality, such as unclear goals and straying off-topic. Identifying goals and maintaining their clarity throughout a meeting is challenging, as discussions and uncertainties evolve. Yet meeting technologies predominantly fail to support meeting intentionality. AI-assisted reflection is a promising…

  4. OG-RAG: Ontology-Grounded Retrieval-Augmented Generation For Large Language Models 

    April 1, 2025 | Peeyush Kumar and Kartik Sharma

    This paper presents OG-RAG, an Ontology-Grounded Retrieval Augmented Generation method designed to enhance LLM-generated responses by anchoring retrieval processes in domain-specific ontologies. While LLMs are widely used for tasks like question answering and search, they struggle to adapt to specialized knowledge, such as industrial workflows…

  5. Execution-guided within-prompt search for programming-by-example 

    April 1, 2025

    Large language models (LLMs) can generate code from examples without being limited to a DSL, but they lack search, as sampled programs are independent. In this paper, we use an LLM as a policy that generates lines of code and then join these lines of…

  6. Inference-Time Scaling for Complex Tasks: Where We Stand and What Lies Ahead 

    March 31, 2025

    Inference-time scaling can enhance the reasoning capabilities of large language models (LLMs) on complex problems that benefit from step-by-step problem solving. Although lengthening generated scratchpads has proven effective for mathematical tasks, the broader impact of this approach on other tasks remains less clear. In this…

  7. Evidence Aggregator: AI reasoning applied to rare disease diagnostics 

    March 13, 2025

    Retrieving, reviewing, and synthesizing technical information can be time-consuming and challenging, particularly when requiring specialized expertise, as is the case of variant assessment for rare disease diagnostics. To address this challenge, we developed the Evidence Aggregator (EvAgg), a generative AI tool designed for rare disease…

  8. Societal AI: Research Challenges and Opportunities 

    March 1, 2025 | Beibei Shi, Haotian Li, Xing Xie, and Societal AI Team

    Artificial intelligence is reshaping society at an unprecedented scale, influencing key domains such as education, labor, governance, and scientific discovery. As AI models, particularly large language models, become more capable and autonomous, their societal impact raises urgent questions regarding fairness, interpretability, alignment with human values,…

  9. What Makes a Good Diffusion Planner for Decision Making? 

    March 1, 2025 | Haofei Lu, Dongqi Han, Yifei Shen, and Dongsheng Li

    Diffusion models have recently shown significant potential in solving decision-making problems, particularly in generating behavior plans, an approach known as diffusion planning. While numerous studies have demonstrated the impressive performance of diffusion planning, the mechanisms behind the key components of a good diffusion planner remain…

  10. The future of the industrial AI edge is cellular 

    February 26, 2025 | Xenofon Foukas and Bozidar Radunovic

    Reliable, high-bandwidth wireless connectivity and local processing at the edge are crucial enablers for emerging industrial AI applications. In this work, we argue that recent trends in cellular networking make the technology the ideal connectivity solution for these applications, due to its…