Microsoft Research Blog

Artificial intelligence

  1. DiSK: A Diffusion Model for Structured Knowledge 

    December 8, 2023 | Ouail Kitouni, Niklas Nolte, James Hensman, and Bhaskar Mitra

    Structured (dictionary-like) data presents challenges for left-to-right language models, which can struggle with structured entities for a variety of reasons, such as formatting and sensitivity to the order in which attributes are presented. Tabular generative models suffer from a different set of limitations…

  2. Axiomatic Preference Modeling for Longform Question Answering 

    December 6, 2023

        The remarkable abilities of large language models (LLMs) like GPT-4 partially stem from post-training processes like Reinforcement Learning from Human Feedback (RLHF) involving human preferences encoded in a reward model. However, these reward models (RMs) often lack direct knowledge of why, or under…

  3. MatterGen: a generative model for inorganic materials design 

    December 6, 2023

    The design of functional materials with desired properties is essential in driving technological advances in areas like energy storage, catalysis, and carbon capture. Generative models provide a new paradigm for materials design by directly generating entirely novel materials given desired property constraints. Despite recent progress,…

  4. Promoting Topic Coherence and Inter-Document Consorts in Multi-Document Summarization via Simplicial Complex and Sheaf Graph 

    December 1, 2023 | Yash Kumar Atri, Arun Iyer, Tanmoy Chakraborty, and Vikram Goyal

    Multi-document Summarization (MDS) involves compressing information from multiple source documents into a succinct summary. An ideal summary should encompass all topics and accurately model cross-document relations expounded upon in the source documents. However, existing systems either impose constraints on the length of tokens during the…

  5. TaskWeaver: A Code-First Agent Framework 

    December 1, 2023

    Large Language Models (LLMs) have shown impressive abilities in natural language understanding and generation, leading to their use in applications such as chatbots and virtual assistants. However, existing LLM frameworks face limitations in handling domain-specific data analytics tasks with rich data structures. Moreover, they struggle…

  6. Training Private and Efficient Language Models with Synthetic Data from LLMs 

    December 1, 2023

    Language models are pivotal in modern text-based applications, offering many productivity features like next-word prediction, smart composition, and summarization. In many applications, these models must be lightweight to meet inference time and computational cost requirements. Furthermore, due to the inherent sensitivity of their training data,…

  7. Calibrated Language Models Must Hallucinate 

    November 24, 2023 | Adam Tauman Kalai and Santosh S. Vempala

    Recent language models have a mysterious tendency to generate false but plausible-sounding text. Such "hallucinations" are an obstacle to the usability of language-based AI systems and can harm people who rely upon their outputs. This work shows that there is an inherent statistical reason…

  8. Positional Description Matters for Transformers Arithmetic 

    November 22, 2023

    Transformers, central to the successes in modern Natural Language Processing, often falter on arithmetic tasks despite their vast capabilities, which paradoxically include remarkable coding abilities. We observe that a crucial challenge is their naive reliance on positional information to solve arithmetic problems with a small…