Microsoft Research Blog

Artificial intelligence

  1. Joint Prompt Optimization of Stacked LLMs using Variational Inference 

    October 1, 2023

    We view large language models (LLMs) as stochastic language layers in a network, where the learnable parameters are the natural language prompts at each layer. We stack two such layers, feeding the output of one layer to the next. We call the stacked architecture a…
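The stacking described above can be sketched in a few lines. This is a minimal illustration, not the paper's method: `llm` is a hypothetical stand-in for a stochastic model call, and the prompts `p1` and `p2` play the role of the learnable natural-language parameters at each layer.

```python
def llm(prompt: str, x: str) -> str:
    """Stand-in for a stochastic LLM call; a deterministic stub here."""
    return f"[{prompt}] {x}"

def stacked_forward(x: str, p1: str, p2: str) -> str:
    """Two stacked language layers; p1 and p2 are the learnable prompts."""
    h = llm(p1, x)      # layer 1 output, itself natural language
    return llm(p2, h)   # layer 2 consumes layer 1's output

print(stacked_forward("2 + 2 = ?", "Think step by step.", "Answer concisely."))
```

The key idea the sketch captures is that the intermediate "activation" `h` is text, so optimizing the network means optimizing the prompts rather than numeric weights.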

  2. Grounded Copilot: How Programmers Interact with Code-Generating Models 

    October 1, 2023 | Shraddha Barke, Michael James, and Nadia Polikarpova

    Powered by recent advances in code-generating models, AI assistants like GitHub Copilot promise to change the face of programming forever. But what is this new face of programming? We present the first grounded theory analysis of how programmers interact with Copilot, based on observing 20…

  3. ColDeco: An End User Spreadsheet Inspection Tool for AI-Generated Code 

    October 1, 2023

    Code-generating large language models (LLMs) are transforming programming. Their capability to generate multi-step solutions provides even non-programmers a mechanism to harness the power of programming. Non-programmers typically use spreadsheets to manage tabular data, as they offer an intuitive view of data manipulation and formula outcomes.…

  4. EmFore: Online Learning of Email Folder Classification Rules 

    October 1, 2023

    Modern email clients support predicate-based folder assignment rules that can automatically organize emails. Unfortunately, users still need to write these rules manually. Prior machine learning approaches have framed automatically assigning email to folders as a classification task and do not produce symbolic rules. Prior inductive…
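A predicate-based folder rule of the kind described above can be sketched as follows. This is an illustrative example only, not EmFore's representation; the `Email` fields, predicate, and folder name are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Email:
    sender: str
    subject: str

def assign_folder(email: Email) -> Optional[str]:
    # One symbolic rule: a predicate over email fields maps to a folder.
    if email.sender.endswith("@github.com") and "PR" in email.subject:
        return "Code Reviews"
    return None

print(assign_folder(Email("notifications@github.com", "PR #42 ready")))
```

Unlike an opaque classifier, a rule in this form is a symbolic predicate the user can read, edit, or delete, which is the property the entry contrasts with prior machine-learning approaches.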

  5. A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models 

    September 20, 2023 | Haoran Xu, Young Jin Kim, Amr Sharaf, and Hany Hassan Awadalla

    Generative Large Language Models (LLMs) have achieved remarkable advancements in various NLP tasks. However, these advances have not been reflected in the translation task, especially for models of moderate size (i.e., 7B or 13B parameters), which still lag behind conventional supervised encoder-decoder translation models. Previous…

  6. Data Formulator: AI-powered Concept-driven Visualization Authoring 

    September 18, 2023 | Chenglong Wang, John Thompson, and Bongshin Lee

    With most modern visualization tools, authors need to transform their data into tidy formats to create the visualizations they want. Because this requires experience with programming or separate data processing tools, data transformation remains a barrier in visualization authoring. To address this challenge, we present a…
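The "tidy format" transformation the entry refers to can be illustrated with plain Python. This sketch is not Data Formulator's approach; it only shows the kind of wide-to-long reshaping that visualization tools typically require, with made-up column names.

```python
# "Wide" table: one column per year, as spreadsheets often store data.
wide = {"city": ["A", "B"], "2020": [10, 20], "2021": [30, 40]}

# Tidy, long-format table: one row per (city, year, value) observation,
# which is the shape most charting libraries expect.
tidy = [
    {"city": city, "year": year, "value": value}
    for year in ("2020", "2021")
    for city, value in zip(wide["city"], wide[year])
]

print(tidy[0])  # -> {'city': 'A', 'year': '2020', 'value': 10}
```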

  7. Textbooks Are All You Need II: phi-1.5 technical report 

    September 11, 2023

    We continue the investigation into the power of smaller Transformer-based language models as initiated by TinyStories -- a 10 million parameter model that can produce coherent English -- and the follow-up work on phi-1, a 1.3 billion parameter model with Python coding performance close to…