Microsoft Research Blog

  1. A figure that illustrates the concept of the version space in a bandit example. It is a 2D plot where the x-axis denotes actions and the y-axis denotes reward. Dots show sampled reward values for different actions, and curves show different hypotheses about how reward depends on action. The hypotheses that are consistent with the observed data form the version space.

    A game-theoretic approach to provably correct and scalable offline RL 

    September 7, 2022 | Ching-An Cheng, Tengyang Xie, and Nan Jiang

    Despite the increasingly widespread use of machine learning (ML) in all aspects of our lives, a broad class of scenarios still relies on automation designed by people, not artificial intelligence (AI). In real-world applications that involve making sequences of decisions with long-term consequences, from allocating beds…

  2. A montage of four animated figures completing humanoid actions: standing up, walking, running, and jumping.

    MoCapAct: Training humanoid robots to “Move Like Jagger” 

    August 25, 2022

    What would it take to get humanoid, bipedal robots to dance like Mick Jagger? Indeed, for something more mundane, what does it take to get them to simply stand still? Sit down? Walk? Move in myriads of other ways many people take for granted? Bipedalism…

  3. Confidential Containers: Verifiably secure computation in the cloud

    July 18, 2022 | Sean T. Allen

    For many organizations, trusting their data to the cloud requires having a complete understanding of and control over the environment in which that data resides and how it’s being processed. Microsoft understands this, and we are committed to building a trustworthy cloud—one in which security,…

  4. Diagram showing GODEL’s architecture. The environment of the dialog system consists of both structured and unstructured content, which it uses to retrieve information. This source content, which we term “grounding,” is updated and repeatedly used by GODEL to produce a new response after each user input.

    GODEL: Combining goal-oriented dialog with real-world conversations 

    June 23, 2022

    They make restaurant recommendations, help us pay bills, and remind us of appointments. Many people have come to rely on virtual assistants and chatbots to perform a wide range of routine tasks. But what if a single dialog agent, the technology behind these language-based apps,…