Microsoft Research Blog

Optimistic Actor Critic avoids the pitfalls of greedy exploration in reinforcement learning

November 26, 2019 | Kamil Ciosek

One of the core directions of Project Malmo is to develop AI capable of rich interactions. Whether that means learning new skills to apply to challenging problems, understanding complex environments, or knowing when to enlist the help of humans, reinforcement…

In the news | Nature

AI takes on popular Minecraft game in machine-learning contest

November 26, 2019

To see the divide between the best artificial intelligence and the mental capabilities of a seven-year-old child, look no further than the popular video game Minecraft. A young human can learn how to find a rare diamond in the game…

In the news | InfoQ

Microsoft Releases DialogGPT AI Conversation Model

November 26, 2019

Microsoft Research's Natural Language Processing Group released the dialogue generative pre-trained transformer (DialoGPT), a pre-trained deep-learning natural language processing (NLP) model for automatic conversation response generation. The model was trained on over 147M dialogues and achieves state-of-the-art results on several benchmarks.

In the news | InformationAge

How can organisations use data effectively, according to corporate VP at Microsoft Azure

November 26, 2019

The largest enterprises have amassed an incredible amount of data from a variety of sources. But the problem is identifying that data, accessing it, and using that most precious asset effectively.

Microsoft Research Blog

Icebreaker: New model with novel element-wise information acquisition method reduces cost and data needed to train machine learning models

November 25, 2019 | Cheng Zhang and Sebastian Tschiatschek

In many real-life scenarios, obtaining information is costly, and getting fully observed data is almost impossible. For example, in the recruiting world, obtaining relevant information (in other words, a feature value) for a company could mean performing time-consuming interviews. The…

Microsoft Research Blog

Logarithmic mapping allows for low discount factors by creating action gaps similar in size

November 21, 2019 | Harm van Seijen, Mehdi Fatemi, and Arash Tavakoli

While reinforcement learning (RL) has seen significant successes over the past few years, modern deep RL methods are often criticized for how sensitive they are with respect to their hyper-parameters. One such hyper-parameter is the discount factor, which controls how…

Microsoft Research Podcast

Program synthesis and the art of programming by intent with Dr. Sumit Gulwani

November 20, 2019

Dr. Sumit Gulwani is a programmer’s programmer. Literally. A Partner Research Manager in the Program Synthesis, or PROSE, group at Microsoft Research, Dr. Gulwani is a leading researcher in program synthesis and the inventor of many intent-understanding, programming-by-example and programming-by-natural…

In the news | Search Engine Journal

Bing is Now Utilizing BERT at a Larger Scale Than Google

November 19, 2019

Bing revealed today that it has been using BERT in search results before Google, and it’s also being used at a larger scale. Google’s use of BERT in search results is currently affecting 10% of search results in the US,…

In the news | Search Engine Land

Bing says it has been applying BERT since April

November 19, 2019

Bing has been using BERT to improve the quality of search results since April, Microsoft has stated. The transformer models are now applied to every Bing query globally.