Deep Reinforcement Learning for Games


News & features



With over two billion players in the world, AI is poised to transform the landscape of gaming experiences and the games industry itself. Microsoft’s vision for gaming is a world where players are empowered to play the games they want, with the people they want, whenever they want, wherever they are, and on any device. As part of the Machine Intelligence theme, and in close collaboration with the Xbox Gaming division, we drive towards this transformation through world-leading deep reinforcement learning research.



Project Malmo

Our research on multi-agent learning aims to develop intelligent agents that can collaborate with people, in applications ranging from video games to assistive technology. As we endeavour to unravel the principles of multi-agent learning and collaboration, our research is facilitated by Project Malmo, our open-source experimentation platform built on the game Minecraft.


Project Paidia

The focus of Project Paidia is to drive state-of-the-art research in reinforcement learning to enable novel applications in modern video games, in particular agents that learn to collaborate with human players.


Featured collaboration


MSR PI: Katja Hofmann
University of Oxford PI: Shimon Whiteson
Joint Postdoctoral Researcher: Mingfei Sun

Reinforcement Learning for Gaming


Mingfei Sun

This project will focus on developing and analysing state-of-the-art reinforcement learning (RL) methods for application to video games. The project aims to tackle two key challenges. First, building effective game AI with RL requires dramatically scaling up existing tools for cooperative multi-agent RL, in which teams of agents must collaborate to complete tasks. Doing so requires new methods for performing multi-agent credit assignment and multi-agent exploration in large state and action spaces. Second, effective game AI must also be able to transfer effectively to new scenarios, such as new game levels and versions, without having to learn from scratch. Doing so requires new methods for transfer and meta-learning in RL that scale to the complexity of modern video games.

Industry collaborators

Ninja Theory logo

Ninja Theory was formed in 2004 by four partners, including current Directors Nina Kristensen (Chief Development Director), Tameem Antoniades (Chief Creative Director) and Jez San OBE (Non-Executive Director). The studio pride themselves on striving for the highest production values and continually pushing the boundaries of technology, art and design to create ever more exciting video game experiences.

Find out more about our collaboration with Ninja Theory on the Project Paidia page >

IGGI logo

Industry Partner and Advisory Board Member of the IGGI Centre for Doctoral Training


Academic Collaborations

Learning to Collaborate with Human Players
Katja Hofmann (MSR Cambridge), Sam Devlin (MSR Cambridge), Kamil Ciosek (MSR Cambridge), Professor Anca Dragan (BAIR), Micah Carroll (PhD student)

Find out more on our Berkeley AI Research collaboration page >

Malmo 2020 Multi-Agent Upgrade
Diego Perez Liebana
Queen Mary University of London
Microsoft’s Project Malmo platform enables users to create worlds and learning agents able to play multiple 3D games within Minecraft. In recent years, we have co-organised two international competitions: the first on multi-agent learning, and the second on sample-efficient reinforcement learning with human priors. These competitions have extended the features of the platform, but each introduced its own API, installation instructions and documentation, which has created an unnecessary barrier to researchers wanting to get started with the platform. The objective of this project is to unify the extensions from both competitions back into the original Malmo benchmark, to provide a common entry point for researchers.

Sponsored PhDs

Reinforcement Learning for Enabling Next Generation Human-Machine Partnerships
Max Planck Institute for Software Systems
MSR Supervisor: Sam Devlin
External Supervisor: Adish Singla

Local Forward Model Learning for Sample-Efficient Sequential Decision Making in Open-World 3D Games
Queen Mary University of London
MSR Supervisor: Sam Devlin
External Supervisor: Diego Perez Liebana

Deep Reinforcement Learning For Collaborative Game AI To Enhance Player Experience
University of York
MSR Supervisor: Sam Devlin
External Supervisor: TBC

Better Sample Efficiency of Reinforcement Learning
University of Edinburgh
MSR Supervisor: Kamil Ciosek
External Supervisor: Amos Storkey

Reinforcement Learning for Adaptive User Interaction
University of Oxford
MSR Supervisor: Katja Hofmann
External Supervisor: Shimon Whiteson

Intrinsically Motivated Exploration for Lifelong Deep Reinforcement Learning of Multiple Tasks
MSR Supervisor: Katja Hofmann
External Supervisor: Pierre-Yves Oudeyer

Talks & Workshops