

Harm van Seijen
Principal Research Manager
About
I lead the Reinforcement Learning team at Microsoft Research Montréal, which focuses on fundamental challenges in reinforcement learning. Within the field, I am currently particularly interested in transfer learning, continual learning, hierarchical approaches, and multi-agent systems.
In our most recent project, we developed the hybrid reward architecture, an approach that breaks a complex task down into many smaller ones. Using this architecture, we achieved the highest possible score of 999,990 points on the challenging Atari 2600 game Ms. Pac-Man.
I did my PhD at the University of Amsterdam, under the supervision of Frans Groen and Shimon Whiteson. My thesis topic was “Reinforcement learning under space and time constraints”. After my PhD, I worked for 4 years as a postdoc in the RLAI group…
Featured content

Hybrid Reward Architecture and the Fall of Ms. Pac-Man with Dr. Harm van Seijen
Episode 3, December 6, 2017 - If you’ve ever watched The King of Kong: A Fistful of Quarters, you know what a big deal it is to beat a video arcade game that was designed not to lose. Most humans can’t even come close. Enter Harm van Seijen and a team of machine learning researchers from Microsoft Research Montréal. They took on Ms. Pac-Man. And won. Today we’ll talk to Harm about his work in reinforcement learning and the inspiration for the hybrid reward architecture, visit a few islands of tractability, and get an inside look at the science behind the AI defeat of one of the most difficult video arcade games around. To find out more about Harm van Seijen and the groundbreaking work going on at Microsoft Research Montréal, visit Microsoft.com/research.