I am the team lead of the Reinforcement Learning team at Microsoft Research Maluuba, which focuses on fundamental challenges in reinforcement learning. Within reinforcement learning, I am currently particularly interested in transfer learning, continual learning, hierarchical approaches, and multi-agent systems.
In our most recent project, we developed the hybrid reward architecture, an approach that breaks down a complex task into many smaller ones. Using this architecture, we achieved the highest possible score of 999,990 points on the challenging Atari 2600 game Ms. Pac-Man.
I did my PhD at the University of Amsterdam, under the supervision of Frans Groen and Shimon Whiteson. My thesis topic was “Reinforcement learning under space and time constraints”. After my PhD, I spent four years as a postdoc in the RLAI group at the University of Alberta, working with Richard Sutton on novel reinforcement-learning methods.