On Intrinsic Rewards and Continual Learning

  • Satinder Singh | University of Michigan, Ann Arbor

Continual learning is the problem faced by intelligent agents such as people and other animals: learning increasingly complex skills and knowledge over time from experience, and becoming increasingly competent regardless of the environment. In this talk I will break down the research needed to accomplish continual learning into multiple components and then present some results from my research group on a couple of those components. In the main part of the talk I will describe the optimal rewards framework and algorithms for learning rewards that help planning and learning agents, followed by some theoretical results on the repeated inverse reinforcement learning problem. Time permitting, I will briefly describe a DeepRL architecture that learns predictions that allow planning without the ability or the need to make the usual observation predictions made by traditional planning models.

Speaker Details

Satinder Singh is a Professor of Computer Science and Engineering at the University of Michigan, Ann Arbor, and the CTO and Chief Scientist of Cogitai, Inc. He has been the Chief Scientist at Syntek Capital, a venture capital company, a Principal Research Scientist at AT&T Labs, an Assistant Professor of Computer Science at the University of Colorado, Boulder, and a Postdoctoral Fellow at MIT’s Brain and Cognitive Science department. His research focuses on developing the theory, algorithms, and practice of building artificial agents that can learn from interaction in complex, dynamic, and uncertain environments, including environments with other agents in them. His main contributions have been to the areas of reinforcement learning and multi-agent learning, and more recently to applications in cognitive science and healthcare. He is a Fellow of the AAAI (Association for the Advancement of Artificial Intelligence), has coauthored more than 150 refereed papers in journals and conferences, and has served on many program committees. He was Program Co-Chair of AAAI 2017, and in 2013 helped cofound RLDM (Reinforcement Learning and Decision Making), a biennial multidisciplinary meeting that brings together computer scientists, psychologists, neuroscientists, roboticists, control theorists, and others interested in animal and artificial decision making.

Series: Microsoft Research Talks