Policy Optimization as Predictable Online Learning Problems: Imitation Learning and Beyond
- Ching-An Cheng | Institute for Robotics and Intelligent Machines, Georgia Tech
Efficient policy optimization is fundamental to solving real-world reinforcement learning problems, where agent-environment interactions can be costly. In this talk, I will discuss my recent research on improving the efficiency of policy optimization from the perspective of online learning. The use of online learning to analyze policy optimization was pioneered by Ross et al., who proposed reducing imitation learning to adversarial online learning problems. However, as I will discuss, this reduction loses information: the policy optimization problem is not truly adversarial but is instead predictable from past information. Based on this observation, I will present conditions for the last-iterate convergence of value aggregation for imitation learning. Furthermore, I will show how this predictable information can be leveraged to design better algorithms that speed up imitation learning and reinforcement learning.
View presentation slides here: https://www.microsoft.com/en-us/research/wp-content/uploads/2018/11/Imitation-Learning-and-Beyond-SLIDES.pdf
Speaker Details
Ching-An Cheng is a Robotics PhD student advised by Byron Boots at the Institute for Robotics and Intelligent Machines, Georgia Tech. His research lies at the intersection of machine learning, optimization, and control theory. He is interested in developing theoretical foundations for efficient and principled robot learning. His current focus is on sample efficiency, structural properties, and uncertainty in robot learning. Specific topics include reinforcement learning, imitation learning, online learning, meta learning, large-scale Gaussian processes, and integrated motion planning and control. He received the Best Paper Award at AISTATS 2018 and was a finalist for the Best Systems Paper Award at RSS 2018.
Andrey Kolobov
Principal Research Manager
Series: Microsoft Research Talks
Decoding the Human Brain – A Neurosurgeon’s Experience
- Dr. Pascal O. Zinn
Challenges in Evolving a Successful Database Product (SQL Server) to a Cloud Service (SQL Azure)
- Hanuma Kodavalla, Phil Bernstein
Improving text prediction accuracy using neurophysiology
- Sophia Mehdizadeh
Tongue-Gesture Recognition in Head-Mounted Displays
- Tan Gemicioglu
DIABLo: a Deep Individual-Agnostic Binaural Localizer
- Shoken Kaneko
Audio-based Toxic Language Detection
- Midia Yousefi
From SqueezeNet to SqueezeBERT: Developing Efficient Deep Neural Networks
- Forrest Iandola, Sujeeth Bharadwaj
Hope Speech and Help Speech: Surfacing Positivity Amidst Hate
- Ashique Khudabukhsh
Towards Mainstream Brain-Computer Interfaces (BCIs)
- Brendan Allison
Learning Structured Models for Safe Robot Control
- Subramanian Ramamoorthy