This event has now concluded.
Thursday, January 14, 2021
| Time (EST) | Session | Speaker | |
| 10:00 AM-10:15 AM | Welcome Remarks | Akshay Krishnamurthy, Microsoft Research | |
| 10:15 AM-11:00 AM | New Advances in Hierarchical Reinforcement Learning | Doina Precup, McGill University | |
| 11:00 AM-11:45 AM | Reinforcement Learning Debate: The State of RL and The Theory-Practice Divide | John Langford, Microsoft Research; Yoshua Bengio, Mila (Quebec AI Institute) | |
| 11:45 AM-12:15 PM | Break | ||
| 12:15 PM-1:45 PM | Virtual Poster Presentations | ||
| | Instance-Dependent Complexity of Contextual Bandits and Reinforcement Learning: A Disagreement-Based Perspective | Yunzong Xu, MIT | |
| | Taylor Expansion Policy Optimization | Yunhao Tang, Columbia University | |
| | Provably Efficient Policy Optimization with Thompson Sampling | Haque Ishfaq, McGill University | |
| | Active Imitation Learning with Noisy Guidance | Kianté Brantley, University of Maryland | |
| | Finite-Time Analysis of Decentralized Stochastic Approximation with Applications in Multi-Agent and Multi-Task Learning | Sihan Zeng, Georgia Tech | |
| | Meta-Q-Learning | Rasool Fakoor, Amazon Web Services | |
| | Toward the Fundamental Limits of Imitation Learning | Nived Rajaraman, UC Berkeley | |
| | Multitask Bandit Learning through Heterogeneous Feedback Aggregation | Zhi Wang, UC San Diego | |
| | “It’s Unwieldy and it Takes a Lot of Time.” Challenges and Opportunities for Creating Agents in Commercial Games | Mikhail Jacob, Microsoft Research, Cambridge UK | |
| | A Framework for Robust Learning and Control of Nonlinear Systems with Large Uncertainty | Hoang Le, Microsoft Research, Redmond | |
| | Learning Dynamic Belief Graphs to Generalize on Text-Based Games | Eric Yuan, Microsoft Research, Montreal | |
| | Frugal Optimization for Cost-Related Hyperparameters | Qingyun Wu, Microsoft Research, NYC | |
| | Image Augmentation Is All You Need: Regularizing Deep Reinforcement Learning from Pixels | Denis Yarats, New York University | |
| | Self-Supervised Policy Adaptation During Deployment | Nicklas Hansen, Technical University of Denmark | |
| | Multi-Task Reinforcement Learning with Soft Modularization | Ruihan Yang, UC San Diego | |
| | Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning | Rishabh Agarwal, Google Research, and Mila Research | |
| | A Regret Minimization Approach to Iterative Learning Control | Karan Singh, Princeton University | |
| | RMP2: A Differentiable Policy Class for Robotic Systems with Control-Theoretic Guarantees | Anqi Li, University of Washington | |
| | Generating Adversarial Disturbances for Controller Verification | Udaya Ghai, Princeton University | |