Online Learning and Bandits – Part 1

Date

August 10, 2015

Speaker

Aditya Gopalan

Affiliation

IISc

Overview

The ability to make continual, accurate decisions based on evolving data is key in many of today’s data-driven intelligent systems. This tutorial-style talk presents an introduction to the modern study of sequential learning and decision making under uncertainty. The broad objective is to cover modeling frameworks for online prediction and learning, explore algorithms for decision making, and gain an understanding of their performance. Specifically, we will look at multi-armed bandits- models of decision making that capture the explore-vs-exploit tradeoff in learning, regret minimization, non-stochastic or adversarial online learning, and online convex optimization. Time permitting, we will discuss new directions and frontiers in the area of sequential decision making.