Online Learning and Bandits – Part 1

August 10, 2015
Aditya Gopalan | IISc
MSR India Summer School 2015 on Machine Learning

The ability to make continual, accurate decisions based on evolving data is key in many of today’s data-driven intelligent systems. This tutorial-style talk presents an introduction to the modern study of sequential learning and decision making under uncertainty. The broad objective is to cover modeling frameworks for online prediction and learning, explore algorithms for decision making, and gain an understanding of their performance. Specifically, we will look at multi-armed bandits (models of decision making that capture the explore-vs-exploit tradeoff in learning), regret minimization, non-stochastic or adversarial online learning, and online convex optimization. Time permitting, we will discuss new directions and frontiers in the area of sequential decision making.

- Jeff Running
Research Lab
- Microsoft Research Lab - India
