The linear bandit problem


January 24, 2014


Sebastien Bubeck


Princeton University


The linear bandit problem is a far-reaching extension of the classical multi-armed bandit problem. In the recent years linear bandits have emerged as a core problem of sequential decision making, somewhat analogously to what happened with linear programming in optimization or linear regression in statistics. Despite its importance we still do not have a complete picture for this problem: in some cases we have optimal strategies (from an information theoretic point of view) but they are algorithmically intractable, while in other cases we even lack information optimal strategies. In this talk I will describe precisely the situation where we stand and the contributions I made to this problem.


Sebastien Bubeck

Sebastien Bubeck is an assistant professor in the department of Operations Research and Financial Engineering at Princeton University. He joined Princeton after a postdoc at the Centre de Recerca Matematica in Barcelona, where he was working with Gabor Lugosi. He received his Ph.D. in mathematics from the University of Lille 1, advised by Remi Munos, after undergraduate studies at the Ecole Normale Superieure de Cachan. His research focuses on the mathematics of machine learning, with emphasis on problems related to multi-armed bandits. His work was recognized by several awards, such as the COLT 2009 best student paper award, and the Jacques Neveu prize 2010 for the best French Ph.D. in Probability/Statistics.