Research Activities
-
- 2014 – Present: Researcher, Microsoft Research, Theory Group (Redmond, WA, USA).
- 2011 – 2014: Assistant Professor, Princeton University, Department of Operations Research and Financial Engineering (Princeton, NJ, USA).
- Fall 2013: Visiting Scientist, Simons Institute, UC Berkeley (Berkeley, CA, USA).
- 2010 – 2011: Postdoc, Centre de Recerca Matemàtica (Barcelona, Spain).
- 2007 – 2010: Ph.D. student (speciality: Applied Mathematics), INRIA Nord Europe (Lille, France).
- 2008 – 2010: Teaching assistant at the University of Lille 1 (Lille, France).
- July-August 2006: RIPS student (Research in Industrial Projects for Students), Institute for Pure and Applied Mathematics, UCLA (Los Angeles, CA, USA).
- 2005 – 2008: Student at the Ecole Normale Supérieure de Cachan (Cachan, France).
For a more detailed curriculum vitae, see my resume.
-
- Outstanding Paper Award at NeurIPS 2021.
- Best Paper Award at COLT (Conference on Learning Theory) 2016.
- 2015 Alfred P. Sloan Research Fellow in Computer Science.
- Second prize for the best French Ph.D. in Artificial Intelligence (AI prize 2011).
- Jacques Neveu prize 2010 for the best French Ph.D. in Probability/Statistics.
- Second prize for the best French Ph.D. in Computer Science (Gilles Kahn prize 2010).
- Best Student Paper Award at COLT (Conference on Learning Theory) 2009.
-
- I’m a bandit. Random topics on optimization, probability, and statistics.
-
Convex Optimization: Algorithms and Complexity
S. Bubeck
In Foundations and Trends in Machine Learning, Vol. 8: No. 3-4, pp 231-357, 2015
[pdf] [Link to buy a book version]
Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems
S. Bubeck and N. Cesa-Bianchi
In Foundations and Trends in Machine Learning, Vol. 5: No. 1, pp 1-122, 2012
[pdf] [Link to buy a book version, discount code: MAL022024]
-
Interns in the Theory Group at Microsoft Research
- Fan Wei, graduate student at Stanford with Jacob Fox.
- Shirshendu Ganguly, graduate student at UW with Ioana Dumitriu and Christopher Hoffman.
- Ewain Gwynne, graduate student at MIT with Scott Sheffield.
- Yin-Tat Lee, graduate student at MIT with Jonathan Kelner.
- Miklos Racz, former graduate student at UC Berkeley with Elchanan Mossel, now postdoc in the Theory Group at MSR.
Undergraduate advising at Princeton University
- Horia Mania, graduate student at UC Berkeley with Ben Recht.
- Billy Fang, graduate student at UC Berkeley.
- Christian Fong, graduate student at Stanford.
- Tengyao Wang, graduate student at Cambridge with Richard Samworth.
Graduate students at Princeton University
- Che-Yu Liu, ORFE
-
2016
- Kernel-based method for bandit convex optimization [youtube].
- New Results at the Crossroads of Convexity, Learning and Information Theory [ENS video].
- Revisiting Nesterov’s Acceleration [IMA video].
2015
- Convex bandits (two lectures, one by myself and one by Ronen Eldan) [CIMI videos].
- Log-concave sampling with projected Langevin Monte Carlo [IMA video].
- Entropic barrier [IMA video] (short version: [COLT 2015 videolectures.net]).
2014
- Influence of the seed in uniform attachment [MSR video] (preliminary results on the influence of the seed in PA [youtube]).
- Linear bandits [MSR video].
- Most correlated arms identification [COLT 14 videolectures.net].
2013
- Multiple best arms identification and optimal discovery with probabilistic expert advice [MSR video]. Same talk a year before at ICML 2012: [techtalks video].
- 0th order stochastic Lipschitz optimization [youtube].
- Clique number in random geometric graph [youtube].
- Best of both worlds in bandits [NIPS 13 videolectures.net].
- Bounded regret in multi-armed bandits [COLT 13 videolectures.net].
2012
- Towards minimax policies for linear bandits [COLT 12 techtalks].
2011
- Combinatorial bandits [COLT 11 videolectures.net].
-
Tutorial on Bandits Games
This tutorial was presented at ALT 2011, ACML 2012, SIGMETRICS 2014, and MLSS Cadiz (2016).
Slides
Abstract
In recent years the multi-armed bandit problem has attracted a lot of attention in the theoretical learning community. This growing interest is a consequence of the large number of problems that can be modeled as a multi-armed bandit: ad placement, website optimization, packet routing, etc. Furthermore, the bandit methodology is also used as a building block for more complicated scenarios such as reinforcement learning, model selection in statistics, or computer game-playing. While the basic stochastic multi-armed bandit can be traced back to Thompson (1933) and Robbins (1952), it is only very recently that we obtained an (almost) complete understanding of this simple model. Moreover, many extensions of the original problem have been proposed in the past fifteen years, such as bandits without a stochastic assumption (the so-called adversarial model), or bandits with a very large (but structured) set of arms.
This tutorial will cover in detail the state of the art for the basic multi-armed bandit problem (both stochastic and adversarial), and the information-theoretic analysis of Bayesian bandit problems. We will also touch upon contextual bandits, as well as the case of a very large (possibly infinite) set of arms with linear/convex/Lipschitz losses.
Old version of the slides