Sébastien Bubeck

Principal Researcher

Research Activities


  • 2014 – Present: Researcher, Microsoft Research, Theory Group (Redmond, WA, USA).
  • 2011 – 2014: Assistant Professor, Princeton University, Department of Operations Research and Financial Engineering (Princeton, NJ, USA).
  • Fall 2013: Visiting Scientist, Simons Institute, UC Berkeley (Berkeley, CA, USA).
  • 2010 – 2011: Postdoc, Centre de Recerca Matemàtica (Barcelona, Spain).
  • 2007 – 2010: Ph.D student (speciality: Applied Mathematics), INRIA Nord Europe (Lille, France).
  • 2008 – 2010: Teaching assistant at the University of Lille 1 (Lille, France).
  • July – August 2006: RIPS student (Research in Industrial Projects for Students), Institute for Pure and Applied Mathematics, UCLA (Los Angeles, CA, USA).
  • 2005 – 2008: Student at the Ecole Normale Supérieure de Cachan (Cachan, France).

For a more detailed curriculum vitae, see my resume.


  • Best Paper Award at COLT (Conference on Learning Theory) 2016.
  • 2015 Alfred P. Sloan Research Fellow in Computer Science.
  • Second prize for the best French Ph.D in Artificial Intelligence (AI prize 2011).
  • Jacques Neveu prize 2010 for the best French Ph.D in Probability/Statistics.
  • Second prize for the best French Ph.D in Computer Science (Gilles Kahn prize 2010).
  • Best Student Paper Award at COLT (Conference on Learning Theory) 2009.


  • I’m a bandit. Random topics on optimization, probability, and statistics.


Convex Optimization: Algorithms and Complexity

S. Bubeck

In Foundations and Trends in Machine Learning, Vol. 8: No. 3-4, pp 231-357, 2015

[pdf] [Link to buy a book version]

Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems

S. Bubeck and N. Cesa-Bianchi

In Foundations and Trends in Machine Learning, Vol. 5: No. 1, pp 1-122, 2012

[pdf] [Link to buy a book version, discount code: MAL022024]


Interns in the Theory Group at Microsoft Research

  • Fan Wei, graduate student at Stanford with Jacob Fox.
  • Shirshendu Ganguly, graduate student at UW with Ioana Dumitriu and Christopher Hoffman.
  • Ewain Gwynne, graduate student at MIT with Scott Sheffield.
  • Yin-Tat Lee, graduate student at MIT with Jonathan Kelner.
  • Miklos Racz, former graduate student at UC Berkeley with Elchanan Mossel, now postdoc in the Theory Group at MSR.

Undergraduate advising at Princeton University

  • Horia Mania, graduate student at UC Berkeley with Ben Recht.
  • Billy Fang, graduate student at UC Berkeley.
  • Christian Fong, graduate student at Stanford.
  • Tengyao Wang, graduate student at Cambridge with Richard Samworth.

Graduate students at Princeton University

  • Che-Yu Liu, ORFE



  • Kernel-based method for bandit convex optimization [youtube].
  • New Results at the Crossroads of Convexity, Learning and Information Theory [ENS video].
  • Revisiting Nesterov’s Acceleration [IMA video].







Tutorial on Bandits Games

This tutorial was presented at ALT 2011, ACML 2012, SIGMETRICS 2014, and MLSS Cadiz (2016).



In recent years the multi-armed bandit problem has attracted a lot of attention in the theoretical learning community. This growing interest is a consequence of the large number of problems that can be modeled as a multi-armed bandit: ad placement, website optimization, packet routing, etc. Furthermore, the bandit methodology is also used as a building block for more complicated scenarios such as reinforcement learning, model selection in statistics, or computer game-playing. While the basic stochastic multi-armed bandit can be traced back to Thompson (1933) and Robbins (1952), it is only very recently that we obtained an (almost) complete understanding of this simple model. Moreover, many extensions of the original problem have been proposed in the past fifteen years, such as bandits without a stochastic assumption (the so-called adversarial model), or bandits with a very large (but structured) set of arms.

This tutorial will cover in detail the state of the art for the basic multi-armed bandit problem (both stochastic and adversarial), and the information-theoretic analysis of Bayesian bandit problems. We will also touch upon contextual bandits, as well as the case of a very large (possibly infinite) set of arms with linear/convex/Lipschitz losses.

Old version of the slides

  • Pdf (slightly updated and shortened 2014 version: Pdf)