Bandit View on Noisy Optimization

Jean-Yves Audibert; Sébastien Bubeck; Rémi Munos

Bandit View on Noisy Optimization

Jean-Yves Audibert ,
Sébastien Bubeck ,
Rémi Munos

Chapter 1, in Optimization for Machine Learning

Published by MIT Press | 2010 | Optimization for Machine Learning edition

Download BibTex

This chapter deals with the problem of making the best use of a finite number of noisy evaluations to optimize an unknown function. We are primarily concerned with the case where the function is defined over a finite set. In this discrete setting, we discuss various objectives for the learner, from optimizing the allocation of a given budget of evaluations to optimal stopping time problems with (ϵ;δ)-PAC guarantees. We also consider the so-called online optimization framework, where the result of an evaluation is associated to a reward, and the goal is to maximize the sum of obtained rewards. In this case, we extend the algorithms to continuous sets and (weakly) Lipschitzian functions (with respect to a prespecified metric).