Foundations of Optimization

Optimization methods are the engine of machine learning algorithms. Examples abound, such as training neural networks with stochastic gradient descent, segmenting images with submodular optimization, or efficiently searching a game tree with bandit algorithms. We aim to advance the mathematical foundations of both discrete and continuous optimization and to leverage these advances to develop new algorithms with a broad set of AI applications.

Some of the current directions pursued by our members include convex optimization, distributed optimization, online optimization, non-convex optimization, and discrete optimization for random problems:

  • Convex optimization is the bedrock of many machine learning algorithms, such as logistic regression and kernel machines. We pursue a deeper understanding of some of the fundamental techniques in the field, such as accelerated gradient descent and interior point methods.
  • Distributed optimization algorithms are critical for solving large-scale machine learning problems in the cloud, for example, those arising in search and online advertising. We develop algorithms that tackle problems with large datasets as well as very large models, and that can adapt to asynchronous execution over heterogeneous computing resources.
  • Online optimization is a recent model that overcomes the standard stationarity assumption of stochastic optimization, making it particularly well-suited to learning applications in dynamic settings (for example, online advertising). Many challenging problems remain open in this model, particularly in the bandit optimization variant, where feedback comes only from direct experimentation.
  • The success of deep learning has led to a renewed interest in non-convex optimization. We explore the possibilities and limitations of principled approaches in this setting.
  • Many fundamental problems in machine learning revolve around the estimation of discrete structures such as graphs or cuts. We study how the random landscape of machine learning problems makes these typically hard problems easier.
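
To make the accelerated gradient descent mentioned under convex optimization concrete, here is a minimal sketch of Nesterov's accelerated method for a smooth, strongly convex objective. The function name, the constant-momentum variant, and the quadratic test problem are illustrative assumptions, not details from this page.

```python
import numpy as np

def nesterov_agd(grad, x0, L, mu, steps=200):
    """Illustrative sketch of Nesterov's accelerated gradient descent
    for an L-smooth, mu-strongly convex objective (constant momentum)."""
    x = y = np.asarray(x0, dtype=float)
    kappa = L / mu  # condition number
    momentum = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)
    for _ in range(steps):
        x_next = y - grad(y) / L               # gradient step from the lookahead point
        y = x_next + momentum * (x_next - x)   # momentum extrapolation
        x = x_next
    return x

# Toy problem (assumed for illustration): minimize 0.5 * x^T A x - b^T x,
# whose gradient is A x - b and whose minimizer solves A x = b.
A = np.array([[3.0, 0.0], [0.0, 1.0]])
b = np.array([1.0, 1.0])
x_star = nesterov_agd(lambda x: A @ x - b, x0=np.zeros(2), L=3.0, mu=1.0)
```

For this well-conditioned quadratic, the iterates converge linearly at roughly a (1 - 1/sqrt(kappa)) rate per step, which is the acceleration over plain gradient descent's (1 - 1/kappa) rate.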
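
The bandit variant of online optimization described above, where feedback comes only from direct experimentation, can be sketched with the classic UCB1 algorithm for multi-armed bandits. Everything here is an illustrative assumption: the function names, the two Bernoulli arms, and the horizon are not from this page.

```python
import math
import random

def ucb1(arms, horizon, rng=None):
    """Illustrative UCB1 sketch: `arms` is a list of callables that
    sample a reward in [0, 1]; only the pulled arm's reward is observed."""
    rng = rng or random.Random(0)
    n = len(arms)
    counts = [0] * n    # pulls per arm
    sums = [0.0] * n    # cumulative reward per arm
    for t in range(horizon):
        if t < n:
            a = t  # pull each arm once to initialize
        else:
            # pick the arm maximizing empirical mean + confidence bonus
            a = max(range(n), key=lambda i: sums[i] / counts[i]
                    + math.sqrt(2 * math.log(t + 1) / counts[i]))
        reward = arms[a](rng)
        counts[a] += 1
        sums[a] += reward
    return counts, sums

# Two assumed Bernoulli arms with success probabilities 0.3 and 0.7.
arms = [lambda rng, p=0.3: float(rng.random() < p),
        lambda rng, p=0.7: float(rng.random() < p)]
counts, sums = ucb1(arms, horizon=2000)
```

The confidence bonus shrinks as an arm is pulled more often, so the algorithm balances exploring uncertain arms against exploiting the empirically best one; over the horizon the better arm accumulates the vast majority of the pulls.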