Accelerating Stochastic Gradient Descent

July 12, 2017
Sham Kakade | University of Washington

There is widespread sentiment that fast gradient methods (e.g., Nesterov’s acceleration, conjugate gradient, heavy ball) cannot be used effectively for stochastic optimization because of their instability and error accumulation, a notion made precise in d’Aspremont 2008 and Devolder, Glineur, and Nesterov 2014. This work considers the use of “fast gradient” methods for the special case of stochastic approximation for the least squares regression problem. Our main result refutes the conventional wisdom by showing that acceleration can be made robust to statistical errors. In particular, this work introduces an accelerated stochastic gradient method that provably achieves the minimax optimal statistical risk faster than stochastic gradient descent. Critical to the analysis is a sharp characterization of accelerated stochastic gradient descent as a stochastic process.
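
For readers unfamiliar with the setting, the baseline mentioned in the abstract (stochastic gradient descent for least squares regression in the stochastic approximation model, where each sample is seen once) can be sketched roughly as follows. This is only an illustrative sketch of plain SGD with tail-averaged iterates, not the accelerated method from the talk. The function names, the Gaussian data model, the step size, and the averaging window are all assumptions chosen for the example.

```python
import random


def streaming_sgd_least_squares(w_star, n_samples, step, noise_std, seed=0):
    """Single-pass SGD on least squares, returning the tail-averaged iterate.

    Assumed (hypothetical) data model: x ~ N(0, I), y = <w_star, x> + noise.
    """
    rng = random.Random(seed)
    d = len(w_star)
    w = [0.0] * d            # current iterate
    avg = [0.0] * d          # running average over the second half of the pass
    tail_start = n_samples // 2
    n_avg = 0
    for t in range(n_samples):
        # Draw one fresh sample; it is used once and then discarded.
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y = sum(ws * xi for ws, xi in zip(w_star, x)) + rng.gauss(0.0, noise_std)
        # The stochastic gradient of 0.5 * (<w, x> - y)^2 is residual * x.
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        w = [wi - step * residual * xi for wi, xi in zip(w, x)]
        if t >= tail_start:
            n_avg += 1
            avg = [a + (wi - a) / n_avg for a, wi in zip(avg, w)]
    return avg


def distance(u, v):
    """Euclidean distance between two vectors given as lists."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


w_star = [1.0, -2.0]  # hypothetical ground-truth weights
w_hat = streaming_sgd_least_squares(w_star, n_samples=5000, step=0.05, noise_std=0.1)
print(distance(w_hat, w_star))
```

Averaging the later iterates is a common way to reduce the variance of the final estimate. The abstract's claim concerns an accelerated method that reaches the optimal statistical risk faster than this kind of baseline.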

- Sébastien Bubeck, Vice President, Microsoft GenAI
Research Area
- Algorithms
- Mathematics
