Accelerating Stochastic Gradient Descent
- Sham Kakade | University of Washington
There is widespread sentiment that it is not possible to effectively utilize fast gradient methods (e.g., Nesterov's acceleration, conjugate gradient, heavy ball) for stochastic optimization due to their instability and error accumulation, a notion made precise in d'Aspremont (2008) and Devolder, Glineur, and Nesterov (2014). This work considers the use of "fast gradient" methods for the special case of stochastic approximation for the least squares regression problem. Our main result refutes the conventional wisdom by showing that acceleration can be made robust to statistical errors. In particular, this work introduces an accelerated stochastic gradient method that provably achieves the minimax optimal statistical risk faster than stochastic gradient descent. Critical to the analysis is a sharp characterization of accelerated stochastic gradient descent as a stochastic process.
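To make the setting concrete, below is a minimal sketch of Nesterov-style momentum applied to streaming least squares, with tail-averaging of the iterates to suppress gradient noise. The step size, momentum value, and helper names are illustrative assumptions for this sketch; they are not the tuned parameter schedule or the exact algorithm analyzed in the talk.

```python
import numpy as np

def accelerated_sgd_least_squares(stream, dim, lr=0.01, momentum=0.9, n_steps=10_000):
    """Nesterov-style accelerated SGD on streaming least squares.

    `stream` yields (x, y) pairs; the per-sample loss is 0.5 * (x @ w - y)**2.
    Parameter choices here are illustrative, not the schedule from the talk.
    """
    w = np.zeros(dim)   # current iterate
    v = np.zeros(dim)   # momentum buffer
    iterates = []
    for _ in range(n_steps):
        x, y = next(stream)
        lookahead = w + momentum * v          # Nesterov look-ahead point
        grad = (lookahead @ x - y) * x        # stochastic gradient at the look-ahead
        v = momentum * v - lr * grad
        w = w + v
        iterates.append(w.copy())
    # Average the last half of the iterates to damp statistical error.
    return np.mean(iterates[len(iterates) // 2:], axis=0)

def gaussian_stream(w_star, noise=0.1, seed=0):
    """Toy data stream: Gaussian features, noisy linear responses."""
    rng = np.random.default_rng(seed)
    while True:
        x = rng.standard_normal(w_star.shape[0])
        y = x @ w_star + noise * rng.standard_normal()
        yield x, y

if __name__ == "__main__":
    w_star = np.array([1.0, -2.0, 0.5])
    est = accelerated_sgd_least_squares(gaussian_stream(w_star), dim=3)
    print("estimate:", est, "target:", w_star)
```

On well-conditioned toy data like this, plain SGD and the accelerated variant behave similarly; the separation emphasized in the talk appears for ill-conditioned problems, where acceleration reaches the optimal statistical risk in fewer samples.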