Stochastic Optimization and Sparse Statistical Recovery: An Optimal Algorithm for High Dimensions


October 30, 2012


Alekh Agarwal


Microsoft Research-NYC


We develop and analyze stochastic optimization algorithms for problems in which the expected loss is strongly convex and the optimum is (approximately) sparse. Previous approaches can exploit only one of these two structures, yielding an O(d/T) convergence rate for strongly convex objectives in d dimensions, and an O(√((s log d)/T)) convergence rate when the optimum is s-sparse. Our algorithm is based on successively solving a series of ℓ1-regularized optimization problems using Nesterov’s dual averaging algorithm. We establish that the error of our solution after T iterations is at most O((s log d)/T), with natural extensions to approximate sparsity. Our results apply to locally Lipschitz losses, including the logistic, exponential, hinge and least-squares losses. By recourse to statistical minimax results, we show that our convergence rates are optimal up to multiplicative constant factors. The effectiveness of our approach is also confirmed in numerical simulations, in which we compare to several baselines on a least-squares regression problem.

[Joint work with Sahand Negahban and Martin Wainwright]
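
To make the three rates quoted in the abstract concrete, the short sketch below evaluates them numerically with all constant factors dropped. The function names and the problem sizes (d = 10,000, s = 10) are hypothetical choices for illustration only; they are not taken from the talk, and the printed values indicate orders of magnitude, not actual error guarantees.

```python
import math


def rate_strongly_convex(d, T):
    """O(d/T): uses strong convexity only (constants dropped)."""
    return d / T


def rate_sparse(s, d, T):
    """O(sqrt(s log d / T)): uses sparsity only (constants dropped)."""
    return math.sqrt(s * math.log(d) / T)


def rate_both(s, d, T):
    """O(s log d / T): uses both structures (constants dropped)."""
    return s * math.log(d) / T


# Hypothetical problem sizes, chosen only for illustration.
d, s = 10_000, 10
for T in (10**3, 10**4, 10**5):
    print(T, rate_strongly_convex(d, T), rate_sparse(s, d, T), rate_both(s, d, T))
```

With constants ignored, the combined rate is the square of the sparsity-only rate, so it is the smaller of the two once T exceeds s log d; it improves on d/T whenever s log d is smaller than d.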


Alekh Agarwal

Alekh is a postdoctoral researcher at MSR NYC. Prior to that, he obtained his PhD in computer science from UC Berkeley in 2012, supported in part by an MSR PhD fellowship and a Google PhD fellowship. Alekh’s research encompasses many theoretical and practical aspects of large-scale machine learning, with particular emphasis on the design of computationally budgeted algorithms, large-scale convex optimization, high-dimensional inference, and learning algorithms for agents that actively interact with their environment.