A Parallel SGD method with Strong Convergence

Dhruv Mahajan; Keerthi Selvaraj; Sundararajan Sellamanickam; Leon Bottou

January 2013

This paper proposes a novel parallel stochastic gradient descent (SGD) method. In each iteration of a batch descent method, the search direction is found by running sets of SGD iterations in parallel, with each set operating on one node and using only the data residing on that node. The method has strong convergence properties. Experiments on datasets with high-dimensional feature spaces show the value of this method.
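
The structure described in the abstract can be illustrated with a small sketch. This is not the authors' algorithm: the abstract does not specify the local objective, the way node results are combined, or the step-size rule. The sketch below assumes a least-squares loss, plain SGD on each node's shard, averaging of the node results to form the direction, a fallback to the negative gradient if the averaged direction is not a descent direction, and Armijo backtracking for the step size. All function names and parameters (`parallel_sgd_descent`, `local_sgd`, `make_shards`, `lr`, `epochs`) are hypothetical, and nodes are simulated sequentially in one process.

```python
import random

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def loss(w, data):
    """Mean squared-error loss over (x, y) pairs (an assumed objective)."""
    return sum(0.5 * (dot(w, x) - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    """Full-batch gradient of the loss above."""
    g = [0.0] * len(w)
    for x, y in data:
        r = dot(w, x) - y
        for j, xj in enumerate(x):
            g[j] += r * xj
    return [gj / len(data) for gj in g]

def local_sgd(w, shard, lr, epochs, rng):
    """One node's set of SGD iterations, started at w, using only its shard."""
    v = list(w)
    for _ in range(epochs):
        order = list(range(len(shard)))
        rng.shuffle(order)
        for i in order:
            x, y = shard[i]
            r = dot(v, x) - y
            v = [vj - lr * r * xj for vj, xj in zip(v, x)]
    return v

def parallel_sgd_descent(shards, dim, outer_iters=20, lr=0.05, epochs=1, seed=0):
    """Batch descent whose direction comes from per-node SGD runs (a sketch)."""
    rng = random.Random(seed)
    data = [ex for shard in shards for ex in shard]
    w = [0.0] * dim
    history = [loss(w, data)]
    for _ in range(outer_iters):
        g = grad(w, data)
        # Nodes are simulated one after another; on a cluster they would run in parallel.
        local = [local_sgd(w, shard, lr, epochs, rng) for shard in shards]
        # Assumed combination rule: average the node results, subtract w.
        d = [sum(v[j] for v in local) / len(local) - w[j] for j in range(dim)]
        slope = dot(g, d)
        if slope >= 0.0:
            # Assumed safeguard: fall back to steepest descent.
            d = [-gj for gj in g]
            slope = dot(g, d)
        if slope == 0.0:
            break  # gradient is zero, nothing left to do
        # Armijo backtracking line search along d.
        t, f0 = 1.0, history[-1]
        while t > 1e-12 and loss([wj + t * dj for wj, dj in zip(w, d)], data) > f0 + 1e-4 * t * slope:
            t *= 0.5
        if t <= 1e-12:
            break  # no acceptable step found
        w = [wj + t * dj for wj, dj in zip(w, d)]
        history.append(loss(w, data))
    return w, history

def make_shards(num_nodes=4, per_node=50, dim=5, seed=1):
    """Synthetic linear-regression data split across simulated nodes."""
    rng = random.Random(seed)
    w_true = [rng.uniform(-1, 1) for _ in range(dim)]
    shards = []
    for _ in range(num_nodes):
        shard = []
        for _ in range(per_node):
            x = [rng.gauss(0, 1) for _ in range(dim)]
            shard.append((x, dot(w_true, x) + rng.gauss(0, 0.1)))
        shards.append(shard)
    return shards

if __name__ == "__main__":
    w, history = parallel_sgd_descent(make_shards(), dim=5)
    print("initial loss:", history[0], "final loss:", history[-1])
```

Because every accepted step passes the Armijo test against the full objective, the recorded loss never increases from one outer iteration to the next, regardless of how noisy the per-node SGD runs are. That separation (cheap, local SGD to propose a direction; a batch-level check to accept the step) is the design point the sketch is meant to show.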