Distributed Machine Learning Algorithms: Communication-Computation Trade-offs – Part 1

July 30, 2015
Sundararajan Sellamanickam | Microsoft
MSR India Summer School 2015 on Machine Learning

Distributed machine learning is an important area that has been receiving considerable attention from the academic and industrial communities, as data is growing at an unprecedented rate. In the first part of the talk, we review several popular approaches that have been proposed or used to learn classifier models in the big data scenario. As commodity clusters priced by system configuration become popular, machine learning algorithms have to be aware of the computation and communication costs involved in order to be cost-effective and efficient. In the second part of the talk, we focus on methods that address this problem; in particular, considering different data distribution settings (e.g., example and feature partitions), we present efficient distributed learning algorithms that trade off computation and communication costs.

Speaker Details

Sundararajan Sellamanickam is a senior research scientist at Yahoo! Labs, Bangalore. He has been with Yahoo! since October 2006, working on machine learning and data mining problems such as classification and information extraction for web applications. He received his PhD in 2000 from the Department of Computer Science and Automation, Indian Institute of Science, Bangalore, with a thesis on predictive approaches for building regression models. Prior to joining Yahoo! Labs, he was with Philips Semiconductors for more than 5 years, in architect and manager roles, building digital signal processing (DSP), WLAN and USB sub-systems. Before that, he worked as a scientist for nearly 12 years at Defense R&D (India), designing DSP sub-systems for sonar applications.