Flexible Transformations For Learning Big Data

Azalia Mirhoseini; Ebrahim Songhori; Bita Darvish Rouhani; Farinaz Koushanfar

Special Interest Group for the Computer Systems Performance Evaluation Conference (SIGMETRICS) | June 2015

Published by ACM

This paper proposes a domain-specific solution for iterative learning on big and dense (non-sparse) datasets. Many learning algorithms, including linear and regularized regression techniques, rely on iterative updates on the data connectivity matrix in order to converge to a solution. The performance of such algorithms often degrades severely on large and dense data. Massive dense datasets not only require a large number of arithmetic operations, but also incur unwanted message-passing costs across processing nodes. Our key observation is that, despite their seemingly dense structure, data in many applications can be transformed into a new space where sparse structures are revealed. We propose a scalable data transformation scheme that enables creating versatile sparse representations of the data. The transformation can be tuned to suit the underlying platform’s costs and constraints. Our evaluations demonstrate significant improvements in energy usage, runtime, and memory footprint, within guaranteed user-defined error bounds.
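
To make the key observation concrete, the sketch below shows one way a dense matrix can hide sparse structure. It is an illustration only and is not the transformation proposed in the paper: it assumes, hypothetically, that every column of a dense data matrix A is a scaled copy of one column of a small dense dictionary D, so that A = D C with a coefficient matrix C holding a single nonzero per column. All names (`build_example`, `gram_dense`, `gram_factored`) and the toy sizes are assumptions made for this example. The product A^T (A x), typical of the iterative updates mentioned above, is then computed both from the dense A and from the factors.

```python
import random

def matvec(M, x):
    """Dense matrix-vector product; M is a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def transpose(M):
    """Transpose of a list-of-rows matrix."""
    return [list(col) for col in zip(*M)]

def build_example(m=6, k=2, n=8, seed=0):
    """Build a dense A (m x n) that factors exactly as A = D C (toy assumption)."""
    rng = random.Random(seed)
    # D: small dense dictionary, m x k.
    D = [[rng.uniform(-1, 1) for _ in range(k)] for _ in range(m)]
    # C: k x n with one nonzero per column, stored as (row index, value).
    C = [(rng.randrange(k), rng.uniform(0.5, 2.0)) for _ in range(n)]
    # Column j of A is column r_j of D scaled by v_j, so A has no zero pattern.
    A = [[D[i][r] * v for (r, v) in C] for i in range(m)]
    return A, D, C

def gram_dense(A, x):
    """Compute A^T (A x) directly from the dense matrix."""
    return matvec(transpose(A), matvec(A, x))

def gram_factored(D, C, x):
    """Compute C^T (D^T (D (C x))) using only the factors."""
    k = len(D[0])
    # y = C x: each column of C contributes its single nonzero.
    y = [0.0] * k
    for j, (r, v) in enumerate(C):
        y[r] += v * x[j]
    # z = D^T (D y): work on the small dense dictionary only.
    z = matvec(transpose(D), matvec(D, y))
    # Result = C^T z: again one multiplication per column of C.
    return [v * z[r] for (r, v) in C]

A, D, C = build_example()
x = [1.0] * len(C)
dense_result = gram_dense(A, x)
factored_result = gram_factored(D, C, x)
max_diff = max(abs(a - b) for a, b in zip(dense_result, factored_result))
```

In this toy setting the two results agree up to floating-point rounding, while the factored form stores m·k + n values instead of m·n. Real data would generally satisfy such a factorization only approximately, which is where user-defined error bounds of the kind the abstract mentions would come into play.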