Random Kitchen Sinks: Replacing Optimization with Randomization in Learning
A popular trend in computer vision, graphics, and machine learning is to replace sophisticated statistical models with simpler generic ones, and to compensate for the missing domain knowledge with huge datasets. These huge datasets in turn require us to solve huge numerical optimization problems that tax popular off-the-shelf implementations of popular algorithms. I describe a randomized way to solve these large scale optimization problems quickly, in a few lines of code, and with provably good performance guarantees. For example, a randomized version of Adaboost and of kernelized Support Vector Machine can fit millions of data points within a few minutes with almost no loss in classification accuracy. Similarly, very large Semi-Definite Programs can be solved quickly by approximating them with suitably randomized linear programs. All of these tricks randomize over most of the variables of optimization and carry out a much cheaper optimization over the remaining variables. A theoretical analysis of these tricks relies on the concentration of measure phenomenon in Banach spaces, and guarantees that in the cases I describe, these tricks work almost as well as carrying out the full optimization. I’ll also describe a curious and related way of constructing numerical algorithms that execute reliably on unreliable CPUs that run at subthreshold voltages.
Ali Rahimi is a senior research scientist at Intel Labs Berkeley and runs the Power Aware Perception research program. He has a moustache.
- Ali Rahimi
- Intel Labs Berkeley