There’s a Hole in My Dataspace: Piecewise Predictors for Heterogeneous Learning Problems

Ofer Dekel; Ohad Shamir

In Proceedings of the 15th International Conference on Artificial Intelligence and Statistics (AISTATS) | January 2012

We study statistical learning problems where the data space is multimodal and heterogeneous, and constructing a single global predictor is difficult. We address such problems by iteratively identifying high-error regions in the data space and learning specialized predictors for those regions. While the idea of composing localized predictors is not new, our approach is unique in that we actively seek out predictors that clump errors together, making it easier to isolate the problematic regions. When errors are clumped together they are also easier to interpret and resolve through appropriate feature engineering and data preprocessing. We present an error-clumping classification algorithm based on a convex optimization problem, and an efficient stochastic optimization algorithm for this problem. We theoretically motivate our approach with a novel sample complexity analysis for piecewise predictors, and empirically demonstrate its behavior on an illustrative classification problem.
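
The general idea of a piecewise predictor can be illustrated with a small sketch. This is not the error-clumping algorithm from the paper, which is based on a convex optimization problem; it is only a minimal illustration under assumed choices. The toy data, the least-squares linear classifier, the axis-aligned region search, and all function names (`make_data`, `fit_linear`, `find_error_region`, and so on) are hypothetical stand-ins. The sketch fits a global predictor, looks for a half-space where its errors concentrate, and trains a specialized predictor for that region.

```python
import numpy as np

def make_data(n, rng):
    # Assumed toy problem: the labeling rule flips inside the strip x1 >= 0.5.
    X = rng.uniform(-1.0, 1.0, size=(n, 2))
    y = np.where(X[:, 0] >= 0, 1.0, -1.0)
    y[X[:, 1] >= 0.5] *= -1.0
    return X, y

def fit_linear(X, y):
    # Least-squares linear classifier with a bias term (stand-in for any learner).
    A = np.hstack([X, np.ones((len(X), 1))])
    w, *_ = np.linalg.lstsq(A, y, rcond=None)
    return w

def predict_linear(w, X):
    return np.where(X @ w[:-1] + w[-1] >= 0, 1.0, -1.0)

def in_region(region, X):
    # A region is an axis-aligned half-space: side * (x_j - t) > 0.
    j, t, side = region
    return side * (X[:, j] - t) > 0

def find_error_region(X, err):
    # Pick the half-space whose error rate most exceeds the error rate outside it.
    best_gap, best_region = -np.inf, None
    for j in range(X.shape[1]):
        for t in np.quantile(X[:, j], np.linspace(0.1, 0.9, 17)):
            for side in (1, -1):
                inside = in_region((j, t, side), X)
                gap = err[inside].mean() - err[~inside].mean()
                if gap > best_gap:
                    best_gap, best_region = gap, (j, t, side)
    return best_region

rng = np.random.default_rng(0)
X_train, y_train = make_data(2000, rng)
X_test, y_test = make_data(2000, rng)

# Step 1: a single global predictor.
w_global = fit_linear(X_train, y_train)

# Step 2: locate the region where the global predictor's errors concentrate.
err = (predict_linear(w_global, X_train) != y_train).astype(float)
region = find_error_region(X_train, err)

# Step 3: a specialized predictor trained only on that region.
mask = in_region(region, X_train)
w_local = fit_linear(X_train[mask], y_train[mask])

def predict_piecewise(X):
    # Route each point to the local predictor inside the region, else the global one.
    return np.where(in_region(region, X),
                    predict_linear(w_local, X),
                    predict_linear(w_global, X))

global_acc = np.mean(predict_linear(w_global, X_test) == y_test)
piecewise_acc = np.mean(predict_piecewise(X_test) == y_test)
print(f"region: feature {region[0]}, threshold {region[1]:.2f}, side {region[2]}")
print(f"global accuracy: {global_acc:.3f}, piecewise accuracy: {piecewise_acc:.3f}")
```

In this toy setting the global predictor's errors happen to fall in one contiguous strip, so a simple threshold search can isolate them. The paper's contribution is to actively favor predictors whose errors clump in this way, which the sketch does not attempt.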