Learning to Classify with Missing and Corrupted Features

Ofer Dekel; Ohad Shamir; Lin Xiao

Learning to Classify with Missing and Corrupted Features

Ofer Dekel ,
Ohad Shamir ,
Lin Xiao

Machine Learning Journal | June 2009

Download BibTex

A common assumption in supervised machine learning is that the training examples provided to the learning algorithm are statistically identical to the instances encountered later on, during the classification phase. This assumption is unrealistic in many real-world situations where machine learning techniques are used.We focus on the case where features of a binary classification problem, which were available during the training phase, are either deleted or become corrupted during the classification phase. We prepare for the worst by assuming that the subset of deleted and corrupted features is controlled by an adversary, and may vary from instance to instance. We design and analyze two novel learning algorithms that anticipate the actions of the adversary and account for them when training a classifier. Our first technique formulates the learning problem as a linear program. We discuss how the particular structure of this program can be exploited for computational efficiency and we prove statistical bounds on the risk of the resulting classifier. Our second technique addresses the robust learning problem by combining a modified version of the Perceptron algorithm with an online-to-batch conversion technique, and also comes with statistical generalization guarantees. We demonstrate the effectiveness of our approach with
a set of experiments.