Classification via Minimum Incremental Coding Length (MICL)
SIAM Journal on Imaging Science |
We present a simple new criterion for classiﬁcation, based on principles from lossy data compression. The criterion assigns a test sample to the class that uses the minimum number of additional bits to code the test sample, subject to an allowable distortion. We demonstrate the asymptotic optimality of this criterion for Gaussian distributions and analyze its relationships to classical classiﬁers. The theoretical results clarify the connections between our approach and popular classiﬁers such as MAP, RDA, k-NN, and SVM, as well as unsupervised methods based on lossy coding. Our formulation induces severa lgood effects on the resulting classiﬁer. First, minimizing the lossy coding length induces a regularization effect which stabilizes the (implicit) density estimate in a small sample setting. Second, compression provides a uniform means of handling classes of varying dimension. The new criterion and its kernel and local versions perform competitively on synthetic examples, as well as on real imagery data such as handwritten digits and face images. On these problems, the performance of our simple classiﬁer approaches the best reported results, without using domain-speciﬁc information. All MATLAB code and classiﬁcation results are publicly available for peer evaluation at http://perception.csl.uiuc.edu/coding/home.htm.