Abstract

We present a new learning architecture: the Decision Directed Acyclic Graph (DDAG), which is used to combine many two­class classifiers into a multiclass classifier. For an N­class problem, the DDAG contains N(N-1)/2 classifiers, one for each pair of classes. We present a VC analysis of the case when the node classifiers are hyperplanes; the resulting bound on the test error depends on N and on the margin achieved at the nodes, but not on the dimension of the space. This motivates an algorithm, DAGSVM, which operates in a kernel­induced feature space and uses two­class maximal margin hyperplanes at each decision­node of the DDAG. The DAGSVM is substantially faster to train and evaluate than either the standard algorithm or Max Wins, while maintaining comparable accuracy to both of these algorithms.