Hierarchical Classification via Orthogonal Transfer

MSR-TR-2011-54

This technical report is identical to our ICML 2011 paper of the same title, except that it adds the proofs of the theorems as appendices; these were omitted from the ICML paper due to space limits.


We consider multiclass classification problems where the labels are organized hierarchically as a category tree. We associate each node in the tree with a classifier and classify examples recursively from the root to the leaves. We propose a hierarchical Support Vector Machine (SVM) that encourages the classifier at each node to be different from the classifiers at its ancestors. More specifically, we introduce regularization terms that push the normal vector of the classifying hyperplane at each node to be as close to orthogonal as possible to those at its ancestors. We establish conditions under which training such a hierarchical SVM is a convex optimization problem, and develop an efficient dual-averaging method for solving it. We evaluate the method on a number of real-world text categorization tasks and obtain state-of-the-art performance.
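To make the two ideas in the abstract concrete, the following is a minimal sketch, not the formulation from the report. It assumes a toy category tree, hand-picked weight vectors, and hypothetical helper names (classify, ancestors, orthogonality_penalty). It shows top-down classification from the root to a leaf, and a simple penalty that sums the absolute inner products between each node's normal vector and those of its ancestors, which is zero exactly when they are orthogonal. The report's actual regularizer, its weighting, and the dual-averaging solver are not reproduced here.

```python
# Illustrative sketch only: the tree, the weights, and all function names
# are assumptions made for this example, not taken from the report.

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

# Toy category tree: each internal node maps to its children.
CHILDREN = {"root": ["sports", "science"], "science": ["physics", "biology"]}
PARENT = {child: parent for parent, kids in CHILDREN.items() for child in kids}

# One normal vector per non-root node (hand-picked toy values).
W = {
    "sports": [1.0, 0.0, 0.0],
    "science": [0.0, 1.0, 0.0],
    "physics": [0.0, 0.0, 1.0],
    "biology": [0.0, 0.0, -1.0],
}

def classify(x, weights=W):
    """Descend from the root, following the highest-scoring child each time."""
    node = "root"
    while node in CHILDREN:
        node = max(CHILDREN[node], key=lambda c: dot(weights[c], x))
    return node

def ancestors(node):
    """Ancestors of a node that carry a classifier (the root does not here)."""
    found = []
    while node in PARENT:
        node = PARENT[node]
        if node in W:
            found.append(node)
    return found

def orthogonality_penalty(weights):
    """Sum of |w_i . w_a| over every node i and each of its ancestors a."""
    return sum(
        abs(dot(weights[i], weights[a]))
        for i in weights
        for a in ancestors(i)
    )
```

In this toy setup, classify([0.1, 0.9, 0.5]) first chooses "science" over "sports" and then "physics" over "biology". The penalty is zero for W because each child vector is orthogonal to its parent's vector; tilting a child toward its parent makes the penalty positive, which is the kind of overlap the regularization described above is meant to discourage.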