Generalization Bounds and Consistency for Latent-Structural Probit and Ramp Loss


January 17, 2012


David McAllester


Toyota Technological Institute at Chicago


Linear predictors are scale-insensitive: the prediction does not change when the weight vector defining the predictor is scaled up or down. This implies that directly regularizing the performance of a linear predictor with a scale-sensitive regularizer (such as a norm of the weight vector) is meaningless. Linear predictors are typically learned by introducing a scale-sensitive surrogate loss function, such as the hinge loss of an SVM. However, no convex surrogate loss function can be consistent in general; in particular, SVMs are not consistent in finite dimensions. Here we generalize probit loss and ramp loss to the latent-structural setting and show that both of these loss functions are consistent in arbitrary dimension for an arbitrary bounded task loss. Empirical experience with probit loss and ramp loss will be briefly discussed.
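The two contrasts in the abstract, that the prediction of a linear classifier is invariant under scaling of the weight vector while the hinge loss is not, and that ramp loss is a bounded (non-convex) variant of hinge loss, can be illustrated with a minimal sketch for the binary case. The function names and test values below are illustrative only and are not the paper's latent-structural formulation:

```python
import numpy as np

def predict(w, x):
    # The prediction depends only on the sign of the score w.x,
    # so scaling w by any positive constant leaves it unchanged.
    return np.sign(np.dot(w, x))

def hinge_loss(w, x, y):
    # Scale-sensitive convex surrogate: scaling w changes the loss value
    # even though the prediction is unchanged.
    return max(0.0, 1.0 - y * np.dot(w, x))

def ramp_loss(w, x, y):
    # Ramp loss clips the hinge at 1, giving a bounded but non-convex surrogate.
    return min(1.0, max(0.0, 1.0 - y * np.dot(w, x)))

w = np.array([1.0, -2.0])
x = np.array([1.0, 0.25])   # w.x = 0.5
y = 1.0

# The prediction is invariant under positive scaling of w ...
assert predict(w, x) == predict(10.0 * w, x)
# ... but the hinge loss is not: 0.5 for w versus 0.0 for 10w.
print(hinge_loss(w, x, y), hinge_loss(10.0 * w, x, y))
```

This is why a norm-based regularizer alone is meaningless for the prediction itself: the learner can rescale the weights to shrink the norm (or the hinge loss) without changing any prediction, which is exactly the gap the scale-sensitive surrogate is introduced to bridge.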


David McAllester

Professor McAllester received his B.S., M.S., and Ph.D. degrees from the Massachusetts Institute of Technology in 1978, 1979, and 1987, respectively. He served on the faculty of Cornell University for the 1987-1988 academic year and on the faculty of MIT from 1988 to 1995. He was a member of technical staff at AT&T Labs-Research from 1995 to 2002. He has been a fellow of the American Association of Artificial Intelligence (AAAI) since 1997. Since 2002 he has been Chief Academic Officer at the Toyota Technological Institute at Chicago (TTIC). He has authored over 80 refereed publications. His research areas include machine learning, the theory of programming languages, automated reasoning, AI planning, computer game playing (computer chess), computational linguistics, and computer vision. Dr. McAllester has a personally maintained website which can be found at