Using Asymmetric Distributions to Improve Classifier Probabilities: A Comparison of New and Standard Parametric Methods
- Paul N. Bennett
CMU-CS-02-126
Computer Science Department, School of Computer Science, Carnegie Mellon University (See errata at http://www.cs.cmu.edu/~pbennett/papers/errata-for-asymmetric.html. A revised version of this work appears in SIGIR 2003.)
For many discriminative classifiers, it is desirable to convert an unnormalized confidence score output by the classifier into a normalized probability estimate. Such a method can also be used to obtain better estimates from a probabilistic classifier that outputs poor ones. Typical parametric methods assume that the score distribution for a class is symmetric; we motivate why this assumption is undesirable, especially when the scores are produced by a classifier. Two asymmetric families, asymmetric generalizations of the Gaussian and the Laplace distributions, are presented, along with a method for fitting them in expected linear time. Finally, an experimental analysis of parametric fits to the outputs of two text classifiers, naive Bayes (which is known to emit poor probability estimates) and a linear SVM, is conducted. The analysis shows that one of these asymmetric families is theoretically attractive (introducing few new parameters while increasing flexibility), computationally efficient, and empirically preferable.
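To make the recipe in the abstract concrete, the sketch below fits an asymmetric Laplace density by maximum likelihood and applies Bayes' rule to turn raw scores into posterior probabilities. It is a simplified illustration, not the report's algorithm: it assumes the standard asymmetric-Laplace parameterization shown in the comments, scans the sorted sample for the mode in O(n log n) rather than using the report's expected-linear-time procedure, and all function names are illustrative.

```python
# Minimal sketch, assuming the common asymmetric-Laplace parameterization
# with mode theta and inverse scales beta (left) and gamma (right):
#
#   p(x) = beta*gamma/(beta+gamma) * exp(-beta*(theta - x)),  x <= theta
#   p(x) = beta*gamma/(beta+gamma) * exp(-gamma*(x - theta)), x >  theta
import numpy as np


def asymmetric_laplace_pdf(x, theta, beta, gamma):
    """Density of an asymmetric Laplace with mode theta, inverse scales beta/gamma."""
    norm = beta * gamma / (beta + gamma)
    return np.where(x <= theta,
                    norm * np.exp(-beta * (theta - x)),
                    norm * np.exp(-gamma * (x - theta)))


def fit_asymmetric_laplace(scores):
    """Maximum-likelihood fit: scan the sorted sample for the mode theta.
    For a fixed theta, the MLE of (beta, gamma) is available in closed form
    from the summed absolute deviations left (s_l) and right (s_r) of theta."""
    x = np.sort(np.asarray(scores, dtype=float))
    n = len(x)
    prefix = np.concatenate(([0.0], np.cumsum(x)))  # prefix[i] = sum of x[:i]
    best_ll, best_params = -np.inf, None
    for k in range(n):
        theta = x[k]
        s_l = (k + 1) * theta - prefix[k + 1]                     # sum of theta - x_i, x_i <= theta
        s_r = (prefix[n] - prefix[k + 1]) - (n - k - 1) * theta   # sum of x_i - theta, x_i > theta
        if s_l <= 0 or s_r <= 0:                                  # degenerate split; skip
            continue
        root = np.sqrt(s_l * s_r)
        beta, gamma = n / (s_l + root), n / (s_r + root)
        ll = n * np.log(beta * gamma / (beta + gamma)) - beta * s_l - gamma * s_r
        if ll > best_ll:
            best_ll, best_params = ll, (theta, beta, gamma)
    return best_params


def posterior_positive(score, pos_params, neg_params, p_pos=0.5):
    """Convert a raw score to P(class = + | score) via Bayes' rule, given the
    two fitted class-conditional densities and a class prior p_pos."""
    p_s_pos = asymmetric_laplace_pdf(score, *pos_params)
    p_s_neg = asymmetric_laplace_pdf(score, *neg_params)
    return p_s_pos * p_pos / (p_s_pos * p_pos + p_s_neg * (1 - p_pos))
```

In use, one density would be fitted to the held-out scores of positive examples and another to those of negative examples, with p_pos set to the empirical class prior; posterior_positive then maps any new classifier score to a calibrated probability estimate.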