Consistency Analysis for Binary Classification Revisited
- Krzysztof Dembczyński
- Wojciech Kotłowski
- Oluwasanmi Koyejo
- Nagarajan Natarajan
ICML
Statistical learning theory is at an inflection point
enabled by recent advances in understanding and
optimizing a wide range of metrics. Of particular
interest are non-decomposable metrics, such
as the F-measure and the Jaccard measure, which
cannot be represented as a simple average over
examples. Non-decomposability is the primary
source of difficulty in theoretical analysis, and
interestingly, it has led to two distinct settings and
notions of consistency. In this manuscript, we
analyze both settings, from statistical and algorithmic
points of view, to explore the connections
and to highlight differences between them
for a wide range of metrics. The analysis complements
previous results on this topic, clarifies
common confusions around both settings, and
provides guidance for the theory and practice of
binary classification with complex metrics.
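
As a small illustration of non-decomposability, the sketch below compares each metric computed on a whole sample with the average of the same metric computed on two equal halves. Any metric that is a simple average over examples must give the same value both ways; accuracy does, while the F-measure and the Jaccard measure do not. The function names and the toy labels are hypothetical, and the metrics use the standard definitions in terms of true positives, false positives, and false negatives, which may differ in notation from the paper's.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, FN) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, fn


def accuracy(y_true, y_pred):
    # Decomposable: a plain average of per-example 0/1 scores.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_measure(y_true, y_pred):
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
    tp, fp, fn = confusion_counts(y_true, y_pred)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def jaccard(y_true, y_pred):
    # Jaccard = TP / (TP + FP + FN); defined as 0 when the denominator is 0.
    tp, fp, fn = confusion_counts(y_true, y_pred)
    denom = tp + fp + fn
    return tp / denom if denom else 0.0


def averaged_over_halves(metric, y_true, y_pred):
    """Average the metric over the two equal halves of the sample."""
    mid = len(y_true) // 2
    first = metric(y_true[:mid], y_pred[:mid])
    second = metric(y_true[mid:], y_pred[mid:])
    return (first + second) / 2


# Hypothetical toy data: 8 examples, one false positive at index 4.
y_true = [1, 1, 1, 1, 0, 0, 0, 1]
y_pred = [1, 1, 1, 1, 1, 0, 0, 1]

for name, metric in [("accuracy", accuracy),
                     ("F-measure", f_measure),
                     ("Jaccard", jaccard)]:
    whole = metric(y_true, y_pred)
    halves = averaged_over_halves(metric, y_true, y_pred)
    print(f"{name:10s} whole={whole:.4f} averaged halves={halves:.4f}")
```

On this toy sample the two accuracy values coincide at 7/8, while the F-measure is 10/11 on the whole sample against 5/6 when averaged over halves, and the Jaccard measure is 5/6 against 3/4.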