The Median Hypothesis

  • Ran Gilad-Bachrach,
  • Chris J.C. Burges

MSR-TR-2012-56

A typical learning process begins with some prior beliefs about the target concept. These beliefs are refined using evidence, typically a sample. The evidence is used to compute a posterior belief, which assigns a probability distribution over the hypothesis class F. Using the posterior, a hypothesis is selected to predict labels during the generalization phase. A natural candidate is the Bayes hypothesis, that is, the weighted majority vote. However, this hypothesis is often prohibitively costly to compute. Therefore, there is a need for an algorithm that constructs a hypothesis (either a single hypothesis from F, or one from a different class), given the posterior belief. Several methods have been proposed for this problem: for example, Gibbs sampling, ensemble methods, and choosing the maximum a posteriori hypothesis. We propose a new method: choosing the median hypothesis. This method is close to the average Gibbs hypothesis and the Bayes hypothesis in terms of accuracy, while having the same run-time efficiency, during the generalization phase, as the maximum a posteriori method.
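To make the trade-off concrete, the sketch below contrasts the three prediction rules mentioned above on a finite hypothesis class with a discrete posterior. The setup (threshold classifiers, the Gaussian-shaped weights, and all function names) is an illustrative assumption, not the paper's code.

```python
# Minimal sketch, assuming a finite hypothesis class and a discrete posterior.
# Hypotheses are functions x -> {-1, +1}; `posterior` is a weight vector.
import numpy as np

rng = np.random.default_rng(0)

def bayes_predict(hypotheses, posterior, x):
    """Weighted majority vote: one evaluation per hypothesis per query."""
    votes = sum(w * h(x) for h, w in zip(hypotheses, posterior))
    return 1 if votes >= 0 else -1

def map_predict(hypotheses, posterior, x):
    """Maximum a posteriori: a single fixed hypothesis, cheap at test time."""
    return hypotheses[int(np.argmax(posterior))](x)

def gibbs_predict(hypotheses, posterior, x):
    """Gibbs rule: draw one hypothesis from the posterior for each query."""
    h = hypotheses[rng.choice(len(hypotheses), p=posterior)]
    return h(x)

# Toy class: threshold classifiers on the real line.
thresholds = np.linspace(-1.0, 1.0, 21)
hypotheses = [lambda x, t=t: 1 if x >= t else -1 for t in thresholds]
posterior = np.exp(-thresholds**2)   # unnormalized belief, peaked at t = 0
posterior /= posterior.sum()

x = 0.3
print(bayes_predict(hypotheses, posterior, x),
      map_predict(hypotheses, posterior, x),
      gibbs_predict(hypotheses, posterior, x))
```

The Bayes rule must evaluate every hypothesis at test time, while the MAP rule evaluates exactly one; the median hypothesis proposed here aims for MAP-like test cost with accuracy guarantees closer to the vote.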

In this paper, we define a measure of depth for hypotheses, from which we derive the median hypothesis. We prove generalization bounds that leverage the PAC-Bayes analysis technique. We present an algorithm to approximate the median hypothesis and prove its correctness. Our definition of median is closely related to Tukey’s median; in fact, to the best of our knowledge, our algorithm is the first polynomial-time approximation algorithm for the Tukey median.
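For intuition on the geometric notion involved, here is a hedged sketch of the classical Tukey depth of a point, approximated with random one-dimensional projections. This generic heuristic is for illustration only; it is not the paper's algorithm, and sampling finitely many directions only upper-bounds the true depth.

```python
# Sketch: Tukey depth of a point p w.r.t. a point cloud, approximated by
# minimizing, over random unit directions, the fraction of points lying on
# either side of the hyperplane through p. Illustrative only.
import numpy as np

def approx_tukey_depth(p, points, n_dirs=500, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    dirs = rng.normal(size=(n_dirs, points.shape[1]))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    proj = (points - p) @ dirs.T            # shape (n_points, n_dirs)
    frac_ge = (proj >= 0).mean(axis=0)      # mass on one side, per direction
    frac_le = (proj <= 0).mean(axis=0)      # mass on the other side
    return float(np.minimum(frac_ge, frac_le).min())

def approx_tukey_median(points, n_dirs=500, seed=0):
    """Return the sample point of (approximately) maximal Tukey depth."""
    rng = np.random.default_rng(seed)
    depths = [approx_tukey_depth(p, points, n_dirs, rng) for p in points]
    return points[int(np.argmax(depths))]

pts = np.random.default_rng(1).normal(size=(200, 2))
print(approx_tukey_median(pts))  # lands near the origin for a centered cloud
```

The median hypothesis plays the same role for a posterior over hypotheses that the deepest point above plays for a point cloud: it is a single, centrally located representative of the distribution.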