Decision Forests for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning

  • Antonio Criminisi ,
  • Ender Konukoglu ,
  • Jamie Shotton

MSR-TR-2011-114 |

This paper presents a unified, efficient model of random decision forests which can be applied to a number of machine learning, computer vision and medical image analysis tasks.

Our model extends existing forest-based techniques as it unifies classification, regression, density estimation, manifold learning, semi-supervised learning and active learning under the same decision forest framework. This means that the core implementation needs be written and optimized only once, and can then be applied to many diverse tasks. The proposed model may be used both in a generative or discriminative way and may be applied to discrete or continuous, labelled or unlabelled data.

The main contributions of this paper are: 1) proposing a single, probabilistic and efficient model for a variety of learning tasks; 2) demonstrating margin-maximizing properties of classification forests; 3) introducing density forests for learning accurate probability density functions; 4) proposing efficient algorithms for sampling from the forest generative model; 5) introducing manifold forests for non-linear embedding and dimensionality reduction; 6) proposing new and efficient forest-based algorithms for transductive and active learning. We discuss how alternatives such as random ferns and extremely randomized trees stem from our more general model.

This paper is directed at both students who wish to learn the basics of decision forests, as well as researchers interested in our new contributions. It presents both fundamental and novel concepts in a structured way, with many illustrative examples and real-world applications. Thorough comparisons with state of the art algorithms such as support vector machines, boosting and Gaussian processes are presented and relative advantages and disadvantages discussed.The many synthetic examples and existing commercial applications demonstrate the validity of the proposed model and its flexibility.

Powerpoint slides (with many examples and animations) are also available from