Decision Forests: A Unified Framework for Classification, Regression, Density Estimation, Manifold Learning and Semi-Supervised Learning

in Foundations and Trends® in Computer Graphics and Vision

Published by NOW Publishers | 2012, Vol 7 | Foundations and Trends® in Computer Graphics and Vision: Vol. 7: No 2-3, pp 81-227 edition

This review presents a unified, efficient model of random decision forests which can be applied to a number of machine learning, computer vision, and medical image analysis tasks. Our model extends existing forest-based techniques as it unifies classification, regression, density estimation, manifold learning, semi-supervised learning, and active learning under the same decision forest framework.This gives us the opportunity to write and optimize the core implementation only once, with application to many diverse tasks.

The proposed model may be used both in a discriminative or generative way and may be applied to discrete or continuous, labeled or unlabeled data. The main contributions of this review are: (1) Proposing a unified, probabilistic and efficient model for a variety of learning tasks; (2) Demonstrating margin-maximizing properties of classification forests; (3) Discussing probabilistic regression forests in comparison with other nonlinear regression algorithms; (4) Introducing density forests for estimating probability density functions; (5) Proposing an efficient algorithm for sampling from a density forest; (6) Introducing manifold forests for nonlinear dimensionality reduction; (7) Proposing new algorithms for transductive learning and active learning.

Finally, we discuss how alternatives such as random ferns and extremely randomized trees stem from our more general forest model. This document is directed at both students who wish to learn the basics of decision forests, as well as researchers interested in the new contributions. It presents both fundamental and novel concepts in a structured way, with many illustrative examples and real-world applications. Thorough comparisons with state-of-the-art algorithms such as support vector machines, boosting and Gaussian processes are presented and relative advantages and disadvantages discussed. The many synthetic examples and existing commercial applications demonstrate the validity of the proposed model and its flexibility