Abstract

Many practical problems in pattern recognition require making inferences using multiple modalities, e.g. sensor data from video, audio, and physiological changes. In real-world scenarios the training data is often incomplete. Channels can be missing due to sensor failures in multi-sensory data, and many data points in the training set might be unlabeled. Further, instead of exact labels we might have only easy-to-obtain coarse labels that correlate with the task. There can also be labeling errors; for example, human annotation can lead to incorrect labels in the training data.

The discriminative paradigm of classification aims to model the classification boundary directly by conditioning on the data points; however, discriminative models cannot easily handle incompleteness since the distribution of the observations is never explicitly modeled. We present a unified Bayesian framework that extends the discriminative paradigm to handle four different kinds of incompleteness. First, a solution based on a mixture of Gaussian processes is proposed for achieving sensor fusion under the problematic conditions of missing channels. Second, the framework addresses incompleteness resulting from partially labeled data using input-dependent regularization. Third, we introduce the located hidden random field (LHRF), which learns finer-level labels when only some easy-to-obtain coarse information is available. Finally, the proposed framework can handle incorrect labels, the fourth case of incompleteness. One advantage of the framework is that we can use different models for different kinds of label errors, providing a way to encode prior knowledge about the labeling process.

The proposed extensions are built on top of Gaussian process classification and result in a modular framework where each component is capable of handling different kinds of incompleteness. These modules can be combined in many ways, yielding many different algorithms within one unified framework. We demonstrate the effectiveness of the framework on a variety of problems such as multi-sensor affect recognition, image classification, and object detection and segmentation.

Thesis Supervisor: Rosalind W. Picard
Title: Professor of Media Arts and Sciences