Bag-of-features methods have recently shown very good performance for image category classification. However, the representation they produce is orderless and based on appearance features only. In this talk we show how to integrate spatial information and how to add shape features.
First, we present a method for recognizing scene categories based on approximate global geometric correspondence. It works by partitioning the image into increasingly fine sub-regions and computing histograms of local features found inside each sub-region. The resulting “spatial pyramid” is a simple and computationally efficient extension of an orderless bag-of-features image representation, and it shows significantly improved performance on challenging scene categorization tasks.
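The construction of the spatial pyramid descriptor can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes features have already been quantized into visual words, and it omits the level-dependent weighting used by the pyramid match kernel, simply concatenating the per-cell histograms.

```python
import numpy as np

def spatial_pyramid(points, labels, num_words, width, height, levels=3):
    """Concatenate visual-word histograms over increasingly fine grids.

    points : (N, 2) array of (x, y) feature locations
    labels : (N,) array of visual-word indices in [0, num_words)
    Level l partitions the image into 2**l x 2**l cells; each cell
    contributes one histogram of the words that fall inside it.
    """
    parts = []
    for level in range(levels):
        cells = 2 ** level
        # Map each feature to its cell index at this pyramid level.
        cx = np.minimum((points[:, 0] * cells / width).astype(int), cells - 1)
        cy = np.minimum((points[:, 1] * cells / height).astype(int), cells - 1)
        for i in range(cells):
            for j in range(cells):
                in_cell = (cx == i) & (cy == j)
                parts.append(np.bincount(labels[in_cell], minlength=num_words))
    return np.concatenate(parts)
```

With three levels the descriptor stacks 1 + 4 + 16 = 21 histograms, so its length is 21 times the vocabulary size; the level-0 block is exactly the plain bag-of-features histogram.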
Second, we describe a method that exploits spatial relations between features using the object boundaries provided during supervised training. It increases the weights of features that agree on the position and shape of the object and suppresses the weights of background features. The resulting representation is thus richer and more robust to background clutter. Experimental results show that our approach improves over whole-image classification. Furthermore, we apply the spatial model to object localization.
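The core idea of weighting features by their agreement with the object can be sketched in a few lines. This is a simplified stand-in for the trained spatial model: it assumes a binary object mask derived from the training boundaries, and the weights `fg_weight` and `bg_weight` are illustrative constants rather than learned values.

```python
import numpy as np

def weighted_histogram(points, labels, num_words, mask,
                       fg_weight=1.0, bg_weight=0.1):
    """Bag-of-features histogram in which each feature's vote is scaled
    by whether it falls on the object or on the background.

    mask : (H, W) boolean array, True inside the object boundary.
    """
    xs = points[:, 0].astype(int)
    ys = points[:, 1].astype(int)
    # Boost features on the object, suppress background clutter.
    weights = np.where(mask[ys, xs], fg_weight, bg_weight)
    hist = np.zeros(num_words)
    np.add.at(hist, labels, weights)
    return hist
```

Compared with an unweighted histogram, visual words generated by background clutter contribute little, so the descriptor is dominated by the object itself.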
Third, a shape-based object detection technique is presented. It is based on pairs of connected contour segments, which are local features of intermediate complexity. Image windows are coarsely subdivided into tiles, each described by a bag of these features. After training a window classifier, novel object instances are localized via a multi-scale sliding-window mechanism. An extensive evaluation shows that the approach can successfully localize shape-based objects in cluttered scenes, while allowing for scale changes and intra-class variations.
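The multi-scale sliding-window mechanism can be sketched as below. This is a generic illustration under stated assumptions: `score_fn` stands in for the trained window classifier (in the actual method, applied to the tiled bags of contour-segment pairs), and the stride, scale step, and threshold are hypothetical parameters, not the ones used in the evaluation.

```python
def sliding_window_detect(score_fn, width, height, win_w, win_h,
                          stride=8, scale_step=1.25, threshold=0.0):
    """Scan windows of a fixed aspect ratio over the image at a range
    of scales; return (x, y, w, h, score) for windows scoring above
    the threshold.

    score_fn(x, y, w, h) -> float is the trained window classifier.
    """
    detections = []
    w, h = win_w, win_h
    while w <= width and h <= height:
        for y in range(0, height - h + 1, stride):
            for x in range(0, width - w + 1, stride):
                s = score_fn(x, y, w, h)
                if s > threshold:
                    detections.append((x, y, w, h, s))
        # Grow the window to handle larger object instances.
        w = int(w * scale_step)
        h = int(h * scale_step)
    return detections
```

In practice the overlapping detections returned by such a scan would be merged by non-maximum suppression before reporting object locations.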
This is joint work with V. Ferrari, F. Jurie, S. Lazebnik, M. Marszalek and J. Ponce.