Joint work with Navneet Dalal and Ankur Agarwal.
I’ll give an overview of some of our work on human detection and human motion estimation. Regarding human detection, we have developed several feature sets for visual recognition based on grids of local Histograms of Oriented Gradients (HOG) extracted from still or moving images. We use these SIFT-like features to construct object detectors based on multi-scale, generalized-template-like matching, trained with a linear SVM. Although simple, this approach turns out to work surprisingly well for human (pedestrian) detection in both still images and video sequences. I’ll also discuss some of the implementation details that help to improve the overall performance.
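As a rough illustration of the descriptor idea only, the sketch below computes a gradient-magnitude-weighted orientation histogram for a single cell, normalizes it, and scores it with a linear classifier. It is a minimal sketch under stated assumptions, not the detector discussed in the talk: the function names, the 9 unsigned orientation bins, the centred-difference gradients, the hard (non-interpolated) binning, and the weights passed to the scorer are all illustrative choices, and a real detector would concatenate many overlapping normalized blocks over a detection window.

```python
import math


def cell_orientation_histogram(cell, n_bins=9):
    """Histogram of unsigned gradient orientations (0-180 degrees) for one cell.

    `cell` is a list of rows of grayscale values. Gradients are centred
    differences on interior pixels; each pixel votes with its gradient
    magnitude into a single bin (no interpolation between bins).
    """
    hist = [0.0] * n_bins
    bin_width = 180.0 / n_bins
    for y in range(1, len(cell) - 1):
        for x in range(1, len(cell[0]) - 1):
            gx = cell[y][x + 1] - cell[y][x - 1]
            gy = cell[y + 1][x] - cell[y - 1][x]
            magnitude = math.hypot(gx, gy)
            # Fold the signed angle into [0, 180) so opposite directions agree.
            angle = math.degrees(math.atan2(gy, gx)) % 180.0
            hist[int(angle // bin_width) % n_bins] += magnitude
    return hist


def l2_normalize(vec, eps=1e-6):
    """Scale a vector to (approximately) unit L2 norm; eps avoids division by zero."""
    norm = math.sqrt(sum(v * v for v in vec) + eps * eps)
    return [v / norm for v in vec]


def linear_score(features, weights, bias):
    """Linear classifier response w . x + b (the weights would come from SVM training)."""
    return sum(f * w for f, w in zip(features, weights)) + bias


# A 4x4 cell whose intensity increases from left to right:
# every interior gradient points along x, so all votes land in the first bin.
cell = [[0, 0, 10, 10]] * 4
descriptor = l2_normalize(cell_orientation_histogram(cell))
```

In a sliding-window setting, a response above some threshold at a given window position and scale would be treated as a candidate detection.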
Regarding human motion, we have focused mainly on recovering 3D body pose and motion from monocular images. In particular, we have developed methods that directly learn to predict 3D pose from primitive image descriptors such as silhouettes or image gradients. Although somewhat imprecise, these methods turn out to be simpler and more robust than existing model-based approaches.
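To make the idea of learning a direct mapping from descriptors to pose concrete, here is a minimal sketch that predicts a pose vector as a kernel-weighted average of training poses (Nadaraya-Watson regression). This is a simple stand-in chosen for brevity, not necessarily the regressor used in the work described; the function name, the Gaussian kernel, the bandwidth, and the toy descriptors and joint angles are all hypothetical.

```python
import math


def predict_pose(descriptor, train_descriptors, train_poses, bandwidth=1.0):
    """Predict a pose vector as a Gaussian-weighted average of training poses.

    Each training example pairs an image descriptor with a known pose vector
    (for example, joint angles). No explicit body model is fitted: the
    prediction depends only on descriptor similarity.
    """
    weights = []
    for d in train_descriptors:
        sq_dist = sum((a - b) ** 2 for a, b in zip(descriptor, d))
        weights.append(math.exp(-sq_dist / (2.0 * bandwidth ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("query is too far from all training descriptors")
    n_dims = len(train_poses[0])
    return [
        sum(w * pose[j] for w, pose in zip(weights, train_poses)) / total
        for j in range(n_dims)
    ]


# Toy data: two training descriptors with two-dimensional "poses".
train_descriptors = [[0.0, 0.0], [10.0, 10.0]]
train_poses = [[0.0, 1.0], [90.0, 2.0]]

# A query halfway between the two descriptors gets the average of their poses.
pose = predict_pose([5.0, 5.0], train_descriptors, train_poses, bandwidth=5.0)
```

A query that lies close to one training descriptor returns almost exactly that example's pose, which also illustrates why such predictors can be imprecise when the training set is sparse.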