Spatial Coding for Large-scale Partial-duplicate Image Search

  • Qi Tian | University of Texas at San Antonio (UTSA)

The bag-of-visual-words model is widely used in state-of-the-art large-scale image retrieval systems. It represents each image as a bag of visual words by quantizing local image descriptors to their closest visual words. However, feature quantization reduces the discriminative power of local features, which causes many false visual word matches. Recently, several geometric verification methods have been proposed to check the geometric consistency of matched features in a post-processing step. Although they improve retrieval precision, they are either too computationally expensive to ensure real-time response or limited to local verification. To address this dilemma, we propose a novel scheme, Spatial Coding, designed for large-scale partial-duplicate image retrieval. The spatial relationships among visual words are encoded in global region maps. Based on the region maps, we develop a spatial verification approach that efficiently detects false matches of local features and consequently improves retrieval performance greatly.
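The abstract does not spell out how the region maps are built, so the following is only a minimal sketch of the general idea under stated assumptions: each map is taken to be a binary matrix recording whether one matched feature lies to the right of (or above) another, and verification repeatedly discards the match that disagrees most with the rest. The function names `region_maps` and `spatial_verify` are hypothetical, and the sketch ignores rotation, which the approach in the talk is reported to handle.

```python
import numpy as np

def region_maps(points):
    """Binary relative-position maps for an (n, 2) array of (x, y) locations.

    xmap[i, j] is 1 when feature j is to the right of, or level with, feature i.
    ymap[i, j] is 1 when feature j is above, or level with, feature i.
    (Assumed encoding; the talk abstract does not define the maps.)
    """
    pts = np.asarray(points, dtype=float)
    x, y = pts[:, 0], pts[:, 1]
    xmap = (x[None, :] >= x[:, None]).astype(np.uint8)
    ymap = (y[None, :] >= y[:, None]).astype(np.uint8)
    return xmap, ymap

def spatial_verify(query_pts, db_pts):
    """Return a boolean mask of the matches kept after consistency checking.

    query_pts[k] and db_pts[k] are the locations of the k-th matched pair.
    """
    qx, qy = region_maps(query_pts)
    dx, dy = region_maps(db_pts)
    # An entry is True where the two images disagree on a relative position.
    vx = np.logical_xor(qx, dx)
    vy = np.logical_xor(qy, dy)
    keep = np.ones(len(vx), dtype=bool)
    while keep.any():
        # Count each kept match's disagreements with the other kept matches.
        active = np.outer(keep, keep)
        conflicts = (vx & active).sum(axis=1) + (vy & active).sum(axis=1)
        worst = int(np.argmax(conflicts))
        if conflicts[worst] == 0:
            break  # the remaining matches are mutually consistent
        keep[worst] = False
    return keep

# Four matches follow a scale-and-shift; the fifth is deliberately misplaced.
query = np.array([[10, 10], [20, 15], [30, 40], [40, 35], [25, 5]])
database = query * 2 + 5
database[4] = [100, 100]
print(spatial_verify(query, database).tolist())
# [True, True, True, True, False]
```

Because the maps store only relative order, a uniform scale change or translation leaves them unchanged, which is why the four consistent matches survive while the misplaced one is removed.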

Experiments in partial-duplicate image retrieval, using a database of one million images from Image-Net, show that our approach can effectively detect duplicate images under rotation, scale changes, occlusion, and background clutter at very low computational cost. Spatial coding achieves a 53% improvement in mean average precision and a 46% reduction in time cost over the baseline Bag-of-Visual-Words approach. It performs even better than full geometric verification while being much less computationally expensive. Our demo on a 10-million-image dataset further demonstrates the scalability of our approach.

Speaker Details

Qi Tian is currently an Associate Professor in the Department of Computer Science at the University of Texas at San Antonio (UTSA). During 2008-2009, he took a one-year faculty leave at Microsoft Research Asia (MSRA) in the Media Computing Group. He received his Ph.D. in ECE from the University of Illinois at Urbana-Champaign (UIUC). Dr. Tian’s research interests focus on multimedia information retrieval, and he has published over 150 refereed journal and conference papers. He was a co-author of a Top 10% Paper Award in MMSP 2011, a Best Student Paper in ICASSP 2006, and a Best Paper Candidate in PCM 2007. His research projects are funded by NSF, ARO, DHS, Google, FXPAL, NEC, SALSI, CIAS, Akiira Media Systems, HP, and UTSA. He received the 2010 ACM Service Award. He is a Guest Editor of IEEE Transactions on Multimedia, Journal of Computer Vision and Image Understanding, etc., an associate editor of IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), and a member of the editorial boards of Journal of Multimedia (JMM) and Journal of Machine Vision and Applications (MVA).

    • Jeff Running

    • Qiang Tian

Series: Microsoft Research Talks