Abstract

Binary codes are now widely used in multimedia applications to approximate high-dimensional features for practical similarity search, because they offer a highly compact data representation and efficient distance computation. While most hashing methods aim at learning more accurate hash codes, only a few address indexing methods that accelerate search over binary code databases. Most of these indexing methods suffer from extremely high memory cost or extensive Hamming distance computations. In this paper, we propose a new Hamming distance search scheme for large-scale binary code databases that returns exact results without performing any Hamming distance computations. Because database binary codes need not be compared with queries, search performance improves and the databases can be maintained externally. More specifically, we adopt the inverted multi-index data structure to index binary codes. Importantly, the designed search scheme exploits the Hamming distance information embedded in this structure, so that verifying exact results no longer relies on Hamming distance computations. Furthermore, we optimize the performance of the inverted multi-index structure by taking the distribution of codes across different bits into account during index construction. Empirical results on large-scale binary code databases demonstrate the superiority of our method over existing approaches in terms of both memory usage and search efficiency.
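The idea of reading distances off the index structure, rather than computing them against stored codes, can be illustrated with a minimal Python sketch. It is not the scheme proposed in this paper and omits the bit-distribution optimization. It assumes codes are stored as integers of `n_bits` bits, that `n_bits` is divisible by the number of substrings `m`, and all function names are hypothetical. Each cell of the multi-index is keyed by a tuple of substrings. The search generates cell keys by flipping bits of the query substrings, so the distance of every code in a visited cell equals the number of flipped bits.

```python
from itertools import combinations


def split_code(code, n_bits, m):
    """Split an n_bits-wide integer into m equal-width substrings, most significant first."""
    width = n_bits // m
    mask = (1 << width) - 1
    return tuple((code >> (width * (m - 1 - i))) & mask for i in range(m))


def build_index(codes, n_bits, m):
    """Map each tuple of substrings (one multi-index cell) to the ids stored in it."""
    index = {}
    for item_id, code in enumerate(codes):
        index.setdefault(split_code(code, n_bits, m), []).append(item_id)
    return index


def neighbours_at(value, width, d):
    """Yield every width-bit value that differs from `value` in exactly d bits."""
    for positions in combinations(range(width), d):
        flipped = value
        for p in positions:
            flipped ^= 1 << p
        yield flipped


def search(index, query, n_bits, m, radius):
    """Return {id: distance} for all indexed codes within `radius` of `query`.

    The distance is the total number of bits flipped to generate the cell key,
    so no stored code is ever compared with the query.
    """
    width = n_bits // m
    q_parts = split_code(query, n_bits, m)
    results = {}

    def visit(level, prefix, used):
        if level == m:
            # Every id in this cell is exactly `used` bits away from the query.
            for item_id in index.get(prefix, ()):
                results[item_id] = used
            return
        # Spend d of the remaining flip budget on this substring.
        for d in range(radius - used + 1):
            for key in neighbours_at(q_parts[level], width, d):
                visit(level + 1, prefix + (key,), used + d)

    visit(0, (), 0)
    return results
```

Calling `search(build_index(codes, 8, 2), query, 8, 2, radius)` on a list of 8-bit codes returns the same ids and distances as a brute-force scan. The number of generated keys grows quickly with the radius, which is the cost that a practical scheme has to control.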