State-of-the-art image retrieval systems achieve scala-
bility by using bag-of-words representation and textual re-
trieval methods, but their performance degrades quickly in
the face image domain, mainly because they 1) produce vi-
sual words with low discriminative power for face images,
and 2) ignore the special properties of faces. The leading
features for face recognition can achieve good retrieval
performance, but these features are not suitable for inverted
indexing as they are high-dimensional and global, thus not
scalable in either computational or storage cost.

In this paper we aim to build a scalable face image re-
trieval system. For this purpose, we develop a new scal-
able face representation using both local and global fea-
tures. In the indexing stage, we exploit special proper-
ties of faces to design new component-based local features,
which are subsequently quantized into visual words using
a novel identity-based quantization scheme. We also use a
very small Hamming signature (40 bytes) to encode the discriminative
global feature for each face. In the retrieval
stage, candidate images are first retrieved from the inverted
index of visual words. We then use a new multi-reference
distance to re-rank the candidate images using
the Hamming signature. On a one-million face database,
we show that our local features and global Hamming signatures
are complementary: the inverted index based on local
features provides candidate images with good recall, while
the multi-reference re-ranking with the global Hamming signature
leads to good precision. As a result, our system is not
only scalable but also outperforms the linear scan retrieval
system using the state-of-the-art face recognition feature in
terms of quality.
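
The two-stage pipeline summarized above can be illustrated with a minimal sketch. It is not the system's implementation: the function names, the vote-count scoring of candidates, and the particular multi-reference score (the mean Hamming distance to the query and to a few candidates nearest the query) are all assumptions made for illustration, and signatures are stored as plain Python integers rather than packed 40-byte codes.

```python
from collections import defaultdict


def build_inverted_index(db_words):
    """Map each visual word id to the set of image ids that contain it."""
    index = defaultdict(set)
    for image_id, words in db_words.items():
        for word in words:
            index[word].add(image_id)
    return index


def retrieve_candidates(index, query_words, top_k):
    """Stage 1 (recall): rank images by the number of visual words shared
    with the query. Vote counting is an assumed, simplified scoring rule."""
    votes = defaultdict(int)
    for word in set(query_words):
        for image_id in index.get(word, ()):
            votes[image_id] += 1
    ranked = sorted(votes, key=lambda i: (-votes[i], i))
    return ranked[:top_k]


def hamming(a, b):
    """Number of differing bits between two integer-encoded signatures."""
    return bin(a ^ b).count("1")


def multi_reference_rerank(query_sig, candidates, signatures, num_refs=2):
    """Stage 2 (precision): re-rank candidates by their mean Hamming distance
    to a reference set. Assumption: the reference set is the query plus the
    num_refs candidates closest to the query."""
    by_query = sorted(
        candidates, key=lambda i: (hamming(query_sig, signatures[i]), i)
    )
    refs = [query_sig] + [signatures[i] for i in by_query[:num_refs]]

    def score(image_id):
        sig = signatures[image_id]
        return sum(hamming(ref, sig) for ref in refs) / len(refs)

    return sorted(candidates, key=lambda i: (score(i), i))
```

In this sketch the inverted index touches only images that share at least one visual word with the query, so the comparatively costly signature comparison runs over a short candidate list instead of the whole database.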