Data-Driven Approach for Bridging the Cognitive Gap in Image Retrieval
Bridging the cognitive gap in image retrieval has been an active research direction in recent years. Existing solutions typically require a large volume of training data, which can be difficult to obtain in practice. In this paper, we propose a data-driven approach that bridges the cognitive gap by using Web images and their surrounding textual annotations as the source of training data. We construct an image thesaurus consisting of a set of codewords, each of which represents a semantically related subspace of the feature space. We also explore query expansion based on the constructed image thesaurus as a means of improving image retrieval performance.
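
To make the idea of thesaurus-based query expansion concrete, the following is a minimal illustrative sketch and not the method developed in this paper. It assumes, purely for illustration, that each codeword can be summarized by a centroid in the feature space together with a set of associated annotation terms. The names `Codeword`, `nearest_codewords`, `expand_query`, and the toy `THESAURUS` are hypothetical.

```python
from dataclasses import dataclass
from math import dist
from typing import FrozenSet, Iterable, List, Sequence, Set, Tuple


@dataclass(frozen=True)
class Codeword:
    """Hypothetical thesaurus entry (an assumption, not the paper's definition)."""
    centroid: Tuple[float, ...]  # representative point of the feature subspace
    terms: FrozenSet[str]        # annotation terms associated with that subspace


def nearest_codewords(feature: Sequence[float],
                      thesaurus: Iterable[Codeword],
                      k: int = 1) -> List[Codeword]:
    """Return the k codewords whose centroids are closest (Euclidean) to `feature`."""
    return sorted(thesaurus, key=lambda cw: dist(feature, cw.centroid))[:k]


def expand_query(terms: Iterable[str],
                 thesaurus: Iterable[Codeword]) -> Set[str]:
    """Add every term that shares a codeword with at least one original query term."""
    query = set(terms)
    expanded = set(query)
    for cw in thesaurus:
        if cw.terms & query:       # codeword is relevant to the original query
            expanded |= cw.terms   # pull in its co-occurring terms
    return expanded


# Toy two-dimensional thesaurus with made-up values, for illustration only.
THESAURUS = [
    Codeword((0.9, 0.1), frozenset({"sunset", "dusk"})),
    Codeword((0.1, 0.8), frozenset({"forest", "trees"})),
]

if __name__ == "__main__":
    print(sorted(expand_query({"sunset"}, THESAURUS)))
    print(nearest_codewords((0.8, 0.2), THESAURUS)[0].terms)
```

In this sketch a text query is expanded only through codewords that contain one of its original terms, so the expansion does not cascade across unrelated codewords. An image query could instead be mapped to its nearest codeword and expanded with that codeword's terms.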