The dataset, named Clickture, was sampled from the one-year click log of a commercial image search engine. It consists of a big table with 212.3 million triads: Clickture = {〈K,Q,C〉}.
Through users’ click actions during image search, the query Q in a triad is linked to the image K. In general, the larger the click count C, the higher the probability that the corresponding query is relevant to the image. For convenience, we call Q a “clicked query” of image K, K a “clicked image” of query Q, 〈K,Q〉 a “clicked image-query pair”, and the triad 〈K,Q,C〉 “click data”. We also call the “clicked queries” of an image the “labels” of that image.
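As an illustration of this terminology, the following is a minimal sketch of how click data could be held in memory and grouped into per-image labels. It assumes a tab-separated layout of one triad per line (K, then Q, then C), which is an assumption made for the example and not a statement about the released files. The names ClickRecord, parse_click_data and labels_by_image are hypothetical, and the sample rows are made up.

```python
from collections import defaultdict
from typing import NamedTuple


class ClickRecord(NamedTuple):
    image_key: str    # K: identifier of the clicked image
    query: str        # Q: the clicked query text
    click_count: int  # C: number of clicks linking Q to K


def parse_click_data(lines):
    """Parse 'K<TAB>Q<TAB>C' lines (an assumed layout) into ClickRecord triads."""
    records = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        image_key, query, count = line.split("\t")
        records.append(ClickRecord(image_key, query, int(count)))
    return records


def labels_by_image(records):
    """Map each image key to its clicked queries, highest click count first."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec.image_key].append((rec.query, rec.click_count))
    return {
        key: sorted(pairs, key=lambda pair: pair[1], reverse=True)
        for key, pairs in grouped.items()
    }


# Made-up sample rows for illustration only; not taken from the dataset.
sample = [
    "img001\tsan francisco bridge\t3",
    "img001\tgolden gate bridge\t12",
    "img002\tgolden gate bridge\t5",
]
records = parse_click_data(sample)
labels = labels_by_image(records)
```

Sorting each image's labels by click count reflects the observation above that a larger C suggests a more relevant query. A real pipeline over hundreds of millions of triads would likely stream the file instead of building a list in memory.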
To enable the use of Clickture by a wide range of research organizations and individuals with different computing, networking, storage and programming capacities, a subset of Clickture (1 million images and 11.7 million queries) is provided. We call this subset Clickture-Lite and the full 40M-image dataset Clickture-Full (or, in brief, Clickture). The 1M images in Clickture-Lite are randomly sampled from the 40M-image dataset (based on click frequency).
Related Events
- ACM Multimedia Grand Challenge 2014 (Based on Clickture-Lite and optionally Clickture-Full)
- ICME Grand Challenge 2014 (Based on Clickture-Lite)
- MSR-Bing Image Retrieval Grand Challenge 2013 (Based on Clickture-Lite)