Searching Personal Photos on the Phone with Instant Visual Query Suggestion and Joint Text-Image Hashing

ACM International Conference on Multimedia (ACM Multimedia)

The ubiquity of mobile devices has led to unprecedented growth of personal photo collections on the phone. One significant pain point for today's mobile users is instantly finding the specific photos they want. Existing applications (e.g., Google Photo and OneDrive) have predominantly focused on cloud-based solutions, while leaving client-side challenges (e.g., query formulation, photo tagging, and search) unsolved. This considerably hinders the user experience on the phone. In this paper, we present an innovative personal photo search system on the phone, which enables instant and accurate photo search through visual query suggestion and joint text-image hashing. Specifically, the system is characterized by several distinctive properties: 1) visual query suggestion (VQS) to facilitate the formulation of queries in a joint text-image form, 2) light-weight convolutional and sequential deep neural networks to extract representations for both photos and queries, and 3) joint text-image hashing (with compact binary codes) to facilitate binary image search and VQS. It is worth noting that all the components run on the phone with client optimization by deep learning techniques. We have collected 270 photo albums taken by 30 mobile users (corresponding to 37,000 personal photos) and conducted a series of field studies. We show that our system significantly outperforms existing client-based solutions by 10× in terms of search efficiency and achieves 92.3% precision in terms of search accuracy, leading to a remarkably better user experience of photo discovery on the phone.
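
To make the idea of search over compact binary codes concrete, the following is a minimal sketch of the retrieval step only, not the system's actual implementation. It assumes that photos and queries have already been mapped to fixed-length binary codes (in the paper this is done by learned deep networks, which are not reproduced here) and ranks photos by Hamming distance to the query code. The function names, file names, and 8-bit codes are all hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Tuple


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two binary codes stored as ints."""
    return bin(a ^ b).count("1")


def rank_photos(
    query_code: int, photo_codes: Dict[str, int], top_k: int = 3
) -> List[Tuple[str, int]]:
    """Return the top_k (photo_id, distance) pairs closest to the query code."""
    scored = [
        (photo_id, hamming_distance(query_code, code))
        for photo_id, code in photo_codes.items()
    ]
    # Sort by distance first, then by photo id so ties are deterministic.
    scored.sort(key=lambda item: (item[1], item[0]))
    return scored[:top_k]


# Hypothetical 8-bit codes; a real system would use longer, learned codes.
photo_codes = {
    "beach.jpg": 0b10110010,
    "dog.jpg": 0b01001101,
    "sunset.jpg": 0b10110110,
}
query_code = 0b10110011  # assumed code for a joint text-image query

results = rank_photos(query_code, photo_codes, top_k=2)
print(results)
```

Because each comparison reduces to an XOR followed by a bit count, this style of lookup is cheap in both memory and computation, which is one reason binary codes suit on-device search.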