A prominent approach in collaborative-filtering-based recommender systems is to use dimensionality reduction (matrix factorization) techniques to map users and items into low-dimensional vectors. In such systems, a higher inner product between a user vector and an item vector indicates that the item better suits the user’s preference. Traditionally, the most suitable items are retrieved by scoring and sorting all items. Real-world online recommender systems must adhere to strict response-time constraints, so when the number of items is large, scoring all items becomes prohibitively slow.
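To make the baseline concrete, the exhaustive "score and sort" retrieval can be sketched as follows. This is a minimal illustration, not the production system: the function name `top_k_by_inner_product` and the toy vectors are hypothetical, and NumPy is assumed as the numerical library.

```python
import numpy as np

def top_k_by_inner_product(user_vec, item_vecs, k):
    """Score every item against the user and return the k best (exhaustive search)."""
    # One inner product per item: cost grows linearly with the catalog size.
    scores = item_vecs @ user_vec
    # Sort by descending score; a stable sort keeps ties in index order.
    order = np.argsort(-scores, kind="stable")
    top = order[:k]
    return top, scores[top]

# Toy data (illustrative only): four items and one user in a 2-dimensional latent space.
items = np.array([[1.0, 0.0],
                  [0.5, 0.5],
                  [0.0, 2.0],
                  [-1.0, 1.0]])
user = np.array([1.0, 1.0])

top_ids, top_scores = top_k_by_inner_product(user, items, k=2)
```

Every query touches every item, which is the cost that faster retrieval structures aim to avoid.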
We propose a novel order-preserving transformation that maps the maximum inner product search problem to a Euclidean-space nearest neighbor search problem. Using this transformation, we study the efficiency of several (approximate) nearest neighbor data structures. Our final solution is based on a novel use of the PCA-Tree data structure in which results are augmented using paths one Hamming distance away from the query (neighborhood boosting). The end result is a system that allows approximate matches (items with a relatively high inner product, but not necessarily the highest one). We evaluate our techniques on two large-scale recommendation datasets, Xbox Movies and Yahoo Music, and show that this technique allows trading off a slight degradation in recommendation quality for a significant improvement in retrieval time.
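The abstract does not spell out the transformation, so the sketch below uses one standard reduction of this kind and should be read as an assumption rather than as the authors' exact construction. Each item vector $y$ gains one extra coordinate $\sqrt{\phi^2 - \lVert y \rVert^2}$, where $\phi$ is the largest item norm, and the user vector $x$ gains a zero. The squared distance between the transformed vectors is then $\lVert x \rVert^2 + \phi^2 - 2\,x \cdot y$, so the nearest transformed item is the item with the largest inner product. The helper names `transform_items` and `transform_user` and the random data are hypothetical.

```python
import numpy as np

def transform_items(item_vecs):
    """Prepend one coordinate so that all transformed items share the same norm (phi)."""
    norms_sq = np.sum(item_vecs ** 2, axis=1)
    phi_sq = norms_sq.max()
    # Clamp at zero to guard against tiny negative values from rounding.
    extra = np.sqrt(np.maximum(phi_sq - norms_sq, 0.0))
    return np.hstack([extra[:, None], item_vecs])

def transform_user(user_vec):
    """Prepend a zero so the extra item coordinate does not affect the inner product."""
    return np.concatenate([[0.0], user_vec])

# Synthetic data (illustrative only): 1000 items and one user in 16 dimensions.
rng = np.random.default_rng(0)
items = rng.normal(size=(1000, 16))
user = rng.normal(size=16)

# Maximum inner product in the original space ...
best_by_inner_product = int(np.argmax(items @ user))

# ... versus nearest neighbor in the transformed Euclidean space.
dists = np.linalg.norm(transform_items(items) - transform_user(user), axis=1)
best_by_distance = int(np.argmin(dists))
```

Under this assumed reduction both searches select the same item, so any Euclidean nearest neighbor structure (exact or approximate) can be applied to the transformed vectors.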