Locality-Sensitive Hashing (LSH) is a powerful technique for the approximate nearest neighbor search (ANN) in high dimensions.
In this talk I will present two recent results.
- I will show a data structure for ANN for the Euclidean distance that provably outperforms the best possible LSH-based data structure. We proceed via designing a good data-dependent hash family.
- I will show a practical and optimal LSH family for the cosine similarity (a.k.a. Euclidean distance on a sphere). It substantially outperforms the celebrated Hyperplane LSH family. Along the way, I will try to debunk two popular myths about LSH: – LSH-based data structures consume too much memory and are thus impractical; – Optimal LSH constructions are too complicated to be made practical.
The talk is based on two papers: arXiv:1501.01062 (joint with Alexandr Andoni, STOC 2015) and arXiv:1509.02897 (joint with Alexandr Andoni, Piotr Indyk, Thijs Laarhoven and Ludwig Schmidt, NIPS 2015).