The need

The challenge of search is indexing billions of entries while still returning relevant results as quickly as possible. Most search systems rely on an inverted index, which depends on keyword matching backed by substantial engineering and infrastructure.
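
To make the keyword-matching approach concrete, here is a minimal sketch of an inverted index. The documents, function names and whitespace tokenisation are illustrative assumptions, not part of any production search system.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each lowercase token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

def keyword_search(index, query):
    """Return the ids of documents that contain every token in the query."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return result

# Hypothetical toy documents.
docs = {
    1: "small dog playing in the park",
    2: "puppy running on grass",
    3: "dog food recipes",
}
index = build_inverted_index(docs)
print(sorted(keyword_search(index, "dog")))    # [1, 3]
print(sorted(keyword_search(index, "puppy")))  # [2]
```

A query for "dog" never reaches document 2, even though "puppy" is closely related, because matching is purely on shared keywords.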

The idea

Use vectors to improve search. Deep learning models represent data as vectors, where the distance between two vectors reflects how similar the underlying items are. Approximate nearest neighbour (ANN) algorithms search billions of vectors, returning results in milliseconds.
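
As a small illustration of distance reflecting similarity, the sketch below compares hand-written toy vectors with cosine similarity. The three-dimensional values are made up for the example; real embeddings come from a trained model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical embeddings, chosen by hand for illustration only.
embeddings = {
    "dog":   [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.9, 0.15],
    "car":   [0.1, 0.2, 0.95],
}

def most_similar(query_name, embeddings):
    """Exact (brute-force) nearest neighbour of one named item."""
    query = embeddings[query_name]
    others = (name for name in embeddings if name != query_name)
    return max(others, key=lambda name: cosine_similarity(query, embeddings[name]))

print(most_similar("dog", embeddings))  # puppy
```

This exact comparison against every vector is what ANN algorithms avoid at scale by examining only a small, well-chosen subset of candidates.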

The solution

Vector search can be applied across many applications, such as text search, multimedia and image search, recommendations, and more. The code can be incorporated into your own applications to harness deep learning insights at scale.

Technical details for Bing vector search

Vector search uses deep learning models to encode data sets into meaningful vector representations, where the distance between vectors represents the similarity between items. We then use approximate nearest neighbour (ANN) search algorithms to build vector indexes that allow us to search through billions of vectors and return the most related results in just a couple of milliseconds.
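
The general idea behind an ANN index can be sketched with a deliberately simple partitioning scheme: cluster the vectors with k-means, then search only the few clusters whose centroids are closest to the query. This is an illustrative stand-in, not the algorithm Bing uses; the function names, the parameters `k` and `n_probe`, and the random data are all assumptions made for the example.

```python
import random

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_centroid(v, centroids):
    return min(range(len(centroids)), key=lambda c: squared_distance(v, centroids[c]))

def kmeans(vectors, k, iterations=10, seed=0):
    """Plain k-means; an empty cluster simply keeps its previous centroid."""
    rng = random.Random(seed)
    centroids = [list(v) for v in rng.sample(vectors, k)]
    dim = len(vectors[0])
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            clusters[nearest_centroid(v, centroids)].append(v)
        for c, members in enumerate(clusters):
            if members:
                centroids[c] = [sum(m[d] for m in members) / len(members) for d in range(dim)]
    return centroids

def build_index(vectors, k):
    """Return (centroids, buckets), where buckets[c] lists vector ids in cluster c."""
    centroids = kmeans(vectors, k)
    buckets = [[] for _ in range(k)]
    for idx, v in enumerate(vectors):
        buckets[nearest_centroid(v, centroids)].append(idx)
    return centroids, buckets

def ann_search(query, vectors, index, top_k=3, n_probe=2):
    """Scan only the n_probe clusters closest to the query, then rank candidates."""
    centroids, buckets = index
    order = sorted(range(len(centroids)), key=lambda c: squared_distance(query, centroids[c]))
    candidates = [i for c in order[:n_probe] for i in buckets[c]]
    candidates.sort(key=lambda i: squared_distance(query, vectors[i]))
    return candidates[:top_k]

# Hypothetical data: 200 random 8-dimensional vectors.
rng = random.Random(42)
vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
index = build_index(vectors, k=8)
query = vectors[0]
print(ann_search(query, vectors, index, top_k=3, n_probe=2))
```

Raising `n_probe` scans more clusters, trading speed for recall; with `n_probe` equal to `k` the search becomes exact. Production indexes use far more elaborate structures, but the speed-versus-accuracy trade-off is the same.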


Vector search can be applied across a multitude of applications, such as web text search, multimedia and image search, recommendations and many more. As an example, we’ve used vector search to power an image similarity search application across multiple image databases, such as animals, cats and dogs.

We first used a PyTorch pre-trained deep learning model to encode open source data sets (such as Stanford Dogs, Oxford Flowers, etc.) into vectors. We then used the Space Partition Tree and Graph (SPTAG) algorithm to generate an approximate nearest neighbour (ANN) vector index using k-means balanced trees and nearest neighbourhood graphs.

When an input picture comes in, our application first uses the PyTorch model to translate the image into a vector. The SPTAG algorithm then uses this query vector to find the most related vectors in a couple of milliseconds. The returned vectors are mapped back to their corresponding images, which are presented as the ‘most related’ results.
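
The flow described above (encode the data set, index the vectors, encode the query, search, map results back to images) can be sketched end to end. Everything here is a placeholder assumption: `encode_image` averages RGB channels in place of the PyTorch model, a brute-force scan stands in for the SPTAG index, and the image names and pixel values are invented.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def encode_image(pixels):
    """Hypothetical stand-in for the deep learning encoder: mean of each RGB channel."""
    n = len(pixels)
    return [sum(p[c] for p in pixels) / n for c in range(3)]

class ImageSearch:
    """Keeps image names and their vectors side by side so results map back to images."""

    def __init__(self):
        self.names = []
        self.vectors = []

    def add(self, name, pixels):
        # Indexing step: encode once, store the vector alongside the image name.
        self.names.append(name)
        self.vectors.append(encode_image(pixels))

    def query(self, pixels, top_k=2):
        # Query step: encode the input, rank stored vectors, return image names.
        # Brute-force ranking is used here in place of an ANN index.
        q = encode_image(pixels)
        ranked = sorted(range(len(self.vectors)),
                        key=lambda i: squared_distance(q, self.vectors[i]))
        return [self.names[i] for i in ranked[:top_k]]

# Hypothetical catalogue: each image is a short list of (R, G, B) pixels.
search = ImageSearch()
search.add("brown_dog.jpg", [(120, 80, 40), (130, 90, 50)])
search.add("tan_dog.jpg", [(180, 140, 90), (170, 130, 80)])
search.add("red_flower.jpg", [(220, 30, 40), (200, 20, 30)])

print(search.query([(125, 85, 45)]))  # ['brown_dog.jpg', 'tan_dog.jpg']
```

Swapping in a real encoder and an ANN index changes only `encode_image` and the ranking inside `query`; the bookkeeping that ties vector ids back to images stays the same.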

This image similarity search is just one of many possible applications of vector search. Try incorporating vector search into your own applications to harness deep learning insights at scale.
