What is vector search?
An introduction to vector search/nearest neighbors
Vector search is the process of finding the vectors most similar to a given query vector (also known as nearest neighbor search or similarity search).
Although we initially used the analogy of vectors as fingerprints in our introduction, vectors actually have additional properties that allow them to be useful in practical applications. These include:
Similar data have similar vectors.
You can measure the similarity of these vectors mathematically in a number of different ways, such as cosine similarity or Euclidean distance.
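To make the two properties above concrete, here is a minimal sketch of two common similarity measures. The three-dimensional vectors and the names `cat`, `kitten` and `car` are invented for illustration; real encoders typically produce vectors with hundreds of dimensions.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cosine_similarity(a, b):
    """Cosine of the angle between a and b (1.0 means same direction)."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))


def euclidean_distance(a, b):
    """Straight-line distance between a and b (0.0 means identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical vectors: assume an encoder placed "cat" and "kitten"
# close together and "car" far away from both.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
car = [0.0, 0.1, 0.9]

print(cosine_similarity(cat, kitten), cosine_similarity(cat, car))
print(euclidean_distance(cat, kitten), euclidean_distance(cat, car))
```

Note that the two measures point in opposite directions: a higher cosine similarity means more similar, while a lower Euclidean distance means more similar.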
If you can find similar vectors based on the data, you can link data in ways people may never have considered. In language, you can link sentences based on semantics rather than relying on co-occurrences of words (as traditional word search does). Similarly, for images, you can build reverse image search and personalised image search, allowing for better recommendations for searches. If you are interested in vector search applications, we have listed a few below.
Vector search relies largely on index libraries and encoders. Each vector search guide follows a template of:
The steps/materials can be summarised in the following way:
Data: Obtain the data in a form that can be converted into a numerical representation and fed through the necessary model.
Encode: Feed the data's numerical representation into a model and extract the vector, which can then be indexed and searched.
Index: Index the vectors (into which the data has been encoded) in an efficient way that allows for retrieval.
Search: Search the indexed vectors using a variety of nearest neighbor algorithms, filters, chunking and queries.
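The four steps can be sketched end to end in a few lines. This is only an illustration under loud assumptions: `encode` is a hypothetical toy encoder that counts words from a fixed vocabulary (a real system would use a trained model that captures semantics), and the "index" is a plain Python list rather than an index library.

```python
import math

# Data: raw text that can be turned into a numerical representation.
documents = [
    "red running shoes",
    "blue running shoes",
    "stainless steel kettle",
]

# Encode: hypothetical toy encoder (word counts over a fixed vocabulary).
# It stands in for a real model and does NOT capture semantics.
vocabulary = sorted({word for doc in documents for word in doc.split()})


def encode(text):
    words = text.lower().split()
    return [float(words.count(term)) for term in vocabulary]


def cosine(a, b):
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


# Index: store each document alongside its vector.
index = [(doc, encode(doc)) for doc in documents]


# Search: score every indexed vector against the query and keep the top k.
def search(query, k=2):
    query_vector = encode(query)
    ranked = sorted(index, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


results = search("red shoes", k=2)
print(results)
```

Scanning every vector like this is exact but slow on large collections, which is one reason index libraries are used in practice.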
While the above process appears simple, there are many difficulties with actually using vector search in production. These difficulties include:
Deploying your index and search for production
Usage of vectors to optimise search results
Optimising the way search is being done on the vectors
Optimising the matching of user intent and products
The most commonly used algorithms are called nearest neighbor algorithms. You can read more about them.
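As a rough illustration of the idea, the following is a minimal sketch of exact (brute-force) k-nearest-neighbor search using Euclidean distance. The function name `k_nearest_neighbors` and the sample vectors are assumptions made for this example; production systems usually rely on approximate nearest neighbor algorithms, which trade some accuracy for speed.

```python
import heapq
import math


def euclidean_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_nearest_neighbors(query, vectors, k):
    """Return the ids of the k stored vectors closest to the query.

    `vectors` maps an id to its vector; every vector is compared
    against the query, so the cost grows linearly with the collection.
    """
    return heapq.nsmallest(
        k, vectors, key=lambda vid: euclidean_distance(query, vectors[vid])
    )


# Hypothetical two-dimensional vectors keyed by id.
vectors = {
    "a": [0.0, 0.0],
    "b": [1.0, 1.0],
    "c": [5.0, 5.0],
    "d": [0.9, 1.2],
}

neighbors = k_nearest_neighbors([1.0, 1.0], vectors, k=2)
print(neighbors)
```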