Use Vector Search to build adaptive, user-focused applications with Generative AI.
About Vector Search
Vector Search is a technique for retrieving semantically similar items based on vector embedding representations of those items in a multi-dimensional space. You can use Vector Search to find the top N items most similar to a given item based on their vector representations. Vector Search is an essential component of Generative AI and Predictive AI applications.
Vector Search is a sophisticated data retrieval technique that matches the contextual meanings of search queries and data entries rather than relying on simple text matching. Vectors are arrays of numbers known as embeddings, which are generated by Large Language Models (LLMs) to represent objects such as text, images, and audio.
Once you choose the LLM you wish to integrate into your application, you can create vector indexes that store these embeddings for improved search performance and start querying against them.
Applications of Vector Search
You can use Vector Search to enhance your mobile and edge applications in a variety of use cases, including:
- Perform Semantic and Similarity Search on the Edge - Any offline-first mobile or IoT application can benefit from the rich semantic search capabilities offered by Vector Search to retrieve contextually relevant data and present it to users.
- Create Recommendation Engines - Vector Search enables the creation of advanced recommendation systems that analyze semantic similarities between items, user behavior, and preferences. This approach delivers personalized recommendations, improving user engagement and satisfaction.
- Enhance the contextual relevance of your applications with Retrieval Augmented Generation (RAG) - RAG is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources that are contextual to your application, such as an internal vector database. The search results are included as context data in queries sent to the LLM in order to customize query responses. As a result, the LLM returns a response that is likely to be far more contextually relevant than one produced from the standalone prompt. A sketch of this flow follows this list.
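The following is a minimal sketch of the RAG flow against a local vector index, assuming the Couchbase Lite Swift SDK with the Vector Search extension, a default collection whose documents carry `content` and `vector` fields, and hypothetical `generateEmbedding` and `askLLM` helpers standing in for your model calls; it is an illustration, not a definitive implementation.

```swift
import CouchbaseLiteSwift

// Hypothetical stand-ins for your embedding model and LLM calls.
func generateEmbedding(for text: String) -> [Float] { /* call your ML model */ [] }
func askLLM(prompt: String) -> String { /* call your LLM */ prompt }

// Retrieve semantically similar documents locally, then hand them to the
// LLM as grounding context.
func answerWithRAG(question: String, database: Database) throws -> String {
    let queryVector = generateEmbedding(for: question)

    // Rank documents by similarity to the question; assumes a vector
    // index exists on the `vector` field of the default collection.
    let query = try database.createQuery("""
        SELECT content
        FROM _default
        ORDER BY APPROX_VECTOR_DISTANCE(vector, $queryVector)
        LIMIT 5
        """)
    let params = Parameters()
    params.setValue(queryVector, forName: "queryVector")
    query.parameters = params

    // Concatenate the retrieved passages into the prompt as context.
    let context = try query.execute()
        .compactMap { $0.string(forKey: "content") }
        .joined(separator: "\n")

    return askLLM(prompt: "Context:\n\(context)\n\nQuestion: \(question)")
}
```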
Additionally, Vector Search in Couchbase Lite provides the following benefits:
- Unified Cloud-to-Edge Support for Vector Similarity Search - Couchbase supports Vector Search from cloud to edge, which enables applications to efficiently draw on the strengths of both cloud and edge computing.
- Enhanced Data Privacy on the Edge - By performing Vector Search on the device, sensitive personal data and search queries never have to leave the device.
- Low Latency Application Support - You can run searches locally against a local dataset using a locally embedded model. This eliminates network variability and results in more consistent execution speed. Even when the model is not embedded on the device itself but is deployed at an edge location, the round trip time (RTT) of queries can be significantly lower than for searches made over the Internet.
- Cost Per Query Reduction - When hundreds of thousands of connected clients query a cloud-based LLM, the load on the cloud model and the operational costs of running it can be considerably high. By running queries locally on the device, you can save on data transfer costs and cloud egress charges while also decentralizing the operational costs.
Key Concepts of Vector Search in Couchbase Lite
When working with Vector Search, you should be aware of the core concepts below.
About Vector Embeddings
Vector embeddings represent the output of a Machine Learning (ML) model as an array of numbers that captures semantic or contextual relationships between data points. This representation encodes how an ML model understands the inputs provided to it, based on how the model was trained and on its internal structure. When a model considers the features of two inputs similar, the distance between their vector embeddings is short. Vector embeddings are stored within embedded vector indexes.
The currently supported formats for vector embeddings are:
- An array of 32-bit floats.
- A Base64 string that encodes a little-endian array of 32-bit floats.
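As an illustration, the sketch below stores an embedding in both supported formats, assuming the Couchbase Lite Swift SDK; the document ID, field names, and the `embedding` values are hypothetical.

```swift
import CouchbaseLiteSwift
import Foundation

// Save one embedding in both supported formats.
// `embedding` stands in for the output of your ML model.
func saveEmbedding(_ embedding: [Float], in collection: Collection) throws {
    let doc = MutableDocument(id: "product::123")

    // Format 1: an array of 32-bit floats.
    doc.setValue(embedding, forKey: "vector")

    // Format 2: a Base64 string over the same floats in little-endian
    // byte order (the native order on Apple and x86 hardware).
    let bytes = embedding.withUnsafeBufferPointer { Data(buffer: $0) }
    doc.setString(bytes.base64EncodedString(), forKey: "vector_b64")

    try collection.save(document: doc)
}
```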
About Vector Indexes
Vector Indexes are used to store and manage vector representations of content in the form of vector embeddings. You can use Vector Indexes to efficiently retrieve vectors similar to a target vector. Before use, a Vector Index needs to be trained to compute the centroids and the parameters for encoding the vectors.
You can configure both the minimum and maximum training sizes for your vector index by setting the relevant parameters in your vector index configuration; for an example, see the sketch below.
Couchbase Lite Vector Search automatically initiates training on the first Vector Search query, once the number of vectors available for training satisfies the minimum-training-size configuration.
If the database does not contain the required number of vectors, an error message is logged stating how many vectors are required.
See Vector Index Configuration for more information about configurations you can modify for your vector index.
Be aware that vector index training can affect query performance. If a query is executed against the index before it is trained, a full scan of the vectors will be performed.
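The following is a minimal sketch of creating a vector index with explicit training sizes, assuming the Couchbase Lite Swift SDK with the Vector Search extension; the index name, the `vector` field, and the dimension, centroid, and training-size values are illustrative assumptions.

```swift
import CouchbaseLiteSwift

// Create a vector index with explicit training sizes.
func createVectorIndex(in collection: Collection) throws {
    var config = VectorIndexConfiguration(expression: "vector", // field holding the embedding
                                          dimensions: 300,
                                          centroids: 20)

    // Training starts automatically on the first vector query once at
    // least `minTrainingSize` vectors are available in the collection.
    config.minTrainingSize = 500
    config.maxTrainingSize = 5000

    try collection.createIndex(withName: "vector_index", config: config)
}
```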
About Lazy Vector Indexes
Lazy indexing is not an automatic process; you must schedule index updates manually.
Lazy vector indexes (lazy indexes) are a Couchbase Lite-specific feature that updates indexes asynchronously, supporting the following use cases:
- Documents have been added to the application by the end user while no Machine Learning (ML) model is available to generate the vectors. You can use lazy indexing to schedule index updates for such documents once an ML model is available.
- The remote ML model in use stops working or has intermittent availability, causing failed updates. With lazy indexing, you can skip documents that fail to update and schedule the indexing process for a later time.
Lazy indexing is an asynchronous process that gives developers full control over:
- When to update the index.
- The number of vectors to update in the index at a time.
- Whether to cancel or skip certain index updates when the model is unavailable or has failed.
Lazy index updates run independently of document save operations.
The sketch below shows one way to use lazy indexing in your application.
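The following is a hedged sketch of the lazy-index update loop, assuming the Couchbase Lite Swift SDK with the Vector Search extension; the `generateEmbedding` helper is a hypothetical stand-in for your ML model, and the index and field names, batch size, and dimension values are assumptions to verify against your SDK version.

```swift
import CouchbaseLiteSwift

// Hypothetical stand-in for your embedding model; returns nil on failure.
func generateEmbedding(for text: String) -> [Float]? { /* call your ML model */ nil }

func updateLazyIndex(in collection: Collection) throws {
    // A lazy index computes no vectors until the app updates it explicitly.
    var config = VectorIndexConfiguration(expression: "text", dimensions: 300, centroids: 20)
    config.isLazy = true
    try collection.createIndex(withName: "lazy_index", config: config)

    guard let index = try collection.index(withName: "lazy_index") else { return }

    // Update the index in batches of 50 until no documents remain.
    while let updater = try index.beginUpdate(limit: 50) {
        for i in 0..<updater.count {
            if let text = updater.string(at: i),
               let vector = generateEmbedding(for: text) {
                try updater.setVector(vector, at: i)
            } else {
                // Model unavailable or failed: leave this document for a later run.
                updater.skipVector(at: i)
            }
        }
        try updater.finish() // persist this batch into the index
    }
}
```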
Lazy indexing is an alternative to using the standard predictive model with regular vector indexes, which handle the indexing process automatically. The table below compares the two approaches.
Feature | Regular Index | Lazy Index
---|---|---
Update when documents are changed | Yes | No
Update when documents are deleted or purged | Yes | Yes
The application controls when to update the index | No | Yes
The application can skip updating the index | No | Yes
Is an asynchronous process | No | Yes
About Vector Encoding
Vector encoding reduces the size of the vector index through algorithmic compression. You can configure vector encoding in Couchbase Lite to address your application's needs.
This compression reduces the disk space required and the I/O time during indexing and queries, but greater compression can reduce the accuracy of distance calculations.
Vector Search for Couchbase Lite supports the following encoding algorithms:
- None - This returns the highest-quality results, but at a high cost in performance and disk space.
- Scalar Quantizer - This reduces the number of bits used for each number in a vector. The number of bits per component can be set to 4, 6, or 8. The default setting in Couchbase Lite is the 8-bit Scalar Quantizer, or SQ-8.
- Product Quantizer - This reduces both the number of dimensions and the bits per dimension. It splits each vector into multiple subspaces and performs scalar quantization on each subspace independently before compression. This can produce higher-quality results than Scalar Quantization at the cost of greater complexity.
Quantizers are algorithmic processes that map input values from a larger set to output values in a smaller set; common quantization operations include rounding and truncation.
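As a hedged illustration, the sketch below selects an encoding on a vector index configuration, assuming the Couchbase Lite Swift SDK with the Vector Search extension; the factory names follow that API as I understand it and should be verified against your SDK version, and the field, dimension, and centroid values are assumptions.

```swift
import CouchbaseLiteSwift

var config = VectorIndexConfiguration(expression: "vector",
                                      dimensions: 300,
                                      centroids: 20)

// Default: 8-bit scalar quantization (SQ-8).
config.encoding = .scalarQuantizer(type: .SQ8)

// Uncompressed: highest quality, highest disk and I/O cost.
// config.encoding = .none()

// Product quantization: 10 subquantizers x 8 bits; the subquantizer
// count must evenly divide the number of dimensions (300 / 10 = 30).
// config.encoding = .productQuantizer(subquantizers: 10, bits: 8)
```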
About Centroids
Centroids are vectors that function as the center points of vector clusters within the data set. Each vector is associated with the Centroid it is closest to, as determined by k-means clustering. Each Centroid is contained within a bucket along with its associated vectors. The greater the number of Centroids, the greater the potential accuracy of the model. However, a greater number of Centroids also incurs a longer indexing time.
Choosing Centroids in Vector Search involves trade-offs that can impact clustering effectiveness and search efficiency. The initial selection of Centroids, the number chosen, and their sensitivity to high dimensionality and outliers affect the quality of vector clustering.
The general guideline for the optimal number of Centroids is approximately the square root of the number of documents, as the sketch below illustrates.
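This short sketch derives the centroid count from the collection's document count, assuming the Couchbase Lite Swift SDK with the Vector Search extension; the field name and dimension value are assumptions.

```swift
import CouchbaseLiteSwift

// Heuristic: set centroids to roughly the square root of the
// number of documents to be indexed (e.g. ~316 for 100,000 docs).
func makeVectorIndexConfig(for collection: Collection) -> VectorIndexConfiguration {
    let centroids = max(1, UInt32(Double(collection.count).squareRoot().rounded()))
    return VectorIndexConfiguration(expression: "vector",
                                    dimensions: 300, // must match your model's output width
                                    centroids: centroids)
}
```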
About Probes
The number of Probes is the maximum number of Centroid buckets that the search algorithm checks when looking for vectors similar to a given query vector.
You can change the number of Probes by altering the value of the numProbes property, as shown in the following example.
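This sketch sets a custom probe count, assuming the Couchbase Lite Swift SDK with the Vector Search extension; the field name and the dimension and centroid values are illustrative assumptions.

```swift
import CouchbaseLiteSwift

var config = VectorIndexConfiguration(expression: "vector",
                                      dimensions: 300,
                                      centroids: 1024)

// Probing more centroid buckets per query improves recall at the
// cost of higher latency. Here 0.5% of 1024 centroids is ~5, so the
// recommended floor of 8 applies.
config.numProbes = 8
```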
Couchbase recommends that, when setting a custom number of Probes, the value be at least 8 or 0.5% of the number of Centroids used, whichever is greater.
About Dimensions
Vector dimensionality describes the number of components in a given vector embedding, commonly known as its width. The greater the number of dimensions, the greater the accuracy of the results. However, a greater number of dimensions also incurs higher compute and memory costs and increases search latency. The number of dimensions is determined by the LLM used to generate the vector embeddings.
Couchbase Lite supports dimension sizes in the range of 2 to 4096.
About Distance Metrics
Distance metrics are functions used to define how close an input query vector is to other vectors within a vector index.
Couchbase Lite supports the following distance metrics:
- Squared Euclidean Distance - This is the default distance metric. It measures the straight-line distance between two points in an n-dimensional Euclidean space (with dimensions such as x, y, z), squared. This metric focuses on the spatial separation between two vectors: both the magnitude and the direction of the vectors matter. The smaller the distance value, the more similar the vectors. You can use this metric to simplify computation in situations where only relative distances matter, rather than actual distances.
- Euclidean Distance - This measures the straight-line distance between two points in an n-dimensional Euclidean space (with dimensions such as x, y, z). This metric focuses on the spatial separation between two vectors: both the magnitude and the direction of the vectors matter. The smaller the distance value, the more similar the vectors. It differs from Squared Euclidean Distance by taking the square root of the calculated distance between the two points, yielding a "true" geometric distance. You can use this metric when the actual geometric distance matters, such as calculating the distance between cities using GPS coordinates.
- Cosine Distance - This measures the cosine of the angle between two vectors in vector space. This metric focuses on the alignment of two vectors, that is, the similarity of their directions: only the direction of the vectors matters. The smaller the distance value, the more similar the vectors. You can use this metric when comparing the similarity of document content regardless of document size, as in text similarity or information retrieval applications.
- Dot Product Distance - This metric captures overall similarity by comparing both the magnitude and the direction of vectors. The result is larger when the vectors are aligned and have large magnitudes, and smaller in the opposite case. You can use this metric in recommendation systems to surface related content, with preference for items most similar to those a user visits frequently.
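This sketch selects a distance metric on the index configuration, assuming the Couchbase Lite Swift SDK with the Vector Search extension; the metric case names follow that API as I understand it, and the field, dimension, and centroid values are assumptions.

```swift
import CouchbaseLiteSwift

var config = VectorIndexConfiguration(expression: "vector",
                                      dimensions: 300,
                                      centroids: 20)

// Squared Euclidean is the default. Cosine compares direction only,
// which suits text similarity where document length should not matter.
config.metric = .cosine
// Other options: .euclideanSquared (default), .euclidean, .dot
```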
Hybrid Vector Search
Hybrid Vector Search (Hybrid Search) combines traditional keyword-based search, such as full text search (FTS), which matches exact text or metadata, with advanced methods such as Vector Search, which matches content based on semantic similarity. Hybrid Search aims to enhance search capabilities by using both exact matches and contextual relevance to improve the overall accuracy and relevance of search results. See the example below for more information on how to use Hybrid Search.
Vector Search is performed on the documents that have been filtered by the criteria specified in the WHERE clause. No LIMIT clause is required for Hybrid Vector Search.
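The following is a minimal hybrid query sketch, assuming the Couchbase Lite Swift SDK with the Vector Search extension, a vector index on the `vector` field, and hypothetical `category` and `name` fields; the `queryVector` parameter stands in for your embedding model's output.

```swift
import CouchbaseLiteSwift

// Filter with a WHERE clause first, then rank only the filtered
// documents by vector similarity. The LIMIT is optional here, since
// the WHERE clause already narrows the result set.
func hybridSearch(database: Database, queryVector: [Float]) throws {
    let query = try database.createQuery("""
        SELECT meta().id, name
        FROM _default
        WHERE category = 'outdoor'
        ORDER BY APPROX_VECTOR_DISTANCE(vector, $queryVector)
        LIMIT 10
        """)
    let params = Parameters()
    params.setValue(queryVector, forName: "queryVector")
    query.parameters = params

    for result in try query.execute() {
        print(result.string(forKey: "id") ?? "", result.string(forKey: "name") ?? "")
    }
}
```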
See the Hybrid Search blog post for more information about Hybrid Search.