Package com.couchbase.lite
Class VectorIndexConfiguration
java.lang.Object
com.couchbase.lite.IndexConfiguration
com.couchbase.lite.VectorIndexConfiguration
Configuration for creating vector indexes.
-
Nested Class Summary
Modifier and Type | Class | Description
static enum | VectorIndexConfiguration.DistanceMetric | Distance metric to be used in the vector indexes.
-
Constructor Summary
Constructor | Description
VectorIndexConfiguration(String expression, long dimensions, long centroids) | VectorIndexConfiguration Constructor.
-
Method Summary
Method | Description
long getCentroids() | The number of centroids, which is the number of buckets into which the vectors in the index are partitioned.
long getDimensions() | The number of vector dimensions.
getEncoding() | Vector encoding type.
getExpression() | The SQL++ expression returning a vector, which is either an array of 32-bit floating-point numbers or a Base64 string representing an array of 32-bit floating-point numbers in little-endian format.
long getMaxTrainingSize() | The maximum number of vectors used for training the index.
getMetric() | Distance metric type.
long getMinTrainingSize() | The minimum number of vectors for training the index.
long getNumProbes() | The number of centroids that will be scanned during a query.
boolean isLazy() | The boolean flag indicating whether the index is lazy.
setEncoding(VectorEncoding encoding) |
setLazy(boolean lazy) |
setMaxTrainingSize(long maxTrainingSize) |
setMetric(VectorIndexConfiguration.DistanceMetric metric) |
setMinTrainingSize(long minTrainingSize) |
setNumProbes(long numProbes) |
toString() |
Methods inherited from class com.couchbase.lite.IndexConfiguration
getExpressions
-
Constructor Details
-
VectorIndexConfiguration
VectorIndexConfiguration Constructor. The appropriate number of centroids depends on the expected number of vectors (or documents containing vectors) to be indexed, and the application may need to experiment to find an optimal value; a simple starting point is the square root of the number of vectors.
- Parameters:
expression - The SQL++ expression returning a vector, which is an array of numbers.
dimensions - The number of dimensions of the vectors to be indexed. Vectors whose number of dimensions differs from the value specified in the config will not be indexed. The dimensions must be between 2 and 4096 inclusive.
centroids - The number of centroids, which is the number of buckets into which the vectors in the index are partitioned. The number of centroids should be based on the expected number of vectors to be indexed; one suggested rule is to use the square root of the number of vectors. The centroids must be between 1 and 64000 inclusive.
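The square-root rule and the documented ranges can be turned into a small helper for choosing constructor arguments. The following is only a sketch: `VectorIndexSizing`, `suggestCentroids`, and `checkDimensions` are hypothetical names that are not part of Couchbase Lite, and the result is a starting point to be tuned by experiment.

```java
/** Hypothetical helper (not part of Couchbase Lite) for picking constructor arguments. */
final class VectorIndexSizing {
    // Ranges taken from the constructor documentation above.
    static final long MIN_DIMENSIONS = 2;
    static final long MAX_DIMENSIONS = 4096;
    static final long MIN_CENTROIDS = 1;
    static final long MAX_CENTROIDS = 64000;

    private VectorIndexSizing() { }

    /** Square root of the expected vector count, clamped to the allowed centroid range. */
    static long suggestCentroids(long expectedVectors) {
        if (expectedVectors < 0) {
            throw new IllegalArgumentException("expectedVectors must not be negative");
        }
        long root = Math.round(Math.sqrt((double) expectedVectors));
        return Math.max(MIN_CENTROIDS, Math.min(MAX_CENTROIDS, root));
    }

    /** Returns the value unchanged if it is within the documented dimension range. */
    static long checkDimensions(long dimensions) {
        if (dimensions < MIN_DIMENSIONS || dimensions > MAX_DIMENSIONS) {
            throw new IllegalArgumentException(
                "dimensions must be between " + MIN_DIMENSIONS + " and " + MAX_DIMENSIONS);
        }
        return dimensions;
    }
}
```

The two results could then be passed as the `dimensions` and `centroids` arguments of the constructor.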
-
-
Method Details
-
getExpression
The SQL++ expression returning a vector, which is either an array of 32-bit floating-point numbers or a Base64 string representing an array of 32-bit floating-point numbers in little-endian format.
- Returns:
- the index expression.
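For the Base64 form, one way to produce such a string is to write each float into a little-endian byte buffer and Base64-encode the bytes. This is a sketch using only the Java standard library; the class name `VectorBase64` is hypothetical, and the use of the standard (non-URL-safe) Base64 alphabet is an assumption.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Base64;

/** Hypothetical helper (not part of Couchbase Lite) for Base64-encoded float vectors. */
final class VectorBase64 {
    private VectorBase64() { }

    /** Encodes the vector as little-endian 32-bit floats, then as standard Base64. */
    static String encode(float[] vector) {
        ByteBuffer buffer = ByteBuffer.allocate(vector.length * Float.BYTES)
                                      .order(ByteOrder.LITTLE_ENDIAN);
        for (float component : vector) {
            buffer.putFloat(component);
        }
        return Base64.getEncoder().encodeToString(buffer.array());
    }

    /** Reverses encode(): Base64 text back to an array of 32-bit floats. */
    static float[] decode(String base64) {
        byte[] bytes = Base64.getDecoder().decode(base64);
        if (bytes.length % Float.BYTES != 0) {
            throw new IllegalArgumentException("byte length is not a multiple of 4");
        }
        ByteBuffer buffer = ByteBuffer.wrap(bytes).order(ByteOrder.LITTLE_ENDIAN);
        float[] vector = new float[bytes.length / Float.BYTES];
        for (int i = 0; i < vector.length; i++) {
            vector[i] = buffer.getFloat();
        }
        return vector;
    }
}
```

The explicit `ByteOrder.LITTLE_ENDIAN` matters because `ByteBuffer` defaults to big-endian, which would produce a different string.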
-
getDimensions
public long getDimensions()
The number of vector dimensions.
- Returns:
- the number of dimensions
-
getCentroids
public long getCentroids()
The number of centroids, which is the number of buckets into which the vectors in the index are partitioned.
- Returns:
- the number of centroids
-
getEncoding
Vector encoding type. The default value is an 8-bit Scalar Quantizer.
-
setEncoding
-
getMetric
Distance metric type. The default value is euclidean.
-
setMetric
@NonNull public VectorIndexConfiguration setMetric(@NonNull VectorIndexConfiguration.DistanceMetric metric)
-
getMinTrainingSize
public long getMinTrainingSize()
The minimum number of vectors for training the index. The default value is zero, meaning that minTrainingSize will be automatically calculated by the index based on the number of centroids specified, the encoding type, and the encoding parameters.
Note: Training occurs at or before the time an APPROX_VECTOR_DISTANCE query is executed, provided there is enough data at that time. Consequently, if training is triggered during a query, the query may take longer to return results. If a query is executed against the index before it is trained, a full scan of the vectors will be performed. If there are insufficient vectors in the database for training, a warning message will be logged, indicating the required number of vectors.
- Returns:
- the min number of vectors used in training the index
-
setMinTrainingSize
-
getMaxTrainingSize
public long getMaxTrainingSize()
The maximum number of vectors used for training the index. The default value is zero, meaning that maxTrainingSize will be automatically calculated by the index based on the number of centroids specified, the encoding type, and the encoding parameters.
- Returns:
- the max number of vectors used in training the index
-
setMaxTrainingSize
-
getNumProbes
public long getNumProbes()
The number of centroids that will be scanned during a query. The default value is zero, meaning that numProbes will be automatically calculated by the index based on the number of centroids specified.
-
setNumProbes
-
isLazy
public boolean isLazy()
The boolean flag indicating whether the index is lazy. The default value is false. If the index is lazy, it will not be automatically updated when documents in the collection change, except when documents are deleted or purged. When using the lazy index mode, the expression set in the config must refer to the values that will later be used to compute vectors for the index, rather than to vector embeddings or to a prediction() function that returns vectors.
To update the index:
- Use Collection's getIndex(String name) to get the index.
- Call beginUpdate(int limit) on the index object with the maximum number of vectors to be updated into the index. The function will return an IndexUpdater object.
- For each of the `count` items, call a value getter such as getValue(int index) on the IndexUpdater object to get the value for computing the vector. The returned value is based on the expression set in the VectorIndexConfiguration object used when creating the vector index.
- Call setVector() on the IndexUpdater object to set the computed vector, or null to remove the index row.
- Call skipVector() on the IndexUpdater object to skip setting the vector. The value will be returned again the next time beginUpdate() is called on the index object.
- Call finish() on the IndexUpdater object to save all the vectors set on the IndexUpdater object.
- Returns:
- true if the index is lazy.
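The update steps above can be sketched as a single loop. To keep the example self-contained, it does not use the Couchbase Lite types: `LazyUpdater` is a hypothetical stand-in interface that mirrors only the operations named in the steps, and its exact signatures (as well as the `embed` function, which stands for whatever model computes the vectors) are assumptions rather than the library's API.

```java
import java.util.List;
import java.util.function.Function;

/** Hypothetical stand-in for the updater described above; NOT the Couchbase Lite type. */
interface LazyUpdater {
    int count();                                    // number of items in this batch
    Object getValue(int index);                     // value produced by the index expression
    void setVector(List<Float> vector, int index);  // a null vector removes the index row
    void skipVector(int index);                     // value is returned again in the next batch
    void finish();                                  // saves all vectors set on this updater
}

final class LazyIndexUpdateSketch {
    private LazyIndexUpdateSketch() { }

    /**
     * Processes one batch: computes a vector for each value, skips items whose
     * computation fails, then saves the batch. Returns how many vectors were set.
     */
    static int updateBatch(LazyUpdater updater, Function<Object, List<Float>> embed) {
        int vectorsSet = 0;
        for (int i = 0; i < updater.count(); i++) {
            Object value = updater.getValue(i);
            try {
                updater.setVector(embed.apply(value), i);
                vectorsSet++;
            } catch (RuntimeException e) {
                // For example, the embedding model is temporarily unavailable: retry later.
                updater.skipVector(i);
            }
        }
        updater.finish();
        return vectorsSet;
    }
}
```

Skipping on failure, instead of aborting the batch, lets the vectors that were computed successfully still be saved by finish().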
-
setLazy
-
toString
-