Run a Vector Search with the Capella UI

    Run a Vector Search query from the Couchbase Capella UI to preview and test the search results from a Search Vector Index.

    For more information about how the Search Service scores documents in search results, see Scoring for Search Queries.

    Prerequisites

    • You have the Search Service enabled on a node in your operational cluster. For more information about how to change Services on your operational cluster, see Modify a Paid Cluster.

    • You have created a Search Vector Index.

      For more information about how to create a Search Vector Index, see Create a Search Vector Index in Quick Mode.

      You can import a sample dataset to use with the procedure or examples on this page.

      Go to Data Tools > Import from your cluster and import the color-vector-sample sample data.

      For the best results, consider using the sample Search Vector Index from Create a Search Vector Index in Quick Mode.

    • You have logged in to the Couchbase Capella UI.

    Procedure

    To run a Vector Search with the Capella UI:

    1. On the Operational Clusters page, select the operational cluster where you created your Search index.

    2. Go to Data Tools > Search.

    3. Next to your Search Vector Index, click Search.

    4. In the Search field, enter a search query.

    5. Press Enter or click Search.

    6. (Optional) To view a document and its source collection, click a document name in the search results list.

    Example: Running a Simple Vector Similarity Query

    For example, the following query searches for the 2 vectors most similar to the vector [ 0, 0, 128 ] in the colorvect_l2 field:

    {
        "fields": ["*"], 
        "knn": [
          {
            "k": 2, 
            "field": "colorvect_l2", 
            "vector": [ 0, 0, 128 ]
          }
        ]
    }

    This Search query contains only a knn object, so it runs as a pure Vector Search query and returns only the k most similar vectors.

    The Search Service combines the Vector search results from a knn object with the traditional query object by using an OR function. If the same documents match the knn and query objects, the Search Service ranks those documents higher in search results.

    The document for the color navy should be the first result, followed by a similar color.
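
    If you prefer to generate the payload rather than type it by hand, the following is a minimal Python sketch that builds the same knn request body and prints it as JSON for pasting into the Search field. The helper name build_knn_query is hypothetical and is not part of any Couchbase SDK; it only assembles the JSON shown above.

```python
import json


def build_knn_query(field, vector, k, fields=("*",)):
    """Return a Search request body with a single knn clause (hypothetical helper)."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not vector:
        raise ValueError("vector must not be empty")
    return {
        "fields": list(fields),
        "knn": [{"k": k, "field": field, "vector": list(vector)}],
    }


# Rebuild the example payload from this page.
payload = build_knn_query("colorvect_l2", [0, 0, 128], k=2)
print(json.dumps(payload, indent=2))
```

    The printed JSON can be copied into the Search field in step 4 of the procedure.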

    Example: Running a Simple Hybrid Search Query

    The following hybrid Search query searches for the vector most similar to the vector [ 0, 0, 128 ] in the colorvect_l2 field. It also runs a Numeric Range Query on the brightness field to return only colors that have a brightness value greater than 70 and less than or equal to 80:

    {
        "fields": ["*"], 
        "query": { 
          "min": 70,
          "max": 80,
          "inclusive_min": false,
          "inclusive_max": true,
          "field": "brightness"
        }, 
        "knn": [
          {
            "k": 1, 
            "field": "colorvect_l2", 
            "vector": [ 0, 0, 128 ]
          }
        ]
    }

    The Search Service combines the Vector search results from a knn object with the traditional query object by using an OR function. If the same documents match the knn and query objects, the Search Service ranks those documents higher in search results.

    The document for the color navy should be the first result, followed by colors that are similar and match the brightness field query.
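
    To build intuition for the OR behavior described above, the following toy Python sketch merges two result sets and ranks documents that appear in both higher. It is an illustration only: the document IDs and scores are invented, and adding the two scores together is an assumption made for this sketch, not the Search Service's documented scoring formula.

```python
def combine_or(knn_hits, query_hits):
    """Union two {doc_id: score} mappings, adding scores for shared IDs.

    Toy model only: additive scoring is an assumption for illustration.
    """
    combined = dict(knn_hits)
    for doc_id, score in query_hits.items():
        combined[doc_id] = combined.get(doc_id, 0.0) + score
    # Highest combined score first.
    return sorted(combined.items(), key=lambda item: item[1], reverse=True)


# Hypothetical hits: doc_b matches both the knn and query objects.
knn_hits = {"doc_a": 0.75, "doc_b": 0.5}
query_hits = {"doc_b": 0.5, "doc_c": 0.25}

ranked = combine_or(knn_hits, query_hits)
for doc_id, score in ranked:
    print(doc_id, score)
```

    In this toy model, doc_b ranks first because it matches both objects, while doc_a and doc_c still appear because a match on either object is enough.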

    If you want to run a hybrid Search query on a large, partitioned Search index and your cluster is on Couchbase Server version 8.0 or later, use the bm25 scoring model for your Search index. For more information, see Configure Additional Search Index Settings, Scoring for Search Queries, or Scoring Model.

    Example: Running a Semantic Search Query with a Large Embedding Vector

    The following query searches for matches to a large embedding vector, generated by the OpenAI embedding model, text-embedding-ada-002-v2.

    You can find generated embedding vectors for each color’s description field in rgb.json.

    This query should return the document for the color navy, based on a generated embedding vector for:

    What is a classic, refined hue that exudes elegance and is often linked to power and stability?

    The following shows part of the sample Search query:

    {
        "fields": ["*"],
        "knn": [
          {
            "field": "embedding_vector_dot",
            "k": 3,
            "vector": [
              0.024032991379499435,
              -0.009131478145718575,
              0.013961897231638432,
              -0.024734394624829292,
              -0.020605377852916718,
              0.006739427801221609,
              -0.012539239600300789,
              0.0063192471861839294,
              0.000004374724539957242,
              -0.030252983793616295,
              -0.010944539681077003,
              -0.0012845275923609734,
              0.0059850881807506084,
              -0.006388725712895393,
              -0.016304319724440575,
              0.03046472743153572,
              0.029988301917910576,
              -0.013121536932885647,
              0.01815708354115486,
              -0.011096730828285217,
              -0.0423753522336483,
              -0.0023523480631411076,
              -0.00022332418302539736,
              -0.0024681459181010723,
              -0.02911485731601715,

    Due to the size of the embedding vector, this page shows only part of the full query.

    Click View to view and copy the entire Vector Search query payload. Make sure you remove the lines for // tag::partial[] and // end::partial[].
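
    If you copy the full payload into a script before pasting it into the UI, the following Python sketch shows one way to remove the two marker lines and confirm that the remainder is valid JSON. The function name strip_partial_markers is hypothetical, and the short payload is a stand-in: the real vector is much longer, and the position of the marker lines in the real payload is assumed here.

```python
import json


def strip_partial_markers(raw_text):
    """Drop the partial-tag marker lines, then parse what remains as JSON."""
    markers = ("// tag::partial[]", "// end::partial[]")
    kept = [line for line in raw_text.splitlines() if line.strip() not in markers]
    return json.loads("\n".join(kept))


# Stand-in payload with a short placeholder vector (not the real embedding).
copied = """{
  "fields": ["*"],
  "knn": [
    {
      "field": "embedding_vector_dot",
      "k": 3,
      // tag::partial[]
      "vector": [0.5, -0.25, 0.125]
      // end::partial[]
    }
  ]
}"""

query = strip_partial_markers(copied)
print(json.dumps(query, indent=2))
```

    If json.loads raises an error, the copied text still contains a marker line or another non-JSON fragment.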

    Example: Running a Semantic Search Query with a base64 Encoded String

    If your operational cluster is running Couchbase Server version 7.6.2 or later, you can use vectors encoded as base64 strings with Vector Search. For example, the following document describes the color navy, with base64 encoded strings in the embedding_vector_dot and colorvect_l2 fields instead of arrays:

    {
        "id": "#000080",
        "color": "navy",
        "brightness": 14.592,
        "colorvect_l2": "AACA",
        "wheel_pos": "other",
        "verbs": [
            "deep",
            "rich",
            "sophisticated"
        ],
        "description": "Navy is a deep, rich color that exudes sophistication. It is a dark shade of blue that is often associated with authority, stability, and elegance. Navy is a versatile color that can be both bold and understated, making it a popular choice in fashion and interior design. It is a timeless color that never goes out of style and adds a touch of sophistication to any look or space.",
        "embedding_model": "text-embedding-ada-002-v2",
        "embedding_vector_dot": ""
    }

    The following query uses a base64 encoded string for the same query as Running a Semantic Search Query with a Large Embedding Vector to return the document for navy:

    {
        "fields": ["*"],
        "knn": [
          {
            "field": "embedding_vector_dot",
            "k": 3,
            "vector_base64": ""
          }
        ]
    }

    You can use base64 encoded strings in your Vector Search queries only if your documents store their vectors as base64 encoded strings, indexed with the vector_base64 field data type. A Search query with a base64 encoded string cannot match or return vectors that you indexed as arrays with the vector field data type.
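
    The following Python sketch shows one way to convert an array of floats into a base64 string and back. It assumes the vector is packed as little-endian 32-bit floats before encoding. That byte layout is not stated on this page, so treat it as an assumption and verify it against your index and the Search Service documentation before relying on it. Both function names are hypothetical.

```python
import base64
import struct


def encode_vector_base64(vector):
    """Pack floats as little-endian float32 (assumed layout), then base64 encode."""
    packed = struct.pack(f"<{len(vector)}f", *vector)
    return base64.b64encode(packed).decode("ascii")


def decode_vector_base64(encoded):
    """Reverse encode_vector_base64 under the same float32 assumption."""
    raw = base64.b64decode(encoded)
    return list(struct.unpack(f"<{len(raw) // 4}f", raw))


encoded = encode_vector_base64([0.0, 0.0, 128.0])
decoded = decode_vector_base64(encoded)
print(encoded)
print(decoded)
```

    A round trip like this is a quick check that the string you place in vector_base64 decodes to the vector you intended, with the dimension you expect.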

    Example: Using global_scoring with a bm25 Search Index

    In the following example, the Search query uses both a query and knn object to run both a Vector Search and traditional Search query on an index named products-index.

    The knn object searches the embedding vectors generated from an ecommerce website's product descriptions. The query vector is generated from the phrase long battery life wireless earbuds. The query object matches only documents that have Electronics as their category and a price between 100.00 and 300.00. The query returns the description, price, and product_name fields in results. Because the query runs on a large, partitioned index that uses the bm25 scoring algorithm, it also sets global_scoring to keep document scores consistent across the Search index's partitions:

    {
        "fields": ["description", "price", "product_name"],
        "query": {
          "conjuncts": [
            {
              "term": "Electronics",
              "field": "category"
            },
            {
              "field": "price",
              "min": 100.00,
              "max": 300.00,
              "inclusive_max": true
            }
          ]
        },
        "knn": [
          {
            "k": 5,
            "field": "embedding",
            "vector": [0.23, -0.75, 0.61, ...]
          }
        ],
        "ctl": {
          "global_scoring": true
        }
    }

    The vector embedding has been truncated for this example. The vector embedding in your Search query must match the configured dimension and similarity metric for your Search index.

    For more information about the bm25 scoring algorithm, see Scoring for Search Queries.
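
    Because a dimension mismatch is an easy mistake to make when pasting embeddings, the following Python sketch checks the length of every knn vector and then adds the ctl.global_scoring setting to a request body. The helper name with_global_scoring and the expected_dims parameter are hypothetical, and the 3-value vector is a stand-in for a real embedding.

```python
import json


def with_global_scoring(query, expected_dims):
    """Check each knn vector's length, then return a copy with global_scoring on."""
    for clause in query.get("knn", []):
        actual = len(clause["vector"])
        if actual != expected_dims:
            raise ValueError(
                f"vector for field {clause['field']!r} has {actual} dimensions, "
                f"expected {expected_dims}"
            )
    result = dict(query)
    # Preserve any existing ctl settings and enable global scoring.
    result["ctl"] = {**query.get("ctl", {}), "global_scoring": True}
    return result


# Stand-in query: a real embedding has as many values as the index dimension.
base_query = {
    "fields": ["description", "price", "product_name"],
    "knn": [{"k": 5, "field": "embedding", "vector": [0.23, -0.75, 0.61]}],
}

checked = with_global_scoring(base_query, expected_dims=3)
print(json.dumps(checked, indent=2))
```

    The check covers only the dimension. The similarity metric is a property of the Search index definition and cannot be validated from the query payload alone.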

    Next Steps

    If you do not get the search results you were expecting, you can change the JSON payload for your Search query.