Introduction to Full Text Search

    +
    Full Text Search (FTS) lets you create, manage, and query indexes, defined on JSON documents within a Couchbase bucket.

    Provided by the Search Service, full text search (FTS) enables the users to create, manage, and query multi-purposed indexes defined on JSON documents within a Couchbase bucket.

    In addition to exact matches, the full-text index can perform various search functions based on matching given terms/search parameters.

    Couchbase’s Global Secondary Indexes (GSI) can be used for range scans and regular pattern search, whereas FTS offers extensive capabilities for natural-language querying.

    Every Full Text Search is performed on a user-created Full Text Index, which contains the targets on which searches are to be performed: these targets are values derived from the textual and other contents of documents within a specified bucket.

    Full Text Search provides Google-like search capability on JSON documents. Couchbase’s Global Secondary Indexes (GSI) can be used for range scans and regular pattern search, whereas FTS provides extensive capabilities for natural-language querying. The query below looks for documents with all of the strings ("paris", "notre", "dame").

    Example
    {
      "explain": false,
      "fields": [
        "*"
      ],
      "highlight": {},
      "query": {
        "query": "+paris +notre +dame"
       }
    }

    This query returns the following result (shown partially) from the FTS index scan on the travel-sample sample bucket. For each matched document, the hits field shows the document id, the score, the fields in which a matched string occurs, and the position of the matched string.

    Results
    "hits": [
        {
          "index": "trsample_623ab1fb8bfc1297_6ddbfb54",
          "id": "landmark_21603",
          "score": 2.1834097375254955,
          "locations": {
            "city": {
              "paris": [
                {
                  "pos": 1,
                  "start": 0,
                  "end": 5,
                  "array_positions": null
                }
              ]
            },
            "content": {
              "dame": [
                {
                  "pos": 23,
                  "start": 169,
                  "end": 173,
                  "array_positions": null
                },
    // ...
              ]
            }
          }
        }
    ]

    Examples of natural language support include:

    • Language-aware searching; allowing users to search for, say, the word traveling, and additionally obtain results for travel and traveler.

    • Scoring of results, according to relevancy; allowing users to obtain result-sets that only contain documents awarded the highest scores. This keeps result-sets manageably small, even when the total number of documents returned is extremely large.

    • Fast indexes, which support a wide range of possible text-searches.

    Stages of a Full Text Search Query

    A Full Text Search query , once built at the client, can be targeted to any server in the Couchbase cluster hosting the search service.

    Here are the stages it goes through:

    1. The server that the client targets the search request to, assumes the role of the orchestrator or the coordinating node once it receives the external request.

    2. The coordinating node first looks up the index (making sure it exists).

    3. The coordinating node obtains the “plan” that the index was deployed with. The plan contains details on how many partitions the index was split into and all the servers’ information where any of these partitions reside.

    4. The coordinating node sets up a unique list of servers that it needs to dispatch an “internal” request to. A server in the Couchbase cluster is eligible if and only if it hosts a partition belonging to the index under consideration.

    5. Once the internal requests have been dispatched by the coordinating node to each of the servers, it’ll wait to hear back from them. Simultaneously, if any of the index’s partitions are resident on the coordinating node – search requests are dispatched to each of those partitions as well (disk-bound).

    6. Those servers in the cluster that receive the “internal” request from the coordinating node will forward it to each of the index partitions they host (disk-bound).

    7. Separate search requests that are dispatched concurrently to all index partitions resident within a server, and the server waits to hear back from them.

    8. Once the server hears back from all the partitions it hosts, it merges the results obtained from each of the partitions before packaging them into a response and shipping it back to the coordinating node.

    The coordinating node waits for responses from:

    • each of the index partitions resident within the node;

    • each of the servers in the cluster that it dispatched the internal request to.

    Once all the results from the local index partitions and the remote index partitions are obtained, the coordinating node merges all of them, packages them into a response, and ships them back to the client where the request originated.

    Full Text Search is powered by Bleve, an open source search and indexing library written in Go. Full Text Search uses Bleve for the indexing of documents and also makes available Bleve’s extensive range of query types. These include:

    Full Text Search includes pre-built text analyzers for the following languages: Arabic, CJK characters (Chinese, Japanese, and Korean), English, French, Hindi, Italian, Kurdish, Persian, and Portuguese. Additional languages have been added to Couchbase Server.

    To access Full Text Search, users require appropriate roles. The role FTS Admin must therefore be assigned to those who intend to create indexes; and the role FTS Searcher to those who intend to perform searches. For information on creating users and assigning roles, see Authorization.