You are viewing the documentation for a prerelease version.

View Latest

CREATE INDEX

The CREATE INDEX statement allows you to create a secondary index. Secondary indexes contain a filtered or a full set of keys in a given bucket. Secondary indexes are optional but increase query efficiency on a bucket.

CREATE INDEX is by default a synchronous operation, which means that every CREATE INDEX statement blocks until the operation finishes. Index building starts by creating a task that is queued for index build. After this phase, if you lose connectivity, the index build operation continues in the background. You can also run index creation asynchronously using the defer_build clause. In the asynchronous mode, CREATE INDEX starts a task to create the index definition, and returns as soon as the task finishes. You can then build the index using the BUILD INDEX command.

Both GSI and View indexes provide a status field and mark index status pending. With GSI indexer, index status continues to report "pending". This status field and other index metadata can be queried using system:indexes.

Indexes cannot be built concurrently on a given bucket unless the defer_build option of the CREATE INDEX statement is used in combination with BUILD INDEX statement. The following error is reported if a second index creation operation is kicked off before the completion of the ongoing index creation.

[
  {
    "code": 5000,
    "msg": "BuildIndexes - cause: Build index fails.  Some index will be retried building in the background.  For more details, please check index status.\n",
    "query_from_user": "BUILD INDEX ON ..."
  }
]

You can create multiple identical secondary indexes on a bucket and place them on separate nodes for better index availability. In Couchbase Server Enterprise Edition, the recommended way to do this is using the num_replicas option. In Couchbase Server Community Edition, you need to create multiple identical indexes and place them using the nodes option. Refer to WITH Clause below for more details.

RBAC Privileges

User executing the CREATE INDEX statement must have the Query Manage Index privilege granted on the keyspace/bucket. For more details about user roles, see Authorization.

Syntax

create-index ::= CREATE INDEX index-name ON named-keyspace-ref '(' index-key [ ',' index-key ]* ')' [ where-clause ] [ index-using ] [ index-with ]
'CREATE' 'INDEX' index-name 'ON' named-keyspace-ref '(' index-key ( ',' index-key )* ')' where-clause? index-using? index-with?
index-name

[Required] A unique name that identifies the index.

Valid GSI index names can contain any of the following characters: A-Z a-z 0-9 # _, and must start with a letter, [A-Z a-z]. The minimum length of an index name is 1 character and there is no maximum length set for an index name. When querying, if the index name contains a # or _ character, you must enclose the index name within backticks.

Named Keyspace Reference

keyspace-ref ::= [ namespace ':' ] keyspace
(namespace ':')? keyspace
namespace

(Optional) An identifier that refers to the namespace of the bucket to be indexed. Currently, only the default namespace is available. If the namespace name is omitted, the default namespace in the current session is used.

keyspace

(Required) An identifier that refers to the bucket name or keyspace. It specifies the bucket as the source for which the index needs to be created.

For example, default:`travel-sample` indicates the travel-sample keyspace in the default namespace.

If there is a hyphen (-) inside the keyspace or bucket name, you must wrap it in backticks (` `).

Index Key

index-key ::= expr | array-expr
expr | array-expr

Refers to an attribute name or a scalar function or an ARRAY expression on the attribute. This constitutes an index-key for the index.

expr

A N1QL expression over any fields in the document. This cannot use constant expressions, aggregate functions, or sub-queries.

array-expr

An array expression. For details about array expressions, see Array Indexing.

WHERE Clause

where-clause ::= WHERE cond
'WHERE' cond
cond

Specifies WHERE clause predicates to qualify the subset of documents to include in the index.

USING Clause

index-using ::= USING ( VIEW | GSI )
'USING' ( 'VIEW' | 'GSI' )

The USING clause specifies the index type to use. Secondary indexes can be created using global secondary indexes (GSI) or views.

This clause is optional; if omitted, the default is USING GSI.

WITH Clause

index-with ::= WITH expr
'WITH' expr

Use the WITH clause to specify additional options.

expr

An object with the following properties:

nodes

[Optional] An array of strings, each of which represents a node name.

In Couchbase Server Community Edition, a single global secondary index can be placed on a single node that runs the indexing service. The nodes property allows you to specify the node that the index is placed on. If nodes is not specified, one of the nodes running the index service is randomly picked for the index.

In Couchbase Server Enterprise Edition, multiple nodes can be specified to distribute replicas of an index amongst multiple nodes, for example:

CREATE INDEX productName_index1 ON bucket_name(productName, ProductID)
WHERE type="product" USING GSI
WITH {"nodes":["node1:8091", "node2:8091", "node3:8091"]};

If specifying both nodes and num_replica, the number of nodes in the array must be one greater than the specified number of replicas otherwise the index creation will fail.

If nodes is not specified, then nodes running the index service are randomly selected to host the index, based on the number of replicas.

A node name passed to the nodes property must include the cluster administration port, by default 8091. For example WITH {"nodes": ["192.0.2.0:8091"]} instead of WITH {"nodes": ["192.0.2.0"]}.
defer_build

[Optional] Boolean.

true

When set to true, the CREATE INDEX operation queues the task for building the index but immediately pauses the building of the index of type GSI. Index building requires an expensive scan operation. Deferring building of the index with multiple indexes can optimize the expensive scan operation. Admins can defer building multiple indexes and, using the BUILD INDEX statement, multiple indexes to be built efficiently with one efficient scan of bucket data.

false

When set to false, the CREATE INDEX operation queues the task for building the index and immediately kicks off the building of the index of type GSI.

num_replica
This property is only available in Couchbase Server Enterprise Edition.

[Optional] Integer that specifies the number of replicas of the index to create.

The indexer will automatically distribute these replicas amongst index nodes in the cluster for load-balancing and high availability purposes. The indexer will attempt to distribute the replicas based on the server groups in use in the cluster where possible.

If the value of this property is not less than the number of index nodes in the cluster, then the index creation will fail.

We recommend that you do not create (or drop) secondary indexes when any node with a secondary index role is down as this may result in duplicate index names.

Usage

Array Indexing

Array indexing enables you to create global indexes on array elements and optimize the execution of queries involving array elements. For details, refer to Array Indexing.

Index Partitioning

Index partitioning helps increase the query performance by dividing and spreading a large index of documents across multiple nodes, horizontally scaling out an index as needed. For details, refer to Index Partitioning.

Using the meta().id Function

For details, refer to Indexing Meta Info.

Using Indexes for Aggregates

If there is an index on the expression of an aggregate, that index may be used to satisfy the query. For example, given the index alt_idx created using the following statement:

CREATE INDEX alt_idx ON `travel-sample`(geo.alt);

The query engine will use the index alt_idx for the following query:

SELECT MIN(geo.alt), MAX(geo.alt) FROM `travel-sample`;

For details, refer to Operator Pushdowns.

Index Replicas

In the Indexes screen in the Couchbase Web Console, index replicas are marked with their replica ID.

The Indexes screen showing an index and index replica with replica ID

If you select view by server node from the drop-down menu, you can see the server node where each index and index replica is placed.

You can also use the Index service REST API to find the ID of an index replica and see which node it is placed on.

Examples

Example 1. Create an index

Create a secondary index that contains airports with an alt value greater than 1000 on the node 192.0.2.1.

CREATE INDEX over1000 ON `travel-sample`(geo.alt) WHERE geo.alt > 1000 USING GSI WITH {"nodes": ["192.0.2.1:8091"]};
Example 2. Create a deferred index

Create a secondary index with the defer_build option.

CREATE INDEX `travel-sample-type-index` ON `travel-sample`(type) USING GSI
   WITH {"defer_build":true};

Query system:indexes for the status of the index.

SELECT * FROM system:indexes WHERE name="travel-sample-type-index";
Results
[
  {
    "indexes": {
      "datastore_id": "http://127.0.0.1:8091",
      "id": "91e28fbb76aa93f6",
      "index_key": [
        "`type`"
      ],
      "keyspace_id": "travel-sample",
      "name": "travel-sample-type-index",
      "namespace_id": "default",
      "state": "deferred", (1)
      "using": "gsi"
    }
  }
]
1 The travel-sample-type-index is in the pending state (deferred).
Example 3. Build a deferred index

Kick off a deferred build using the index name.

BUILD INDEX ON `travel-sample`(`travel-sample-type-index`) USING GSI;

Query system:indexes for the status of the index.

SELECT * FROM system:indexes WHERE name="travel-sample-type-index";
Results
[
  {
    "indexes": {
      "datastore_id": "http://127.0.0.1:8091",
      "id": "91e28fbb76aa93f6",
      "index_key": [
        "`type`"
      ],
      "keyspace_id": "travel-sample",
      "name": "travel-sample-type-index",
      "namespace_id": "default",
      "state": "online", (1)
      "using": "gsi"
    }
  }
]
1 The index has now been created.