Rebalance

Rebalance redistributes data and indexes among available nodes.

Understanding Rebalance

When one or more nodes have been added to or removed from a cluster, rebalance redistributes data and indexes among the available nodes. The cluster map is updated accordingly and distributed to clients. (See Cluster Manager for information on the cluster map.) The process occurs while the cluster continues to service requests.

Rebalance and Services

Rebalance affects different services differently, as described below.

Data Service

On rebalance, vBuckets are redistributed evenly among the currently available Data Service nodes. (Note the special case of Swap Rebalance, in which the number of nodes coming into the cluster equals the number of nodes leaving it: data is then moved only between those nodes.) If nodes have been removed such that the desired number of replicas can no longer be supported, rebalance provides as many replicas as possible. After rebalance, operations are directed to active vBuckets in their updated locations. Rebalance does not interrupt applications' access to data.
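The even distribution and the replica limit described above can be illustrated with a toy model. This is a sketch only, not the algorithm the server uses: the round-robin assignment, the function names, and the figure of 1024 vBuckets are assumptions made for illustration, as is the rule that a replica cannot share a node with its active copy.

```python
from collections import Counter

# Assumption: number of vBuckets per bucket; the text above does not state it.
NUM_VBUCKETS = 1024


def assign_active_vbuckets(nodes, num_vbuckets=NUM_VBUCKETS):
    """Map each vBucket ID to a node, round-robin.

    Per-node counts differ by at most one, which models "evenly".
    A real rebalance would also try to limit how much data moves.
    """
    if not nodes:
        raise ValueError("at least one Data Service node is required")
    return {vb: nodes[vb % len(nodes)] for vb in range(num_vbuckets)}


def supported_replicas(desired_replicas, node_count):
    """Return how many replicas can be provided on node_count nodes.

    Assumption: each replica must sit on a different node from its
    active copy, so at most node_count - 1 replicas are possible.
    """
    return max(0, min(desired_replicas, node_count - 1))


if __name__ == "__main__":
    layout = assign_active_vbuckets(["node1", "node2", "node3"])
    print(Counter(layout.values()))
    print(supported_replicas(desired_replicas=2, node_count=2))
```

In this model, removing a node from a three-node cluster configured for two replicas leaves room for only one replica, which matches the "as many replicas as possible" behavior described above.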

vBucket data transfer occurs sequentially. Therefore, if rebalance stops for any reason, it can be restarted from the point at which it stopped.
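Why sequential transfer makes a rebalance restartable can be shown with a small model. The sketch below is hypothetical: the function name, the shape of a "move" as a (vBucket, source, destination) tuple, and the in-memory set of completed vBuckets are all assumptions, not part of any server interface.

```python
def run_moves(pending_moves, transfer, completed=None):
    """Execute vBucket moves one at a time, recording each finished move.

    pending_moves: iterable of (vbucket_id, source_node, target_node).
    transfer:      callable that performs one move; it may raise.
    completed:     set of vBucket IDs already moved; reused on restart.
    """
    completed = set() if completed is None else completed
    for vb, source, target in pending_moves:
        if vb in completed:
            continue  # moved before the interruption, so skip it
        transfer(vb, source, target)  # an exception here stops the run
        completed.add(vb)  # recorded only after the move succeeds
    return completed
```

Because a vBucket is recorded only after its move succeeds, calling run_moves again with the same plan and the same completed set repeats nothing and continues with the first unfinished vBucket.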

See Clusters and Availability for information on how data is logically partitioned across Data Service cluster nodes.

Index Service

The Index Service maintains a cluster-wide set of index definitions and metadata, which allows the redistribution of indexes and index replicas from removed nodes to nodes that continue as part of the cluster. Indexes that reside on non-removed nodes are unaffected by rebalance.

The rebalance process takes account of nodes' CPU and RAM utilization, and achieves the best resource balance possible. Note that rebalance does not move indexes or replicas: instead, it rebuilds them in their new locations, using the latest data from the Data Service. If more index replicas exist than the remaining nodes can handle, the excess replicas are dropped; the number is automatically restored later if additional Index Service nodes are added to the cluster.
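The dropping of replicas that no longer fit can be sketched with a simplified placement model. Everything here is an assumption for illustration: the function name is hypothetical, a per-node count of index copies stands in for the CPU and RAM utilization that the real process considers, and copies of the same index are assumed to require distinct nodes.

```python
def place_index_copies(indexes, nodes):
    """Place index copies on the least-loaded nodes.

    indexes: dict of index name -> desired copies (the index plus replicas).
    nodes:   list of Index Service node names.
    Returns (placement, dropped): the nodes chosen for each index, and
    how many copies of each index could not be placed.
    """
    load = {node: 0 for node in nodes}
    placement, dropped = {}, {}
    for name, copies in indexes.items():
        usable = min(copies, len(nodes))  # one copy per node at most
        # Least-loaded first; ties broken by node name for determinism.
        chosen = sorted(nodes, key=lambda n: (load[n], n))[:usable]
        for node in chosen:
            load[node] += 1
        placement[name] = chosen
        dropped[name] = copies - usable
    return placement, dropped


if __name__ == "__main__":
    # Three copies requested, but only two nodes remain after a removal.
    print(place_index_copies({"idx_a": 3, "idx_b": 1}, ["node1", "node2"]))
```

Re-running the same function with a third node in the list places all three copies of idx_a, which mirrors the way replica numbers are made up once Index Service nodes are added.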

During rebalance, no index node is removed until index-building has completed on alternative nodes. This ensures uninterrupted access to indexes.

Search Service

The Search Service automatically partitions its indexes across all Search nodes in the cluster, ensuring that during rebalance, the distribution across all nodes is balanced.

Query Service

The addition or removal of Query Service nodes during rebalance takes effect immediately. An added node is immediately available to serve queries. A removed node is immediately unavailable, so ongoing queries are interrupted; applications must handle the resulting errors or timeouts.
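One common way to handle such errors at application level is to retry the query with a short backoff. The sketch below is generic and does not use any SDK: run_query is a hypothetical zero-argument callable supplied by the application, and the choice of ConnectionError and TimeoutError as retryable errors is an assumption. Retrying is only safe for queries that can be repeated without side effects.

```python
import time


def run_with_retry(run_query, attempts=3, base_delay=0.1,
                   retry_on=(ConnectionError, TimeoutError),
                   sleep=time.sleep):
    """Call run_query(), retrying on the given errors with exponential backoff.

    Raises the last error if every attempt fails.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return run_query()
        except retry_on as err:
            last_error = err
            if attempt < attempts - 1:
                # Delay doubles on each retry: base, 2 * base, 4 * base, ...
                sleep(base_delay * 2 ** attempt)
    raise last_error
```

The sleep parameter is injectable so that the backoff can be replaced in tests; in normal use it can be left at its default.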

Eventing Service

When an Eventing Service node has been added or removed, rebalance causes vBucket processing ownership to be redistributed among available Eventing Service nodes. After rebalance, the service continues to process mutations: checkpoint information ensures that no mutations are lost.
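The role of checkpoints when ownership changes can be pictured with a toy model. This is not how the Eventing Service is implemented: the function name, the use of a per-vBucket sequence number as the checkpoint, and the in-memory dictionary are assumptions made purely for illustration.

```python
def process_from_checkpoint(mutations, checkpoints, vb, handler):
    """Process mutations for one vBucket, starting after its checkpoint.

    mutations:   ordered list of (sequence_number, document) for vBucket vb.
    checkpoints: dict of vBucket ID -> last processed sequence number.
    handler:     callable invoked once for each unprocessed mutation.
    Returns the sequence number of the last processed mutation.
    """
    last = checkpoints.get(vb, 0)
    for seqno, doc in mutations:
        if seqno <= last:
            continue  # already handled by the previous owner
        handler(vb, seqno, doc)
        checkpoints[vb] = seqno  # advance the checkpoint after handling
        last = seqno
    return last
```

If one node handles the first part of a vBucket's stream and another node later takes ownership with the same checkpoint information, the new owner skips what was already handled and picks up at the next mutation.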

Analytics Service

The Analytics Service uses shadow data: a single copy of a subset of the data maintained by the Data Service. The shadow data is not replicated; instead, its single copy is partitioned across all cluster nodes that run the Analytics Service. If an Analytics node is permanently removed or replaced, all shadow data must be rebuilt if and when the Analytics Service is restarted.

If no Analytics Service node has been removed or replaced, shadow data is not affected by rebalance. As a result of rebalance, the Analytics Service receives an updated cluster map and continues to work with the modified vBucket topology.