Rebalance redistributes data, indexes, event processing, and query processing among available nodes.
When one or more nodes have been brought into a cluster (either by adding or joining), or have been taken out of a cluster (either through Removal or Failover), rebalance redistributes data, indexes, event processing, and query processing among available nodes. The cluster map is correspondingly updated and distributed to clients. The process occurs while the cluster continues to service requests for data.
Each rebalance proceeds in sequential stages. Each stage corresponds to a Couchbase Service, deployed on the cluster. Therefore, if all services have been deployed, there are seven stages in all — one each for the Data, Query, Index, Search, Eventing, Backup, and Analytics services. When all stages have been completed, the rebalance process itself is complete.
On rebalance, vBuckets are redistributed evenly among currently available Data Service nodes. After rebalance, operations are directed to active vBuckets in their updated locations. Rebalance does not interrupt applications' data-access. vBucket data-transfer occurs sequentially: therefore, if rebalance stops for any reason, it can be restarted from the point at which it was stopped.
Note the special case provided by Swap Rebalance, where the number of nodes coming into the cluster is equal to the number of nodes leaving the cluster, ensuring that data is only moved between these nodes.
If nodes have been removed such that the desired number of replicas can no longer be supported, rebalance provides as many replicas as possible. For example, if four Data Service nodes previously supported one bucket with three replicas, and the Data Service node-count is reduced to three, rebalance provides two replicas only. If and when the missing Data Service node is restored or replaced, rebalance will provide three replicas again.
See Intra-Cluster Replication, for information on how data is distributed across nodes.
During the Data Service rebalance stage, vBuckets are moved in phases. The phases — which differ, depending on whether the vBucket is an active or a replica vBucket — are described below.
The phases through which rebalance moves a replica vBucket are shown by the following illustration.
The move has two principal phases. Phase 1 is Backfill. Phase 2 is Book-keeping.
Phase 1, Backfill, itself consists of two subphases. The first subphase comprises the movement of the replica vBucket data from its node of origin to the memory of the destination node. The second subphase comprises the writing of the replica vBucket data from the memory to the disk of the destination node. The time required for this second subphase, which only applies to Couchbase Buckets, is termed Persistence Time. The time required for the entire Backfill process, including Persistence Time, is termed Backfill Time.
Phase 2, Book-keeping, comprises various ancillary tasks required for move-completion.
The total time required for the move is calculated by adding Backfill Time to the time required for Phase 2, Book-keeping; and is termed Move Time.
The phases in which rebalance moves an active vBucket are shown by the following illustration.
The move has four principal phases. Phase 1, Backfill, and Phase 2, Book-keeping, are identical to those required for replica vBuckets; except that the Book-keeping phase includes additional Persistence Time.
Phase 3, Active Takeover, comprises the operations required to establish the relocated vBucket as the new active copy. The time required for Phase 3 is termed Takeover Time.
Phase 4, Book-keeping, comprises a final set of ancillary tasks, required for move-completion.
The total time for the move is termed Move Time.
Since vBucket moves are highly resource-intensive, Couchbase Server allows the concurrency of such moves to be limited: a setting is provided that determines the maximum number of concurrent vBucket moves permitted on any node.
The minimum value for the setting is
1, the maximum
64, the default
A move counts toward this restriction only when in the backfill phase, as described above, in Data Service Rebalance Phases. The move may be of either an active or a replica vBucket. A node’s participation in the move may be as either a source or a target.
For example, if a node is at a given time the source for two moves in backfill phase, and is the target for two additional moves in backfill phase, and the setting stands at
4, the node may participate in the backfill phase of no additional moves, until at least one of its current moves has completed its backfill phase.
A higher setting may improve rebalance performance, at the cost of higher resource consumption; in terms of CPU, memory, disk, and bandwidth. Conversely, a lower setting may degrade rebalance performance, while freeing up such resources. Note, however, that rebalance performance can be affected by many additional factors; and that in consequence, changing this parameter may not always have the expected effects. Note also that a higher setting, due to its additional consumption of resources, may degrade the performance of other systems, including the Data Service.
Couchbase Server creates a report on every rebalance that occurs. The report contains a JSON document, which can be inspected in any browser or editor. The document provides summaries of the concluded rebalance activity, as well as details for each of the vBuckets affected: in consequence, the report may be of considerable length.
On conclusion of a rebalance, its report can be accessed in any of the following ways:
By means of Couchbase Web Console, as described in Add a Node and Rebalance.
By means of the REST API, as described in Getting Cluster Tasks.
By accessing the directory
/opt/couchbase/var/lib/couchbase/logs/rebalanceon any of the cluster nodes. A rebalance report is maintained here for (up to) the last five rebalances performed. Each report is provided as a
*.jsonfile, whose name indicates the time at which the report was run — for example,
A complete account of the report-content is provided in the Rebalance Reference.
Rebalance affects different services differently. The effects on services other than the Data Service are described below.
The Index Service maintains a cluster-wide set of index definitions and metadata, which allows the redistribution of indexes and index replicas during a rebalance.
The rebalance process takes account of nodes' CPU and RAM utilization, and achieves the best resource-balance possible. Note that rebalance does not move indexes or replicas: instead, it rebuilds them in their new locations, using the latest data from the Data Service.
In Couchbase Server 7.0 and later, the index redistribution setting enables you to specify how Couchbase Server redistributes indexes on rebalance. The setting may be established by means of the Couchbase Web Console or the REST API.
The setting affects how indexes are redistributed in the following scenarios:
- Rebalance after an index node is added
If the setting is enabled, existing partitioned and non-partitioned indexes are placed optimally across all index nodes in the cluster, including any new index nodes being added. If the setting is disabled, only partitioned indexes are redistributed.
- Rebalance after a non-index node is added or removed
If the setting is enabled, partitioned and non-partitioned indexes are moved from heavily loaded nodes to nodes with free resources to achieve balanced distribution. If the setting is disabled, only partitioned indexes are redistributed.
- Rebalance during index server group repair
With multiple server groups, when a group failure leads to all replicas being placed in a single server group, if the setting is enabled, partitioned and non-partitioned indexes are redistributed to ensure high availability across the server groups after repair. If the setting is disabled, only partitioned indexes are redistributed.
The setting does not affect how indexes are redistributed in the following scenarios:
- Rebalance when an index node is removed
Partitioned and non-partitioned indexes are moved on rebalance from removed nodes to nodes that continue as part of the cluster. Indexes that reside on non-removed nodes are unaffected by rebalance.
- Rebalance when index nodes are added and removed
During a swap rebalance, indexes from ejected nodes are placed on the nodes being added.
If more index replicas exist than can be handled by the number of existing nodes, replicas are dropped: the numbers are automatically made up subsequently, if additional Index Service nodes are added to the cluster.
During rebalance, no index node is removed until index-building has completed on alternative nodes. This ensures uninterrupted access to indexes.
The Search Service automatically partitions its indexes across all Search nodes in the cluster, ensuring optimal distribution, following rebalance. To achieve this, in all versions prior to 7.0.2, partitions needing to be newly created are entirely built, on their newly assigned nodes.
In 7.0.2+, this remains the default way of creating new partitions during rebalance. However, partition-creation by file transfer is also provided, as a significant performance enhancement. This option allows files to be transferred from old nodes to new; so that overheads associated with partition build are avoided.
This is an Enterprise-only feature, and requires all Search-Service nodes to be running 7.0.2+. Community Edition clusters that are upgraded to Enterprise Edition 7.0.2+ can have this feature switched on, subsequent to upgrade.
During file transfer, should an unresolvable error occur, file transfer is automatically abandoned, and partition build is used instead.
The file-transfer feature can be enabled and disabled by means of the 7.0.2+ REST API. See Rebalance Based on File Transfer.
The addition or removal of Query Service nodes during rebalance is immediately effective: an added node is immediately available to serve queries; while a removed node is immediately unavailable, such that ongoing queries are interrupted, requiring the handling of errors or timeouts at application-level.
When an Eventing Service node has been added or removed, rebalance causes the mutation (vBucket processing ownership) and timer event processing workload to be redistributed among available Eventing Service nodes. The eventing service continues to process mutations both during and after rebalance. Checkpoint information ensures that no mutations are lost.
The Analytics Service uses shadow data, which is a single copy of a subset of the data maintained by the Data Service. The shadow data is not replicated; however, its single copy is partitioned across all cluster nodes that run the Analytics Service. If an Analytics node is permanently removed or replaced, all shadow data must be rebuilt, if and when the Analytics Service is restarted.
If no Analytics Service node has been removed or replaced, shadow data is not affected by rebalance. In consequence of rebalance, the Analytics Service receives an updated cluster map, and continues to work with the modified vBucket-topology.
A rebalance causes the scheduler for the Backup Service to stop running. This means that no new backup tasks are triggered until the rebalance has concluded; at which point, the scheduler restarts, and reconstructs the task schedule. Then, the triggering of Backup Service tasks is resumed.
Note that a rebalance has the effect of restarting the Backup Service whenever the service has previously been stopped, due to loss of its leader: for information, see Backup-Service Architecture.
Rebalance failures can optionally be responded to automatically, with up to 3 retries. The number of seconds required to elapse between retries can also be configured. For information on configuration options, see General Settings. For information on failure-notifications, and options for cancelling rebalance-retries, see Automated Rebalance Failure Handling.