Rebalancing the Cluster

A cluster is rebalanced with the POST /controller/rebalance HTTP method and URI.

Description

Clusters can be rebalanced. When one or more nodes have been brought into a cluster, or have been taken out of a cluster, rebalance redistributes data and indexes among the available nodes. The cluster map is correspondingly updated and distributed to clients. The process occurs while the cluster continues to service requests for data.

A conceptual overview of nodes, and how they are combined into clusters, is provided in Nodes.

HTTP method and URI

POST /controller/rebalance

Curl Syntax

curl -v -X POST -u [admin]:[password]
  http://[localhost]:8091/controller/rebalance
  [-d knownNodes | -d ejectedNodes]

The parameters are:

  • knownNodes. Must always be specified. The value is a list containing all nodes that are in the cluster at the start of the rebalance process.

  • ejectedNodes. Specified when one or more nodes are to be ejected from the cluster by the rebalance process. The value is a list containing all nodes to be ejected.

    The node-list specified by ejectedNodes is thus a subset of that specified by knownNodes.

For information on rebalance, see Rebalance.

Response

Success gives the status 200 OK, with no object returned.

If either parameter specifies an unknown node, or if a node currently in the cluster is omitted from the value of knownNodes, the status 400 Bad Request is given, and the following object is returned:

{"mismatch":1}

If knownNodes is not specified and ejectedNodes is specified, the status 400 Bad Request is given, and the following object is returned:

{"empty_known_nodes":1}

Examples

The following examples demonstrate how to determine which nodes are currently in the cluster; how to rebalance after a hard failover; how to rebalance in order to eject a node from the cluster; and how to re-add an ejected node to a cluster.

For information on how to retrieve status on an ongoing rebalance, see Getting Rebalance Progress and Getting Cluster Tasks.

Return Cluster Nodes

To determine which nodes are currently in the cluster, use the GET /pools/default HTTP method and URI. In this example, the (extensive) output is piped to the jq tool to be formatted, and the lines featuring the hostname are then filtered by grep, to ensure readability.

curl  -u Administrator:password -X GET \
http://10.143.190.101:8091/pools/default | jq '.' | grep hostname

The output contains the following lines, which specify all nodes currently in the cluster:

"hostname": "10.143.190.101:8091",
"hostname": "10.143.190.102:8091",
"hostname": "10.143.190.103:8091",

For more information on this method and URI, see Viewing Cluster Details.

Perform Rebalance Following a Hard Failover

The hard failover of a node results in that node continuing to be a member of the cluster, but being unable to serve data. Since its active vBuckets are unavailable, corresponding replica vBuckets are promoted to active status on the still-available nodes. This means that data continues to be served, but is in an imbalanced state across the cluster. This requires rebalance.

For detailed conceptual information, see Hard Failover. For information on performing a hard failover, see Failing over Nodes.

To perform the rebalance, specify all cluster-nodes in the value for the knownNodes parameter, including the one or ones to which hard failover is known to have been applied:

curl  -u Administrator:password -v -X POST \
http://10.143.190.101:8091/controller/rebalance \
-d 'knownNodes=ns_1%4010.143.190.101%2Cns_1%4010.143.190.102%2Cns_1%4010.143.190.103'

On success, the status 200 OK is given, and no object is returned. The failed-over node has been removed, and the data is now distributed evenly across the surviving nodes. The successful node-removal is confirmed by again examining the current cluster nodes:

curl  -u Administrator:password -X GET \
http://10.143.190.101:8091/pools/default | jq '.' | grep hostname

The output indicates which nodes are still in the cluster:

"hostname": "10.143.190.101:8091",
"hostname": "10.143.190.102:8091",

This confirms that 10.143.190.103 has been removed.

Add a Node to a Cluster, then Rebalance

Adding a node to a cluster is a two-step process.

  1. The POST /controller/addNode HTTP method and URI are used to add the node. This allows service-deployment for the node to be specified. A placeholder username and password can be specified, when adding an unprovisioned node.

  2. The POST /controller/rebalance HTTP method and URI are used to rebalance the added node into the cluster. Include the new node in the knownNodes node-list.

For example, the following command adds node 10.143.1990.103 to the cluster from which it was removed, and assigns it the Data Service:

curl -u Administrator:password -X POST \
10.142.181.101:8091/controller/addNode \
-d 'hostname=10.143.190.103&user=someName&password=somePassword&services=kv'

If successful, this returns the following object, indicating that the node is now recognized as a member of the cluster:

{"otpNode":"ns_1@10.143.190.103"}

For more information on this method and URI, see Adding Nodes to Clusters.

Next, the added node is rebalanced into the cluster. This allows it to take its share of the data-distribution.

curl  -u Administrator:password -v -X POST \
http://10.143.190.101:8091/controller/rebalance \
-d 'knownNodes=ns_1%4010.143.190.101%2Cns_1%4010.143.190.102%2Cns_1%4010.143.190.103'

On success, the response code 200 OK is given, and no object is returned. The cluster is now rebalanced. At the conclusion, the cluster can again be checked for its current membership:

curl  -u Administrator:password -X GET \
http://10.143.190.101:8091/pools/default | jq '.' | grep hostname

The output now includes the following:

"hostname": "10.143.190.101:8091",
"hostname": "10.143.190.102:8091",
"hostname": "10.143.190.103:8091",

This confirms that 10.143.190.103 has been rebalanced into the cluster.

Eject a Node

To eject a node, use the POST /controller/rebalance HTTP method and URI. Specify the entire current node-list for the cluster as the value of the knownNodes parameter. Specify the list of nodes to be ejected as the value of the ejectedNodes parameter.

For example, the following command ejects node 10.143.190.103 from the cluster:

curl  -u Administrator:password -v -X POST \
http://10.143.190.101:8091/controller/rebalance \
-d 'ejectedNodes=ns_1%4010.143.190.103' \
-d 'knownNodes=ns_1%4010.143.190.101%2Cns_1%4010.143.190.102%2Cns_1%4010.143.190.103'

On success, the response code 200 OK is given, and no object is returned. At the conclusion, the cluster can again be checked for its current membership:

curl  -u Administrator:password -X GET \
http://10.143.190.101:8091/pools/default | jq '.' | grep hostname

The output now includes the following:

"hostname": "10.143.190.101:8091",
"hostname": "10.143.190.102:8091",

Adjusting Rebalance During Compaction

Description

If a rebalance is performed while a node is undergoing index compaction, rebalance delays may be experienced. The parameter, rebalanceMovesBeforeCompaction, is used to improve rebalance performance: potentially, this results in a larger index. This setting can be modified with the POST /internalSettings endpoint. By default, it is 64. This specifies that 64 vBuckets are to be moved per node; at which point all vBucket movement is paused, and index compaction is triggered. Since index compaction is therefore not performed while vBuckets are being moved, a large rebalanceMovesBeforeCompaction value results in the server spending less time compacting indexes; potentially resulting in larger index files, which take up more disk space.

For example:

curl -X POST -u Administrator:password 'http://10.5.2.54:8091/internalSettings' \
    -d 'rebalanceMovesBeforeCompaction=256'

See Also

For conceptual information on rebalance, see Rebalance. For information on how to retrieve status on an ongoing rebalance, see Getting Rebalance Progress and Getting Cluster Tasks.

For conceptual information on hard failover, see Hard Failover. For information on performing a hard failover with the REST API, see Failing over Nodes. For information on retrieving details of a cluster, including its current nodes, see Viewing Cluster Details.