Node Availability

Auto-failover automatically fails over a node identified as unresponsive or unavailable.

Only Full Administrators and Cluster Administrators can enable or disable auto-failover.

This page explains how to configure auto-failover. See Automatic Failover for detailed conceptual information.

Using the UI

The Node Availability panel is as follows:

settings autofailover

The following checkboxes are provided:

  • Enable auto-failover after x seconds for up to y event: After the timeout period set here as x seconds has elapsed, an unresponsive or malfunctioning node is failed over, provided that the limit on actionable events set here as y has not yet been reached. Data replicas are promoted to active on other nodes, as appropriate. This feature can only be used when three or more nodes are present in the cluster. The number of seconds to elapse is configurable: the default is 120; the minimum permitted is 5; the maximum 3600. This option is selected by default.

  • Enable auto-failover for sustained data disk read/write failures after z seconds: After the timeout period set here as z seconds has elapsed, a node is failed over if it has experienced sustained data disk read/write failures. The timeout period is configurable: the default length is 120 seconds; the minimum permitted is 5; the maximum 3600. This checkbox can only be checked if Enable auto-failover after x seconds for up to y event has also been checked.

  • Enable auto-failover of server groups: Server-group failover is enabled. This checkbox (which can only be checked if Enable auto-failover after x seconds for up to y event has also been checked) should only be checked if three or more server groups have been established, and capacity is available to absorb the combined load of all potentially failed-over groups. For information on groups and Server Group Awareness, see Server Group Awareness.

The Node Availability screen contains the following, additional option, which is available for Ephemeral Buckets:

ephemeralBucketsReprovisioningInterface

Checking this checkbox ensures that if a node containing active Ephemeral Buckets becomes unavailable, its replicas on the specified number of other nodes are promoted to active status as appropriate, to avoid data-loss. Note, however, that this may leave the cluster in an unbalanced state, requiring a rebalance.

Using the CLI

The automatic failover settings can be changed using the setting-autofailover command in couchbase-cli.

Using the REST API

The automatic failover settings can be changed using the REST API /settings/autoFailover endpoint.

Below is an example of changing the automatic failover settings using the REST API:

curl -i -X POST -u Administrator:password http://10.142.180.103:8091/settings/autoFailover \
  -d 'enabled=true&timeout=72' \
  -d 'failoverServerGroup=true&maxCount=2' \
  -d 'failoverOnDataDiskIssues[enabled]=true&failoverOnDataDiskIssues[timePeriod]=89'