A newer version of this documentation is available.

View Latest

Delta node recovery

Delta node recovery permits nodes to be re-added to the cluster and incrementally caught up with the data changes.

Delta node recovery permits a failed over node to be recovered by re-using the data on its disk and then resynchronizing the data based on the delta change. The failed over node is checked to identify the point when data mutations stopped and resynchronized beginning at that point. The server node catches up on data mutations for its vBuckets and starts serving data. Because the original data and data buckets are retained, the cluster starts functioning with minimal downtime. This operation improves recovery time and network resource usage. Server nodes are removed from clusters under many circumstances. The following are circumstances (among many others) where a server node might be re-added to the cluster after being failed over.

  • Node goes down for a short period of time

  • Routine maintenance is scheduled

  • Network connectivity is briefly disrupted

When a node is failed over, data files are preserved. The data files are used either for Couchbase support, data recovery, or delta node recovery.

In the process of failing over a node, performing maintenance, adding the node back into the cluster, and rebalancing, data is recovered via either full recovery mode or delta recovery mode. With delta recovery mode, Couchbase detects (with the Database Change Protocol) which data files are up-to-date and which are out-of-date and then, during rebalance, the existing data files on the failed over server node are retained and the out-of-date files are updated.

From the web console, the Delta Recovery and Full Recovery options display after the server node is failed over. Both recovery methods add the server node back into the cluster during the rebalance operation, however, full recovery removes the node’s data prior to the rebalance and delta recovery schedules the node’s existing data to be re-used.

Delta recovery requirements:

  • A healthy server node and a healthy state for the cluster.

  • The server node is failed over. Delta recovery is not possible for a rebalance-in operation (add server) or rebalance-out operation (remove server).

  • Delta recovery must be possible for all buckets. For example, if delta recovery is possible for a subset of buckets but not possible for another subset of buckets, then the Couchbase cluster does not permit a rebalance operation.

  • Because delta recovery relies on the existing data files on the failed over server node’s disk, the exact same set of buckets must be transferred to the failed over server node.

Delta recovery characteristics:

  • Data files are "warmed up" into memory. Warmed up into memory means that data is loaded into memory. As a minimum, depending on whether metadata is retained in or not, all data file keys are loaded from disk prior to the rebalance operation.

  • Indexes must be rebuilt on the server node that is being re-added.

  • Use in deployments where data size is much greater than RAM size, bucket eviction is set to full eviction (metadata is not retained in memory), and indexes are not defined.

An environment with a large data footprint might use delta node recovery when re-adding a failed over server node.