Intra-Cluster Replication

March 23, 2025

+ 12

Intra-cluster replication replicates data across the nodes of a cluster, by means of the Database Change Protocol.

Intra-Cluster Replication

Intra-cluster replication involves replicating data across the nodes of a cluster.

Couchbase Data is defined logically to reside in Buckets. Each bucket, when defined, is implemented by Couchbase Server as vBuckets, up to 1024 of which thereby exist in memory (and, in the case of Couchbase Buckets, on disk); the exact number at any given time depending on the number of items to be stored. Items are associated with vBuckets by means of a hashing algorithm, and buckets are assigned to nodes according to a fixed mapping, determined and updated by the Master Services of the ns-server component of the Cluster Manager.

Up to three replica buckets can be defined for every bucket. Each replica itself is also implemented as 1024 vBuckets. A vBucket that is part of the original implementation of the defined bucket referred to as an active vBucket. Therefore, a bucket defined with two replicas has 1024 active vBuckets and 2048 replica vBuckets. Typically, only active vBuckets are accessed for read and write operations: although replica vBuckets are able to support read requests. Nevertheless, vBuckets receive a continuous stream of mutations from the active vBucket by means of the Database Change Protocol (DCP), and are thereby kept constantly up to date. To ensure maximum availability of data in case of node-failures, the Master Services for the cluster calculate and implement the optimal vBucket Distribution across available nodes: consequently, the chance of data-loss through the failure of an individual node is minimized, since replicas are available on the nodes that remain. This is shown by the following illustration:

The illustration shows active and replica vBuckets that correspond to a single, user-defined bucket, for which a single replication-instance has been specified. The first nine active vBuckets are shown, along with their nine corresponding, replica vBuckets; distributed across three server-nodes. The distribution of vBuckets indicates a likely distribution calculated by Couchbase Server: no replica resides on the same node as its active equivalent: therefore, should any one of the three nodes fail, its data remains available.

When a node becomes unavailable, failover can be performed: meaning that the Cluster Manager is instructed to read and write data only on those cluster-nodes that are still available. Failover can be performed by manual intervention, or automatically: as part of this process, where required, replica vBuckets are promoted to active status. Automatic Failover can be performed for one node at a time, up to a configurable number of times, the maximum being three. Automatic failover never occurs where data-loss might result.

For information on configuring replicas, see Create a Bucket. As a rule, configure one replica for a cluster of up to five nodes; one or two replicas for from five to ten nodes; and one, two, or three replicas for over ten nodes.

Note that intra-cluster availability can further be optimized by defining Groups.

Database Change Protocol (DCP)

Database Change Protocol (DCP) is the protocol used to stream bucket-level mutations. DCP is used for high-speed replication of mutations; to maintain replica vBuckets, incremental MapReduce, Global Secondary Indexes (GSIs), XDCR, backups, and external connections.

DCP is a memory-based replication protocol that is ordering, resumable, and consistent. DCP streams changes made in memory to items, by means of a replication queue.

DCP involves the following concepts:

Application client: A client external to the cluster. Transmits read, write, update, delete, and query requests to the server-cluster. See Compression, for an illustration.
DCP client: A Couchbase client that is either internal or external to the cluster. Streams data from one or more Couchbase Server-nodes, in support of intra-cluster replication, indexing, XDCR, services, incremental backup, connectors, and mobile synchronicity. See Compression, for an illustration.
Failover log: A list of previously known vBucket-versions, for a vBucket. If a client connects to a server, and was previously connected to a vBucket-version other than the current one, the failure log is used to find a rollback point.
History branch: If a replica vBucket does not contain the latest changes, but becomes active, its record of changes (referred to as a history branch) is recognized by DCP clients; and the vBucket duly re-versioned and updated. Such action, if required, must occur before any further major operation (such as a node addition or removal, or a swap-rebalance) can be performed.
Mutation: An event that deletes a key, or changes the value to which a key points. Mutations occur when transactions such as create, update, delete, or expire are executed.
Rollback point: The commencement of a recognized history branch, indicating the point in the mutation-sequence from which a vBucket must be brought up to date.
Sequence number: Given to each mutation to a vBucket, so that the sequence of mutations is recorded, and can be referenced by interested processes. The later the mutation, the higher the number. The sequence is strictly relevant to the vBucket, and does not provide a cluster-wide ordering of events.
Snapshot: A representation of the exact state of mutations on either a write queue or in storage, provided by Couchbase Server to an interested client.
vBucket stream: A grouping of messages related to receiving mutations for a specific vBucket. This includes mutation, deletion, expiration, and snapshot-marker messages.
vBucket version: A Universally unique identifier (UUID) and sequence-number pair associated with a vBucket. A new version is assigned to a vBucket by the new master-node whenever a history branch is recognized. The UUID is a randomly generated number; and the sequence number is the one last processed by the vBucket, at the time the version was created.