Cross Data Center Replication (XDCR)
Cross Data Center Replication (XDCR) allows data to be replicated across clusters that are potentially located in different clouds and different data centers.
XDCR can protect against data-center failure, and can also provide high-performance access to data for globally distributed, mission-critical applications. Once established, a replication continuously replicates data until it is paused or deleted.
Replication Sources and Destinations
XDCR replicates data from a specific bucket on a source cluster to a specific bucket on a destination (or target) cluster.
If the source and/or destination buckets are deactivated, the replication will be removed.
Data from the source bucket is pushed to the destination bucket by an XDCR agent, running on the source cluster, using the Database Change Protocol (DCP). Any bucket on a cluster can be specified as a source or a destination for one or more XDCR replications.
Replication sources and destinations can be on clusters in an organization. The source cluster can be in a different project from the destination cluster.
You cannot create a replication between clusters that are within two different cloud providers.
About Scopes and Collections
All Capella clusters have the option to replicate data from a specific scope and collection on the source cluster to a specific scope and collection on a destination cluster. If identically named scopes and collections have been defined on the source and destination, data can optionally be replicated from each scope and collection on the source to each corresponding, identically named scope and collection on the destination. Alternatively, replication can be explicitly configured to occur between differently named scopes and collections.
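The two mapping modes described above can be illustrated with a minimal Python sketch. This is a conceptual model only, not a Capella API: the function name `resolve_target`, the `"scope.collection"` string convention, and the sample names are all assumptions made for illustration.

```python
from typing import Dict, Optional, Set


def resolve_target(
    source: str,
    destination_collections: Set[str],
    explicit_rules: Optional[Dict[str, str]] = None,
) -> Optional[str]:
    """Return the destination collection for `source`, or None if not replicated.

    Collections are written as "scope.collection" (an illustrative convention).
    """
    if explicit_rules is not None:
        # Explicit mapping: only sources listed in the rules are replicated,
        # possibly to a differently named scope and collection.
        return explicit_rules.get(source)
    # Implicit mapping: replicate only when an identically named
    # scope and collection exists on the destination.
    return source if source in destination_collections else None
```

For example, under these assumptions `resolve_target("inventory.airline", {"inventory.airline"})` keeps the same name, while passing `explicit_rules={"inventory.airline": "archive.airline_copy"}` redirects the data to a differently named collection.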
About Intra-cluster XDCR
XDCR is traditionally used for inter-cluster replication: the source bucket and the destination bucket are on different clusters. However, Capella also supports intra-cluster XDCR — where the source bucket and the destination bucket are on the same cluster.
To set up intra-cluster XDCR, you will specify the source cluster as the destination cluster when creating a replication.
What Can Be Replicated
XDCR only replicates bucket data; it does not replicate indexes. Indexes must be replicated manually, or by administrator-provided automation. When the index-definitions are pushed to the destination cluster, the indexes are regenerated there.
When encountered on the source cluster, non-UTF-8 encoded document IDs are automatically filtered out of replication: they are therefore not transferred to the destination cluster. For each such ID, the warning output is written to xdcr_errors.* log files on the source cluster.
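To check ahead of time whether a document ID would be affected, a UTF-8 validity test along the following lines may help. This is a sketch of the general technique in Python, not the check XDCR itself performs; the function name `is_utf8_id` is an assumption.

```python
def is_utf8_id(doc_id: bytes) -> bool:
    """Return True if the raw document ID bytes are valid UTF-8."""
    try:
        doc_id.decode("utf-8")
        return True
    except UnicodeDecodeError:
        # IDs like this would be filtered out of replication.
        return False
```

An application could run such a check on its keys before writing them, so that no document is silently left out of a replication.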
Replication Direction
XDCR can occur between source and destination clusters in either of the following ways:
- Unidirectional (One Way): The data contained in a specified source bucket is replicated to a specified destination bucket. Although the replicated data on the destination cluster could be used for the routine serving of data, it is in fact intended principally as a backup, to support disaster recovery.
- Bidirectional (Two Way): The data contained in a specified source bucket is replicated to a specified destination bucket; and the data contained in the destination bucket is, in turn, replicated back to the source bucket. This allows both buckets to be used for the serving of data, which may provide faster data access for users and applications in remote geographies.
Technically, XDCR only performs unidirectional replication. A bidirectional topology is created by implementing two unidirectional replications, in opposite directions, between two clusters; such that a bucket on each cluster functions as both source and destination.
When creating a replication from one Capella cluster to another, you will specify whether to make the replication bidirectional. If left unspecified, the replication will be unidirectional from the source bucket to the destination bucket and will appear under the Replication tab of the source cluster. If the replication is configured to be bidirectional, the replication will appear under the Replication tab of both the source cluster and the destination cluster.
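The decomposition of a bidirectional topology into two unidirectional replications can be pictured with a short Python sketch. It is a conceptual model only: the `Replication` class, the `plan_replications` function, and the cluster and bucket names are hypothetical and do not correspond to any Capella API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replication:
    """One unidirectional replication (hypothetical model)."""
    source_cluster: str
    source_bucket: str
    destination_cluster: str
    destination_bucket: str


def plan_replications(
    cluster_a: str,
    bucket_a: str,
    cluster_b: str,
    bucket_b: str,
    bidirectional: bool = False,
) -> List[Replication]:
    """Return the unidirectional replications that make up the topology."""
    forward = Replication(cluster_a, bucket_a, cluster_b, bucket_b)
    if not bidirectional:
        # Default: a single replication from source to destination.
        return [forward]
    # Bidirectional: add a second replication in the opposite direction,
    # so each bucket acts as both source and destination.
    reverse = Replication(cluster_b, bucket_b, cluster_a, bucket_a)
    return [forward, reverse]
```

In this model, `plan_replications("east", "travel", "west", "travel", bidirectional=True)` yields two entries, one per direction.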
When creating a replication from Capella to a self-managed cluster (a cluster established outside Capella), the replication can only be specified as unidirectional. This is because a replication from the self-managed cluster to Capella cannot be entirely configured on Capella: it must be partially configured on the self-managed cluster itself. For instructions, see Create a Replication to Capella from a Self-Managed Cluster.
Learn more about XDCR direction and topology in the Couchbase Server documentation.
Replication Filtering
XDCR filtering allows specified subsets of documents to be replicated from the source bucket. A document can be included in or excluded from a filtered replication, based on the document’s fields and values.
When a replication starts, the cluster examines the specified source bucket, and determines which documents to replicate:
- If XDCR filtering is not applied, each document in the source bucket is replicated to the target.
- If XDCR filtering is applied, each document in the source bucket is examined; but only those documents that meet the specified filtering criteria are replicated.
XDCR filters are configured when creating a replication. For more information on filters, see the Couchbase Server documentation.
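The selection logic described above can be sketched in Python as follows. This is a simplified illustration, not the XDCR filter-expression syntax: the `DocFilter` type, the `select_for_replication` function, and the sample `french_airlines` criterion are all hypothetical.

```python
import re
from typing import Callable, Dict, Iterable, List, Optional, Tuple

Document = Dict[str, object]
# Hypothetical filter type: takes a document ID and body,
# and returns True if the document should be replicated.
DocFilter = Callable[[str, Document], bool]


def select_for_replication(
    docs: Iterable[Tuple[str, Document]],
    doc_filter: Optional[DocFilter] = None,
) -> List[str]:
    """Return the IDs of the documents that would be replicated."""
    selected = []
    for doc_id, body in docs:
        # Without a filter, every document is replicated; with one,
        # each document is examined and kept only if it matches.
        if doc_filter is None or doc_filter(doc_id, body):
            selected.append(doc_id)
    return selected


def french_airlines(doc_id: str, body: Document) -> bool:
    """Sample criterion: ID starts with 'airline_' and country is 'France'."""
    return re.match(r"^airline_", doc_id) is not None and body.get("country") == "France"
```

Calling `select_for_replication(docs)` with no filter returns every ID, whereas `select_for_replication(docs, french_airlines)` returns only the IDs of documents that match both the ID pattern and the field value.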
Conflict Resolution
Conflicts may arise, especially when bidirectionally replicated data is being modified by applications in different locations: a document may be modified differently on each side at more or less the same time, and the difference must then be resolved. XDCR provides two types of conflict resolution, based on either sequence number or timestamp, so that conflicted data can be saved consistently on source and target.
- Sequence Number: Conflicts can be resolved by referring to documents' sequence numbers. Sequence numbers are maintained per document, and are incremented on every document-update. The sequence numbers of source and target documents are compared, and the document with the higher sequence number prevails.
- Timestamp: Timestamp-based conflict resolution uses the document timestamp (stored in the CAS) to resolve conflicts. The timestamps associated with the most recent updates of source and target documents are compared. The document whose updates have the more recent timestamp prevails.
The type of conflict resolution that is used for a given replication is determined by the conflict resolution policy that is configured on the source and destination buckets. Conflict resolution is configured on a per-bucket basis at bucket creation time, and cannot be changed later.
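The two comparison rules can be summarized in a short Python sketch. It models only the primary comparison described above; the `DocMeta` class, the `source_wins` function, and the policy names are hypothetical, and the tie-handling shown (the target document is kept when the values are equal) is an assumption of this sketch rather than documented behavior.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DocMeta:
    """Hypothetical per-document metadata used for conflict resolution."""
    seqno: int      # incremented on every document update
    timestamp: int  # time of the most recent update (held in the CAS)


def source_wins(source: DocMeta, target: DocMeta, policy: str) -> bool:
    """Return True if the source document should overwrite the target."""
    if policy == "seqno":
        # The document with the higher sequence number prevails.
        return source.seqno > target.seqno
    if policy == "timestamp":
        # The document with the more recent timestamp prevails.
        return source.timestamp > target.timestamp
    raise ValueError(f"unknown conflict resolution policy: {policy}")
```

The sketch shows why the two policies can disagree: a document updated many times long ago has a high sequence number but an old timestamp, so it prevails under sequence-number resolution and loses under timestamp resolution.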
Learn more about XDCR conflict resolution in the Couchbase Server documentation.
Administering Replications
For information on how to create, observe, pause, and delete replications, refer to Manage Replications.