Cross Data Center Replication (XDCR)

    Cross Data Center Replication (XDCR) allows data to be replicated across clusters that are potentially located in different clouds and different data-centers.

    XDCR can protect against data-center failure, and can provide high-performance access to data for globally distributed, mission-critical applications. Once established, a replication runs continuously until it is paused or deleted.

    Replication Sources and Destinations

    XDCR replicates data from a specific bucket on a source cluster to a specific bucket on a destination cluster. Data from the source bucket is pushed to the destination bucket by means of an XDCR agent, running on the source cluster, using the Database Change Protocol. Any bucket on any cluster can be specified as a source or a destination for one or more XDCR replications.

    Replication sources and destinations can be on any cluster in an organization. The source cluster can be in a different project than the destination cluster. Source and destination clusters can also be on different connected clouds, in which case replication traffic is sent over your cloud provider’s regional networks.

    XDCR is also supported between Couchbase Cloud clusters and self-managed clusters that you operate outside of Couchbase Cloud. You can connect a self-managed cluster within a project, and then use it as a source or destination cluster exactly as you would any other cluster in Couchbase Cloud.

    About Intra-Cluster XDCR

    XDCR is traditionally used for inter-cluster replication, where the source bucket and the destination bucket are on different clusters. However, Couchbase Cloud also supports intra-cluster XDCR, where the source bucket and the destination bucket are on the same cluster.

    To set up intra-cluster XDCR, specify the same cluster as both the source and the destination when creating a replication.

    What Can Be Replicated

    XDCR replicates only bucket data; it does not replicate indexes. To reproduce indexes on the destination cluster, push the index definitions there, either manually or by means of administrator-provided automation. The indexes are then regenerated on the destination cluster.

    Document IDs on the source cluster that are not UTF-8 encoded are automatically filtered out of replication, so the corresponding documents are not transferred to the destination cluster. For each such ID, a warning is written to the xdcr_errors.* log files on the source cluster.
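
    The ID check described above can be pictured with a minimal Python sketch. It is not XDCR's implementation: the function names are hypothetical, and the only behavior taken from the text is that IDs which are not valid UTF-8 are skipped rather than replicated.

```python
from typing import Iterable, List, Tuple


def is_valid_utf8(doc_id: bytes) -> bool:
    """Return True if the raw document ID decodes cleanly as UTF-8."""
    try:
        doc_id.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def split_replicable_ids(doc_ids: Iterable[bytes]) -> Tuple[List[bytes], List[bytes]]:
    """Partition IDs into (replicated, skipped).

    In this sketch the skipped IDs are simply returned; a real system
    would write a warning for each one to a log.
    """
    replicated: List[bytes] = []
    skipped: List[bytes] = []
    for doc_id in doc_ids:
        (replicated if is_valid_utf8(doc_id) else skipped).append(doc_id)
    return replicated, skipped


# Illustrative IDs: the second one starts with bytes that are invalid in UTF-8.
ids = ["user::42".encode("utf-8"), b"\xff\xfebad-id", "café".encode("utf-8")]
replicated, skipped = split_replicable_ids(ids)
```

    Here only the second ID ends up in `skipped`; the other two, including the one containing a non-ASCII character, are valid UTF-8 and would be replicated.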

    Replication Direction

    XDCR can occur between source and destination clusters in either of the following ways:

    Unidirectional

    The data contained in a specified source bucket is replicated to a specified destination bucket. Although the replicated data on the destination cluster could be used to serve data routinely, it is intended principally as a backup, to support disaster recovery.

    Bidirectional

    The data contained in a specified source bucket is replicated to a specified destination bucket; and the data contained in the destination bucket is, in turn, replicated back to the source bucket. This allows both buckets to be used for the serving of data, which may provide faster data-access for users and applications in remote geographies.

    Technically, XDCR performs only unidirectional replication. A bidirectional topology is created by implementing two unidirectional replications, in opposite directions, between two clusters, so that a bucket on each cluster functions as both source and destination.
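
    As an illustration of this point, the following Python sketch models a replication as a one-way (source, destination) pair and expands a bidirectional request into two such pairs. The `Replication` class, the `plan_replications` function, and the cluster and bucket names are all hypothetical; they are not part of any Couchbase API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Replication:
    """A single unidirectional replication (hypothetical model)."""
    source_cluster: str
    source_bucket: str
    dest_cluster: str
    dest_bucket: str


def plan_replications(
    source_cluster: str,
    source_bucket: str,
    dest_cluster: str,
    dest_bucket: str,
    bidirectional: bool = False,
) -> List[Replication]:
    """Return the unidirectional replications needed for the request."""
    forward = Replication(source_cluster, source_bucket, dest_cluster, dest_bucket)
    if not bidirectional:
        return [forward]
    # A bidirectional topology is the forward replication plus its mirror image.
    reverse = Replication(dest_cluster, dest_bucket, source_cluster, source_bucket)
    return [forward, reverse]


plan = plan_replications("cluster-a", "orders", "cluster-b", "orders", bidirectional=True)
```

    With `bidirectional=True`, `plan` contains two entries, and each cluster appears once as a source and once as a destination.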

    When creating a replication, you will specify whether to make the replication bidirectional. If left unspecified, the replication will be unidirectional from the source bucket to the destination bucket, and will appear under the Replications tab of the source cluster. If the replication is configured to be bidirectional, the replication will appear under the Replications tab of both the source cluster and the destination cluster.

    Learn more about XDCR direction and topology in the Couchbase Server documentation.

    Replication Filtering

    XDCR filtering allows specified subsets of documents to be replicated from the source bucket. A document can be included in, or excluded from, a filtered replication, based on the document’s fields and values.

    When a replication starts, the cluster examines the specified source bucket, and determines which documents to replicate:

    • If XDCR filtering is not applied, each document in the source bucket is replicated to the destination bucket.

    • If XDCR filtering is applied, each document in the source bucket is examined; but only those documents that meet the specified filtering-criteria are replicated.
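
    The two cases above can be sketched in Python as follows. This is a simplified model under stated assumptions: the bucket is represented as an in-memory dictionary, and `documents_to_replicate` and the filter signature are hypothetical names chosen for illustration.

```python
from typing import Any, Callable, Dict, Iterator, Optional, Tuple

Document = Dict[str, Any]
# A filter receives the document ID and body and returns True to replicate it.
Filter = Callable[[str, Document], bool]


def documents_to_replicate(
    bucket: Dict[str, Document],
    doc_filter: Optional[Filter] = None,
) -> Iterator[Tuple[str, Document]]:
    """Yield the (id, document) pairs that would be replicated."""
    for doc_id, doc in bucket.items():
        # No filter: every document qualifies.
        # With a filter: every document is examined, but only matches qualify.
        if doc_filter is None or doc_filter(doc_id, doc):
            yield doc_id, doc


bucket = {
    "hotel::1": {"type": "hotel", "country": "France"},
    "hotel::2": {"type": "hotel", "country": "Japan"},
    "airline::1": {"type": "airline", "country": "France"},
}

unfiltered = dict(documents_to_replicate(bucket))
french_only = dict(documents_to_replicate(bucket, lambda _id, d: d.get("country") == "France"))
```

    Without a filter all three documents are selected; with the example filter only the two whose `country` field is `France` are selected.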

    XDCR filters are normally configured when creating a replication, but can also be specified when modifying an existing replication.

    Supported Filters

    Filter match-requirements are specified by means of:

    • Regular Expressions. These specify case-sensitive character matches, and thereby determine whether a field name or value qualifies a document for inclusion in a replication.

    • Filtering Expressions. These allow comparisons and calculations to be made on the fields and values identified by means of regular expressions: based on the results, a document either is or is not included in a replication.
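
    To show how the two mechanisms combine, here is a rough Python sketch that uses the standard `re` module. It is written in plain Python and does not reproduce the XDCR filter-expression syntax; the function name `qualifies`, the field names, and the threshold are assumptions made for the example.

```python
import re
from typing import Any, Dict


def qualifies(doc: Dict[str, Any], field_pattern: str, min_value: float) -> bool:
    """Return True if the document should be included.

    Step 1 (regular expression): pick out fields whose names match
    field_pattern. Matching is case-sensitive, as is the default in re.
    Step 2 (comparison): include the document if any selected field holds
    a number greater than or equal to min_value.
    """
    pattern = re.compile(field_pattern)
    for name, value in doc.items():
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        if pattern.search(name) and is_number and value >= min_value:
            return True
    return False


included = qualifies({"price_usd": 120, "name": "suite"}, r"^price_", 100)
wrong_case = qualifies({"Price_usd": 120}, r"^price_", 100)
too_low = qualifies({"price_usd": 50}, r"^price_", 100)
```

    The first document is included. The second is excluded because the field name differs in case, and the third because the comparison fails.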

    Learn more about XDCR expressions in the Couchbase Server documentation.

    Conflict Resolution

    Conflicts can arise when the same document is modified differently, at more or less the same time, in different locations. This is most likely when applications update bidirectionally replicated data. XDCR provides two types of conflict resolution, based on either sequence number or timestamp, so that conflicting data is resolved and saved consistently on source and target.

    Sequence Number

    Conflicts can be resolved by referring to documents' sequence numbers. Sequence numbers are maintained per document, and are incremented on every document-update. The sequence numbers of source and target documents are compared; and the document with the higher sequence number prevails.

    Timestamp

    Timestamp-based conflict resolution uses the document timestamp (stored in the CAS) to resolve conflicts. The timestamps associated with the most recent updates of source and target documents are compared. The document whose updates have the more recent timestamp prevails.

    The type of conflict resolution that is used for a given replication is determined by the conflict resolution policy that is configured on the source and destination buckets. Conflict resolution is configured on a per-bucket basis at bucket creation time, and cannot be changed later.
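
    The difference between the two policies can be seen in a small Python sketch. It is a simplification, not the actual resolution algorithm: `Policy`, `DocMeta`, and `source_wins` are hypothetical names, the timestamp is a plain integer, and treating a tie as "keep the target document" is an assumption made only for this example.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    SEQUENCE_NUMBER = "sequence_number"
    TIMESTAMP = "timestamp"


@dataclass(frozen=True)
class DocMeta:
    """Per-document metadata used for the comparison (hypothetical model)."""
    seqno: int      # incremented on every update of the document
    timestamp: int  # time of the most recent update, as an integer


def source_wins(source: DocMeta, target: DocMeta, policy: Policy) -> bool:
    """Return True if the source version should replace the target version."""
    if policy is Policy.SEQUENCE_NUMBER:
        return source.seqno > target.seqno
    # Assumption: on a tie the target document is left unchanged.
    return source.timestamp > target.timestamp


# The source copy was updated three times, but earlier than the target's single update.
source = DocMeta(seqno=3, timestamp=1_000)
target = DocMeta(seqno=1, timestamp=2_000)

by_seqno = source_wins(source, target, Policy.SEQUENCE_NUMBER)
by_timestamp = source_wins(source, target, Policy.TIMESTAMP)
```

    In this example the two policies disagree: the source prevails under sequence-number resolution because it has more updates, while the target prevails under timestamp resolution because its update is more recent.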

    Learn more about XDCR conflict resolution in the Couchbase Server documentation.

    Administering Replications

    For information on how to create, manage, and delete replications, refer to Manage Replications.