XDCR Conflict Resolution

XDCR Conflict Resolution automatically synchronizes document-copies that have been modified in different ways at different locations.

Conflicts and Their Resolution

A conflict is caused when the source and target copies of an XDCR-replicated document are updated independently of and dissimilarly to one another, each by a local application. The conflict must be resolved, by determining which of the variants should prevail; and then correspondingly saving both documents in identical form. XDCR provides an automated conflict resolution process.

Two, alternative conflict resolution policies are supported: revision-based (which is the default), and timestamp-based. Note that timestamp-based conflict resolution is only available in the Enterprise Edition of Couchbase Server.

The Conflict Resolution Process

When a source document is modified, XDCR determines whether this revision of the document should be applied to the target. For documents above 256 bytes in size, XDCR fetches metadata from the target cluster before replicating. The target metadata for the document is compared with the source metadata for the document, in order to choose which document should prevail (the exact subset of metadata used in this comparison depends on the source bucket’s conflict resolution policy). If the source document prevails, it is replicated to the target; if the target document prevails, the source document is not replicated.

Once a replicated document reaches the target, the target cluster also performs a metadata comparison as described, in order to confirm that the document from the source cluster should indeed prevail. If this is confirmed, the document from the source cluster is applied to the target cluster, and the target cluster’s previous version of the document is discarded.

As a performance optimization, XDCR makes no metadata comparison on the source for documents of 256 bytes or less, thus making unnecessary a metadata fetch from the target cluster: instead, the document is replicated immediately to the target, and metadata comparison is performed there.

If a document is deleted on the source, XDCR makes no metadata comparison on the source before replication.

Once configured, conflict resolution is a fully automated process, requiring no manual intervention.

Revision-Based Conflict Resolution

Revision-based conflict resolution uses the document’s revision ID to resolve conflicts. Revision IDs are maintained per document, and are incremented on every document-update. Revision-based conflict resolution compares the revision IDs of the source and target documents. The document with more revisions prevails.

If both document-versions have the same revision ID, the conflict is resolved by comparing the following metadata-elements, in the order shown:

  1. CAS value

  2. Expiration (TTL) value

  3. Document flags

Timestamp-Based Conflict Resolution

Timestamp-based conflict resolution uses the document timestamp (stored in the CAS) to resolve conflicts. The timestamps associated with the most recent updates of source and target documents are compared. The document whose updates have the more recent timestamp prevails.

If both document-versions have the same timestamp-value, the conflict is resolved by comparing the following metadata-elements, in the order shown:

  1. Revision ID

  2. Expiration (TTL) value

  3. Document flags

Time Synchronization

Timestamp-based conflict resolution requires the use of synchronized clocks across all nodes, in all clusters intended to participate in XDCR. If clocks are not so synchronized, conflict resolution may produce unexpected results. To achieve synchronicity, an external entity such as NTP (Network Time Protocol) is required. For information, see Clock Sync with NTP.

Even with optimal clock synchronicity, small differences may persist between the clock-settings on different nodes and clusters: this is known as time skew. Time skew between clusters should be closely monitored, to ensure that timestamp-based conflict resolution produces the intended results. For more details, see Monitoring XDCR Timestamp-based Conflict Resolution.

To compensate for time skew, Couchbase Server records timestamps using a Hybrid Logical Clock (HLC). This is a combination of a physical and a logical clock: the physical clock is the time returned by the system, in nanoseconds; the logical clock is a counter, which is incremented when the physical clock yields a value either smaller than or equal to the currently stored, physical clock-value. The HLC:

  • Is monotonic through its use of a logical clock; and therefore does not suffer from the potential leap-back of a purely physical clock.

  • Captures the ordering of mutations.

  • Is close to physical time.

The CAS of a document is used to store the HLC timestamp. It is a 64-bit value, with the most significant 48 bits representing the physical clock, and the least significant 16 bits representing the logical clock. Each mutation has its own HLC timestamp.

Ensuring Safe Failover

When failover (say, from data center A to data center B) is required, timestamp-based conflict resolution requires that applications redirect traffic to data center B only after the greater of the following two time-periods has elapsed:

  • The replication latency between data centers A and B. This provides sufficient time for any in-flight mutations to be received by data center B prior to traffic redirection.

  • The absolute time skew between data centers A and B. This ensures that writes to data center B commence only after the last write to data center A.

When availability is restored to data center A, applications must wait for the same time period to elapse, before again redirecting their traffic.

Choosing a Conflict Resolution Policy

Conflict resolution policy is configured on a per-bucket basis at bucket creation time, it cannot be changed later. For more information, see Create a Bucket. Choosing a conflict resolution method requires consideration of the logic of the applications that require the data. This is illustrated by the following examples:

  • Most Updates Wins: A hit-counter, for a website, is stored as a document within Couchbase Server; a value within the document is incremented each time the website is accessed. In the event of conflict, the document-version that contains the higher count is the more useful, since more closely reflective of the actual count. Therefore, in this instance, revision-based conflict resolution is preferable, since it ensure that the more mutated document prevails.

  • Most Recently Updated Wins: A thermometer device store the current temperature as a document within Couchbase Server, writing new values continuously to the same key. In the event of conflict, the document-version more recently updated is the more useful, since more closely reflective of the current temperature. Therefore, in this instance, timestamp-based conflict resolution ensures that the more recent version of the document prevails.

Aligning Source and Target Policies

XDCR replications cannot be created between buckets with different conflict resolution policies: source and target buckets must always be configured with the same policy.

When using XDCR with a source cluster running a pre-4.6 version of Couchbase Server, only revision-based conflict resolution can be used.