Inter-Sync Gateway Replication


    Description — Use inter-Sync Gateway replication to keep clusters in different mobile data centers in sync.
    Abstract — Inter-Sync Gateway replication supports resilient, secure, scalable bidirectional synchronization of data between cloud and edge data centers. It incorporates content presented under Manage>Replicate>Inter-Sync Gateway to give an integrated view.
    Related Content — Configuration Properties | Admin REST API | Initialize Inter-Sync Gateway Replications

    Context Clarification

    This content relates only to inter-Sync Gateway replication in Sync Gateway 2.8+. For documentation on pre-2.8 inter-Sync Gateway replication (also known as SG Replicate) — see SG-Replicate


    Couchbase Sync Gateway’s Inter-Sync Gateway Replicationglossary icon feature supports cloud-to-edgeglossary icon synchronization use cases, where data changes must be synchronized between a centralized cloud cluster and a large number of edge clusters whilst still enforcing fine grained access control. This is an increasingly important enterprise-level requirement.

    In the architecture diagram (Figure 1), the replicator on the active Sync Gateway node ensures that any database changes made to documents in either Sync Gateway databaseglossary icon instance are replicated to the other Sync Gateway instance, in accordance with the replication’s configuration — see replications for configuration details.

    icr replication overview
    Figure 1. Inter-Sync Gateway Replication architecture

    Use Cases

    Cloud-to-edge synchronization

    In this multi-cloud deployment mode large numbers of multiple edge clusters sync with one or more clusters in cloud data centers. Each edge can operate autonomously without network connectivity to the cloud data centers [1].

    A typical architecture for this use cases is shown in Figure 2

    icr cloud to edge200712
    Figure 2. Cloud-to-edge synchronization

    Active-to-active Mobile Synchronization

    Here edge clusters containing active Sync Gateway nodes are replicated between geographically separate cloud-based Sync Gateway deployments. An ideal use-case for inter-Sync Gateway replication, which was designed to keep clusters in different data centers in sync.

    The Couchbase Server Cross Data Center Replication API (XDCR) is similarly used to replicate between Couchbase Server clusters. However, the inter-Sync Gateway replication feature is specifically designed for Sync Gateway deployments and it must be used for replication between mobile clusters. [2]

    A typical architecture for this use case is shown in Figure 3

    icr active mobile sync200713
    Figure 3. Active-to-active mobile synchronization

    Replication Characteristics


    Sync Gateway supports the ability to run replications between Sync Gateway clusters using a new websockets based protocol, with the active replicatorglossary icon synchronizingglossary icon changes between two Sync Gateway databasesglossary icon.

    All replications are based on a replication definitionglossary icon provided either to the Admin REST API or in Sync Gateway’s configuration file (JSON).

    Replications always involve at least one local database. Sync Gateway does not enable replication between two remote nodes, because replications are defined at database level and so at least one database will be local.

    All replications take place at the document level (but see also, Delta Sync).

    Sync Gateway nodes can opt-out of participating in the replication process using the database-level parameter sgreplicate_enabled.

    Related configuration elements: databases | replications | remote | sgreplicate_enabled


    Inter-Sync Gateway replications are based on websockets. This is the exact same protocol that is used for replication with Couchbase Lite 2.x clients.

    For users on releases prior to 2.8, SG Replicate provides a HTTP-based replication — see the appropriate documentation here — SG-Replicate (deprecated).

    The bi-directional, persistent, nature of websocket connections is ideal for applications such as a continuous Sync Gateway replication, which is constantly waiting-for and synchronizing change events.

    Types of Replication

    Replications are either: adhoc replicationglossary icon (REST API only) or persistent replicationglossary icon. They can also be configured to run in one of two-modes: continuousglossary icon or one-shotglossary icon.

    • Persistent

      Persistent replications survive Sync Gateway node restarts and continue running automatically unless configured not to. They can be configured to run in either continuousglossary icon or one-shotglossary icon mode.

    • Ad hoc

      Ad hoc replications are transient, existing only for the period of the replication. They provide a convenient way to:

      • Run one off replications (for example, when troubleshooting)

      • Run on-demand replications after Sync Gateway is started. For instance a replication that needs to be to be run only periodically can be configured as an ad hoc replication by an automated script scheduled to run when needed.

    Related configuration elements: replications | continuous | adhoc

    Delta Sync

    This content relates only to ENTERPRISE EDITION

    With delta-sync enabled on the replication and both databases involved, only the changed data items are transferred.

    You can configure replications to use delta-sync by:

    • Setting "enable_delta_sync": true in the replication definition

    • Setting "delta-sync": { "enabled": true} on both databases in their respective database definitions.

    Push replications to pre-2.8 targets do not use Delta Sync

    Related configuration elements: databases | replications | this_db.delta_sync | enable_delta_sync


    Replications are bi-directional. You can push, pull or push and pull between the two database endpoints.

    Related configuration elements: replications | direction


    Transport level security is provided for. Use the appropriate prefix in URL (WSS for websockets).


    Support for Basic Authentication using username and password credentials is provided

    Access Control

    Data access control is provided by Sync Gateway’s sync functionglossary icon and the username/password credentials. All replicated documents pass through this function ensuring that access permissions are adhered to.

    Related configuration elements: databases | sync
    Related how-to: Sync Function | Sync Function

    Network Resilience

    Inter-Sync Gateway replications will automatically attempt to restart whenever the node restarts.

    Network resiliency is built-in for continuous replications. They respond to network issues such as lost connections, by applying a persistent exponential backoffglossary icon policy to attempt reconnection.

    The max_backoff_time determines the maximum wait time between retries. When the limit is reached retries are made every max_backoff_time minutes. Set "max_backoff_time": 0 to prevent indefinite retries. Exponential backoff retries will be attempted for up to 5 minutes and then stop if the connection has not been re-established

    Related replication definition elements: max_backoff_time

    High Availability


    This content relates only to ENTERPRISE EDITION
    • Enterprise

    • Community

    Inter-Sync Gateway Replication provides built-in High Availability (HA) support. It uses node distribution to ensure all running replications are uniformly distributed across all available nodes, regardless of their originating node.

    A replication runs on only one node at any given time. When a node fails, the system automatically distributes that node’s replications across any available alternative nodes (providing the replication has been configured on multiple nodes).

    To use high-availability, configure the same replication on at least two Sync Gateway nodes.

    Even though automatic node-distribution is not available in COMMUNITY EDITION, you can make your replications more highly-available.

    Simply define the same replication on multiple nodes. They will then run on each of those nodes.

    This redundancy provides some resiliency if a node fails. As, although no automatic distribution of replications is done,if the replication is running on multiple nodes then it will continue running on any surviving nodes.

    Node Distribution

    The goal of node distribution is to maintain an optimal balance of replications across the cluster.,with any given replication runs on only one node at any give time.

    To achieve this Sync Gateway automatically balances, as equally as possible, the number of replications running on each node.

    Where multiple replications are configured on multiple nodes, Sync Gateway automatically distributes these replications across all the available nodes for which the replications are configured. It continually monitors and redistributes replications as the number of available nodes and the number of running replications in a cluster changes.

    The nodes' processing load and bandwidth usage is minimized by ensuring that a replicator runs on only one node at any given time — even where it has been configured to run on multiple nodes. This avoids the redundant exchange of data arising from duplicate replication.

    Configuration Requirements

    To configure a replication to be highly available, include its database and replication definition in the sync gateway configuration on each node in the cluster that you want to be able to run it. At least two nodes are required.

    Node distribution will automatically elect an appropriate node to run them on and take care of redistributing them if a node fails.

    Related configuration elements`: Configuration Properties | Admin REST API

    Expected Failure Behavior

    If a node fails, then Sync Gateway will take any replications configured on multiple nodes, redistribute them across all available remaining nodes and restart them. Node distribution will continually seek to maintain an optimal distribution of replications across available nodes.

    Examples of Expected behavior

    This section provides examples of expected behavior in differing scenarios. It provides a comparison of how behavior differs between ENTERPRISE EDITION and COMMUNITY EDITION.

    The following scenarios are covered, each involves a sync gateway cluster with multiple nodes:

    • Homogenous configuration — see Example 1

    • Homogenous configuration with non-replicating node — see Example 2

    • Heterogenous configuration — see Example 3

    • Adding more nodes — see Example 4

    • Failing node — see Example 5

    Example 1. Homogenous configuration
    • The cluster comprises three Sync Gateway nodes

    • The same sync gateway configuration is applied across all nodes

    • All nodes are configured to run Replication Id 1

    • Sync Gateway automatically designates one of the three nodes to run Replication Id 1.

    • If a node goes down, Sync Gateway elects one of the remaining nodes to continue Replication Id 1.

    Sync Gateway runs Replication Id 1 on all nodes in the cluster.

    Example 2. Homogenous configuration with non-replicating node
    • The cluster comprises three Sync Gateway nodes.

    • Each node has the same sync gateway configuration, with one exception. The configuration on Node 3 has opted out of replication (sgreplicate_enabled=false)

    • All Three nodes are configured to run Replication Id 1.

    • Sync Gateway automatically designates either Node 1 or Node 2 to run the Replication Id 1.

    • If either Node 1 or Node 2 fails, Sync Gateway elects the non-failing node.

    • Sync Gateway runs Replication Id 1 on all nodes in cluster.

    • The system ignores the opt-out flag (sgreplicate_enabled).

    Example 3. Heterogenous configuration
    • The cluster comprises three Sync Gateway nodes

    • Both Node 1 and Node 2 are configured to run Replication Id 1

    • Node 3 is configured to run Replication Id 2 but not Replication Id 1

    • Sync Gateway automatically distributes Replication Id 1 and Replication Id 2 so that each runs on one of Node 1, Node 2 or Node 3, with no node running both replications simultaneously.

    • If any node fails whilst running either replication, Sync Gateway elects a non-failing node to continue that replication on. Where two nodes remain the node not running a replication will be chosen.

    • Sync Gateway runs Replication Id1 on Node 1 and Node 2 in the cluster

    • Sync Gateway runs Replication Id 2 on Node 3


    • If Node 3 fails, then Replication Id 2 will not be continued on either of the remaining nodes as it is not configured on them

    • Similarly, if either or both of the other nodes (Node 1 and Node 2) fails, Node 3 will not be a candidate to run the corresponding replication.

    Example 4. Adding more nodes
    • The cluster comprises a single Sync Gateway node

    • Node 1 is configured to run Replication Id 1 and Replication Id 2

    • LATER . . . Node 2 is added to the cluster to run Replication Id 1 and Replication Id 2.

    • Sync Gateway designates Node 1 run both Replication Id 1 and Replication Id 2

    • LATER . . . when Node 2 is added . . .

      • Sync Gateway select one of the Node 1 replications to run on Node 2; let’s say it chooses Replication Id 2

      • Sync Gateway stops Replication Id 2 on Node 1

      • Sync Gateway starts Replication Id 2 on Node 2.

    • Sync Gateway designates Node 1 to run both Replication Id 1 and Replication Id 2

    • WHEN . . . Node 2 is added . . . Sync Gateway designates it to run both Replication Id 1 and Replication Id 2

    Example 5. Failing node
    • The cluster comprises three Sync Gateway nodes with a homogeneous configuration

    • All three nodes are configured to run Replication Id 1, Replication Id 2 and Replication Id 3

    • LATER . . . Node 3 goes down

    Sync Gateway automatically distributes the replications, one to each of the nodes

    • Lets assume the following distribution:

      • Node 1 runs Replication Id 1

      • Node 2 runs Replication Id 2

      • Node 3 runs Replication Id 3

    WHEN . . . Node 3 goes down . . . Sync Gateway elects either Node 1 or Node 2 to continue running Replication Id 3

    Sync Gateway runs all three replications (Replication Id 1 , Replication Id 2 and Replication Id 3) on all three nodes in the cluster (Node 1, Node 2 and Node 3)

    WHEN . . . Node 3 goes down . . . Node 1 and Node 2 continue to run Replication Id 1, Replication Id 2 and Replication Id 3

    Monitoring Node Distribution

    Use the _replicationStatus endpoint to access information about which replications are running on which nodes — see: _replicationStatus(replicationID) | _replicationStatus(replicationID)?action={action}

    This information is also collected and available in the log files.

    Conflict Resolution

    Inter-Sync Gateway replication supports automatic conflict resolution to resolve conflicting document changes. It delivers this by applying one of its built-in conflict resolver policiesglossary icon, which can be easily included in your own replications.

    The goal of automatic conflict resoluton is to return a winning revision based on the consistent application of the configured conflict resolver policyglossary icon.

    The default conflict resolver policy is to always returns a winner determined by the automatic conflict resolution policyglossary icon.

    For ENTERPRISE EDITION, a custom conflict resolver policy is available, providing additional flexibility by allowing users to provide their own conflict resolution logic — see: Custom Conflict Resolution Policy.

    How Resolution Works

    Conflict Response on Active Replicator

    As soon as the active Sync Gateway database detects a conflict in a replicated document revision, it initiates its configured conflict resolver policy to determine a winning revision. This policy assesses the conflicting revisions and either determines the winning revision or returns an error if it fails while doing so.

    Conflict Response by Passive Replicator

    When a passive Sync Gateway database detects a conflict it responds to the active with a 409 response and rejects the revision. It is expected that the active Sync Gateway will pull the conflicting revision from the passive, resolve it on the active, and then subsequently push the resolved conflict back up.

    Pull Replications

    For Pull replications the active Sync Gateway is responsible for detecting and resolving conflicts based on the configured conflict_resolution_type  — see configuration item: conflict_resolution_type.

    This is also how conflicts are handled when Couchbase Lite clients pull down documents to Sync Gateway.

    Note: Resolved conflicts are only transferred from active to passive Sync Gateways if a replication is setup between them.

    Push Replications
    Passive Sync Gateway

    The passive Sync Gateway will automatically detect and reject conflicting revisions being pushed to it.

    Note that conflicts are not resolved. The revision is rejected and the document returned — with a 409 Conflict — response to the active Sync Gateway.

    Active Sync Gateway

    It is the responsibility of the active sync Gateway to address rejected revisions in accordance with its specified conflict_resolution_type.

    This approach is the same as that adopted when Couchbase Lite clients push documents to Sync Gateway.

    For more on conflict resolution — see: Conflict Resolution

    1. This architecture is also known as ship-to-shore or hub-and-spoke.
    2. If one-directional XDCR is used alongside Sync Gateway, then Sync Gateway cluster must be in read-only mode.