Version Vectors

      Description — Couchbase Lite 4.0 — Version Vectors — Document versioning and conflict resolution
      Related Content — Databases | Documents | Handling Data Conflicts | Data Sync

      Overview

      Couchbase Lite 4.0 introduces version vectors, which replace the revision tree system used in earlier versions. This change improves how Couchbase Lite tracks document changes, handles conflicts, and synchronizes data across devices and with Sync Gateway.

      Version vectors provide a more efficient and scalable approach to document versioning that aligns Couchbase Lite with Couchbase Server’s versioning system, enabling seamless synchronization across the entire Couchbase ecosystem.

      What are Version Vectors?

      A version vector is a data structure that summarizes a document's modification history across different sources by recording the latest change from each one. Instead of maintaining a tree-like structure of document revisions, version vectors use a more efficient approach based on logical timestamps.

      Key Components

      Source ID

      A unique identifier for each Couchbase Lite database instance that can modify documents. Each Couchbase Lite database on a device receives its own unique source ID, so that document changes can be traced back to their originating database.

      Timestamp

      A logical clock value that establishes the ordering of document changes within a single source. Couchbase Lite 4.0 uses Hybrid Logical Clocks (HLC), which combine physical (wall-clock) time with a logical counter so that changes remain correctly ordered.

      Version

      A combination of timestamp and source ID that uniquely identifies the point in time and the location at which a particular document revision was created. This approach replaces the traditional revision ID concept.

      Version Vector

      An ordered array containing the latest version from every source that has modified the document.
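
      The timestamp and version components above can be illustrated with a small model. The following is a simplified hybrid-clock sketch in plain Java, not Couchbase Lite's implementation: the class name HybridClockSketch, its methods, and the hexadecimal rendering of the timestamp are all assumptions made for illustration.

```java
import java.util.function.LongSupplier;

// Illustrative only: a simplified hybrid logical clock, not Couchbase Lite's implementation.
class HybridClockSketch {
    private final LongSupplier wallClockNanos; // physical time source (injectable for tests)
    private long last;                         // last timestamp issued or observed

    HybridClockSketch(LongSupplier wallClockNanos) {
        this.wallClockNanos = wallClockNanos;
    }

    // Issue a timestamp: follow the wall clock when it moves forward,
    // otherwise fall back to a logical increment so values never repeat or regress.
    synchronized long now() {
        last = Math.max(wallClockNanos.getAsLong(), last + 1);
        return last;
    }

    // Fold in a timestamp seen from another source so later local writes order after it.
    synchronized void observe(long remoteTimestamp) {
        last = Math.max(last, remoteTimestamp);
    }

    // A version pairs a timestamp with the source that produced it (hypothetical rendering).
    static String version(long timestamp, String sourceId) {
        return Long.toHexString(timestamp) + "@" + sourceId;
    }

    public static void main(String[] args) {
        long[] wall = {1_000L};
        HybridClockSketch clock = new HybridClockSketch(() -> wall[0]);

        long first = clock.now();   // wall clock is ahead, so 1000
        wall[0] = 900L;             // simulate the wall clock jumping backwards
        long second = clock.now();  // logical increment keeps ordering: 1001
        clock.observe(5_000L);      // a remote change carried timestamp 5000
        long third = clock.now();   // 5001, ordered after the remote change

        System.out.println(first + " " + second + " " + third);
        System.out.println(version(third, "source-a"));
    }
}
```

      The point of the design is that ordering survives clock adjustments: a source never issues a timestamp lower than one it has already issued or seen.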

      Version Vectors vs. Revision Trees

      Version vectors represent an improvement over the revision tree system used in Couchbase Lite 3.x and earlier versions. The revision tree approach maintained complex branching trees of all document revisions, with revision IDs in the format <generation>-<document-hash>. Version vectors use a more efficient structure that tracks only the latest version from each source, using timestamp-based identifiers in the format <timestamp>@<source-id>.

      This change eliminates the storage overhead of maintaining complete revision history trees and replaces the "most active wins" conflict resolution logic with a "last write wins" approach based on hybrid logical timestamps.

      As a result, version vectors reduce storage requirements and simplify synchronization through vector comparison rather than tree merging operations. They also improve overall performance and scalability as the number of connected devices increases.
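
      To show what "vector comparison" can mean in practice, here is a minimal sketch in plain Java. It models a version vector as a map from source ID to that source's latest timestamp, which is a simplification of the ordered array described earlier. The class VersionVectorSketch and the Order enum are hypothetical names, and the logic is a generic version-vector comparison rather than Couchbase Lite's internal code.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Illustrative only: generic version-vector comparison, not Couchbase Lite internals.
class VersionVectorSketch {
    enum Order { SAME, NEWER, OLDER, CONFLICT }

    // Compare vector a against vector b, source by source.
    // A source missing from a vector is treated as timestamp 0 (never seen).
    static Order compare(Map<String, Long> a, Map<String, Long> b) {
        boolean aAhead = false;
        boolean bAhead = false;
        Set<String> sources = new HashSet<>(a.keySet());
        sources.addAll(b.keySet());
        for (String source : sources) {
            long ta = a.getOrDefault(source, 0L);
            long tb = b.getOrDefault(source, 0L);
            if (ta > tb) aAhead = true;
            if (tb > ta) bAhead = true;
        }
        if (aAhead && bAhead) return Order.CONFLICT; // each side has changes the other lacks
        if (aAhead) return Order.NEWER;              // a includes everything in b, plus more
        if (bAhead) return Order.OLDER;              // b includes everything in a, plus more
        return Order.SAME;
    }

    public static void main(String[] args) {
        Map<String, Long> base = Map.of("A", 10L);
        Map<String, Long> editedOnA = Map.of("A", 20L);
        Map<String, Long> editedOnB = Map.of("A", 10L, "B", 15L);

        System.out.println(compare(editedOnA, base));      // NEWER
        System.out.println(compare(base, editedOnB));      // OLDER
        System.out.println(compare(editedOnA, editedOnB)); // CONFLICT
    }
}
```

      Only the CONFLICT case needs a resolver; the other outcomes can be settled by taking the newer side, with no tree traversal involved.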

      Benefits of Version Vectors

      Improved Performance

      Version vectors require less storage space and processing power compared to maintaining complete revision trees.

      Better Scalability

      The system scales more efficiently as the number of connected devices increases.

      Simplified Conflict Resolution

      Last-write-wins logic based on timestamps is more predictable and easier to understand.

      Enhanced Synchronization

      Alignment with Couchbase Server’s versioning enables more efficient sync operations.

      Reduced Complexity

      Eliminates the need to manage complex tree structures and revision genealogies.

      Impact on Document Identification

      The transition to version vectors changes how document revisions are identified and referenced in Java applications.

      Revision ID Format Changes

      CBL 3.x Format

      1-7bf9c5c9d5e2c7a5d8f0e3c6a9d2f4b7

      CBL 4.0 Format

      1773b25174850000@4a7c8e5f-2d3b-4f9e-8c1a-6b4d9e2f7a5c

      The new format contains:

      • Timestamp portion: 1773b25174850000 (hybrid logical clock value).

      • Source ID portion: 4a7c8e5f-2d3b-4f9e-8c1a-6b4d9e2f7a5c (UUID).
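
      If an application needs to inspect the two portions separately, the string can be split at the @ character. The sketch below is a hypothetical helper, not part of the Couchbase Lite API. It assumes the timestamp portion is hexadecimal, as the sample value suggests, and treats the revision ID as an otherwise opaque string.

```java
// Illustrative only: a hypothetical helper for splitting a version-format revision ID.
class VersionIdSketch {
    final long timestamp;   // parsed timestamp portion
    final String sourceId;  // everything after the '@'

    private VersionIdSketch(long timestamp, String sourceId) {
        this.timestamp = timestamp;
        this.sourceId = sourceId;
    }

    // Split "<timestamp>@<source-id>" into its parts.
    // Assumption: the timestamp portion is a hexadecimal 64-bit value.
    static VersionIdSketch parse(String revisionId) {
        int at = revisionId.indexOf('@');
        if (at <= 0 || at == revisionId.length() - 1) {
            throw new IllegalArgumentException("Not a version-format revision ID: " + revisionId);
        }
        long timestamp = Long.parseUnsignedLong(revisionId.substring(0, at), 16);
        return new VersionIdSketch(timestamp, revisionId.substring(at + 1));
    }

    public static void main(String[] args) {
        VersionIdSketch v = parse("1773b25174850000@4a7c8e5f-2d3b-4f9e-8c1a-6b4d9e2f7a5c");
        System.out.println("timestamp (hex): " + Long.toHexString(v.timestamp));
        System.out.println("source ID:       " + v.sourceId);
    }
}
```

      A CBL 3.x style ID such as 1-7bf9c5c9d5e2c7a5d8f0e3c6a9d2f4b7 contains no @ and is rejected by this helper, which can be useful when code has to handle both formats during a migration.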

      Document API Changes

      The Document.getRevisionID() method continues to work but now returns version-based IDs. Additionally, a new timestamp property provides direct access to the document’s logical timestamp:

      Example 1. Accessing Document Version Information
      // Existing revision ID access (now returns version format)
      String revisionId = document.getRevisionID();
      
      // New timestamp property
      long timestamp = document.getTimestamp();

      The timestamp is a long value representing nanoseconds since the Unix epoch (January 1, 1970 00:00:00 UTC). A value of zero indicates that no timestamp is available.
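
      Because the value is expressed in nanoseconds, it can be converted to a java.time.Instant for logging or display. The following sketch uses only the standard library; the class TimestampSketch and its method are hypothetical, and the zero check mirrors the "no timestamp available" rule described above.

```java
import java.time.Instant;
import java.util.Optional;

// Illustrative only: converting a nanosecond timestamp into a java.time.Instant.
class TimestampSketch {
    private static final long NANOS_PER_SECOND = 1_000_000_000L;

    // Returns empty when the timestamp is zero, meaning no timestamp is available.
    static Optional<Instant> toInstant(long timestampNanos) {
        if (timestampNanos == 0L) {
            return Optional.empty();
        }
        return Optional.of(Instant.ofEpochSecond(
                Math.floorDiv(timestampNanos, NANOS_PER_SECOND),
                Math.floorMod(timestampNanos, NANOS_PER_SECOND)));
    }

    public static void main(String[] args) {
        // In an application, the argument would come from document.getTimestamp().
        System.out.println(toInstant(1_700_000_000_000_000_000L)); // Optional[2023-11-14T22:13:20Z]
        System.out.println(toInstant(0L));                         // Optional.empty
    }
}
```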

      Impact on Conflict Resolution

      Version vectors change how Couchbase Lite resolves conflicts during synchronization.

      Previous Conflict Resolution (CBL 3.x)

      The revision tree system used "most active wins" logic:

      • Conflicts were resolved by comparing revision generation numbers.

      • The document with the highest generation number (most edits) would win.

      • This could lead to scenarios where older documents with more edits would override newer documents with fewer edits.
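
      The behavior described in these bullets can be modeled in a few lines. This is a sketch of the generation comparison only: the class MostActiveWinsSketch is hypothetical, and it ignores whatever tie-breaking the real 3.x resolver applied when generations were equal.

```java
// Illustrative only: models the generation comparison described above, nothing more.
class MostActiveWinsSketch {
    // Extract the generation from a "<generation>-<document-hash>" revision ID.
    static int generation(String revisionId) {
        return Integer.parseInt(revisionId.substring(0, revisionId.indexOf('-')));
    }

    // The revision with the higher generation (more edits) wins.
    // On a tie this sketch simply returns the first argument.
    static String winner(String revA, String revB) {
        return generation(revA) >= generation(revB) ? revA : revB;
    }

    public static void main(String[] args) {
        String editedFiveTimesLastWeek = "5-aaa111";
        String editedTwiceJustNow = "2-bbb222";
        // The older but more heavily edited revision wins.
        System.out.println(winner(editedFiveTimesLastWeek, editedTwiceJustNow)); // 5-aaa111
    }
}
```

      Nothing in the comparison looks at when the edits happened, which is why a stale but frequently edited document could override a fresher one.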

      New Conflict Resolution (CBL 4.0)

      Version vectors implement "last write wins" conflict resolution:

      Example 2. Default Conflict Resolution Logic
      public Document resolve(Conflict conflict) {
          if (conflict.getRemoteDocument() == null || conflict.getLocalDocument() == null) {
              return null; // Deleted revision always wins
          } else if (conflict.getLocalDocument().getTimestamp() > conflict.getRemoteDocument().getTimestamp()) {
              return conflict.getLocalDocument();
          } else {
              return conflict.getRemoteDocument();
          }
      }

      This approach:

      • Compares hybrid logical timestamps to determine which revision received the most recent write operation.

      • Provides more intuitive conflict resolution behavior.

      • Means that the most recent change (by wall-clock time) typically wins.

      • Reduces unexpected conflict resolution outcomes.

      Custom Conflict Resolution

      Although the default resolver has changed, you can still implement custom conflict resolution logic. The new timestamp property provides additional context for making resolution decisions.

      Example 3. Custom Conflict Resolution Example
      public Document customResolve(Conflict conflict) {
          Document local = conflict.getLocalDocument();
          Document remote = conflict.getRemoteDocument();
      
          if (local == null || remote == null) {
              return null;
          }
      
          // Use the timestamp along with other business logic.
          // applyBusinessRules(newer, older) is an application-defined helper (not shown).
          if (local.getTimestamp() > remote.getTimestamp()) {
              // Local is newer, but check business rules
              return applyBusinessRules(local, remote);
          } else {
              return applyBusinessRules(remote, local);
          }
      }
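
      What applyBusinessRules does is entirely up to the application. As one possible shape, the sketch below merges two revisions field by field, with the newer revision winning wherever both define the same key. It uses plain maps as stand-ins for document contents so that it runs without Couchbase Lite; a real resolver would build and return a document instead. The class name and the merge rule are assumptions for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: plain maps stand in for document contents.
class BusinessRulesSketch {
    // Keep every field from the older revision, then let the newer revision
    // overwrite any field that both revisions define.
    static Map<String, Object> applyBusinessRules(Map<String, Object> newer,
                                                  Map<String, Object> older) {
        Map<String, Object> merged = new LinkedHashMap<>(older);
        merged.putAll(newer);
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Object> older = Map.of("status", "pending", "note", "fragile");
        Map<String, Object> newer = Map.of("status", "shipped");

        Map<String, Object> merged = applyBusinessRules(newer, older);
        System.out.println(merged.get("status")); // shipped (newer value wins)
        System.out.println(merged.get("note"));   // fragile (kept from the older revision)
    }
}
```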

      Compatibility

      Couchbase Lite 4.0 provides backward compatibility for existing databases:

      Automatic Upgrade

      When opening a CBL 3.1 or 3.2 database with CBL 4.0, documents are automatically upgraded to use version vectors.

      Lazy Migration

      The upgrade occurs incrementally as documents are accessed and modified.

      No Downgrade

      Once a database has been upgraded to version vectors, CBL 3.x versions can no longer open it.

      Synchronization Compatibility

      Version vector synchronization has specific requirements:

      Sync Gateway Compatibility

      CBL 4.0 requires Sync Gateway 4.0 or later. Attempting to sync with older Sync Gateway versions results in an error.

      Peer-to-Peer Compatibility

      CBL 4.0 can only perform peer-to-peer sync with other CBL 4.0+ instances. Sync attempts with CBL 3.x peers fail with an appropriate error message.

      Development Considerations

      Testing Applications

      When testing applications with version vectors, be aware that:

      • Non-deterministic IDs: Version-based revision IDs cannot be calculated in advance because they contain a timestamp component.

      • Test Assertions: Update test cases to verify revision ID existence and ordering rather than specific values.

      • Conflict Testing: Verify that conflict resolution now uses timestamp-based logic.

      Example 4. Testing Document Revisions
      // Instead of testing specific revision ID values
      // assertEquals("1-abc123", doc.getRevisionID());
      
      // Test for presence and format
      assertNotNull(doc.getRevisionID());
      assertTrue(doc.getRevisionID().contains("@"));
      
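      // Test timestamp ordering after an update
      // (assumes `collection` is the Collection that holds doc)
      MutableDocument mutableDoc = doc.toMutable();
      mutableDoc.setString("status", "updated");
      collection.save(mutableDoc);
      Document updatedDoc = collection.getDocument(doc.getId());
      assertTrue(updatedDoc.getTimestamp() > doc.getTimestamp());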
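
      A slightly stricter format check than contains("@") can be factored into a reusable helper. The sketch below is a hypothetical test utility, not part of Couchbase Lite. Its pattern assumes a lowercase hexadecimal timestamp followed by @ and a non-empty source ID, which matches the sample shown earlier but is not a documented guarantee.

```java
import java.util.regex.Pattern;

// Illustrative only: a hypothetical test helper for checking revision ID shape.
class RevisionIdAssertions {
    // Assumed shape: <hex timestamp>@<non-empty source ID>.
    private static final Pattern VERSION_FORMAT = Pattern.compile("[0-9a-f]+@.+");

    static boolean looksLikeVersion(String revisionId) {
        return revisionId != null && VERSION_FORMAT.matcher(revisionId).matches();
    }

    public static void main(String[] args) {
        System.out.println(looksLikeVersion("1773b25174850000@4a7c8e5f-2d3b-4f9e-8c1a-6b4d9e2f7a5c")); // true
        System.out.println(looksLikeVersion("1-7bf9c5c9d5e2c7a5d8f0e3c6a9d2f4b7"));                     // false
    }
}
```

      In a test this would be used as assertTrue(RevisionIdAssertions.looksLikeVersion(doc.getRevisionID())).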