Couchbase Distributed ACID Transactions for Java SDK Release Notes

    +

    Couchbase Distributed ACID Transactions is distributed as a separate library for the Java SDK. This page features the release notes for that library — for release notes, download links, and installation methods for the latest 3.x Java SDK releases, see the current Release Notes page.

    Using Distributed Transactions

    See the Distributed ACID Transactions concept doc in the server documentation for details of how Couchbase implements transactions. The Distributed Transactions HOWTO doc walks you through all aspects of working with Distributed Transactions.

    Distributed Transactions Java 1.1.0 (12 August 2020)

    This is the third release of Couchbase Distributed ACID Transactions for Java. It includes 24 bug-fixes and improvements (some internal improvements are not listed below).

    It is built on Couchbase java-client version 3.0.7, and requires Couchbase Server 6.6 or above. Note this is a change from the previous requirement for Couchbase Server 6.5.

    Upgrading

    All existing users are strongly recommended to read this section.

    Couchbase Server 6.6 adds crucial functionality for transactions. This functionality is so important to our isolation guarantees that we are making Couchbase Server 6.6 the new minimum supported version for Couchbase transactions, as of this release.

    In general we work hard to ensure that Couchbase software, including transactions, works seamlessly across all currently supported Couchbase Server releases, and that transactional releases are fully interopable with each other. For technical reasons, we were unable to achieve that goal for this release, and there are two compatibility rules that the user must understand in order to safely upgrade.

    These are the two compatibility rules:

    • This release requires Couchbase Server 6.6, it cannot run on 6.5.

      • Transactions that try to perform an insert will fail with TransactionFailed.

    • Previous releases are not fully forward-compatible with this release. This release must not be run concurrently with a previous release.

      • In fact, the previous release must not be run at all, after this release has been run once.

    The suggested upgrade path is:

    • Upgrade the cluster fully from Couchbase Server 6.5 to 6.6. This can be a live or offline upgrade.

    • Fully bring down the application running the previous release. Make sure no such application instances are running.

    • Bring up the application running the new release.

    • Make sure to never bring up an application running the previous release.

    We have added forwards-compatibility features to the transactions protocol and software, with the intent that in the future, upgrading will be as seamless and simple as you expect from Couchbase.

    Headline Improvements

    • TXNJ-125: Our transactions stages changes to a document alongside that document. In previous releases and Couchbase Server 6.5, this required creating an empty document to store inserts. Whilst these were not visible pre-commit to many parts of the Couchbase Data Platform, they were visible in some places — including the UI. Couchbase Server 6.6 brings new functionality that allows these inserts to be fully committed, providing full Read Committed isolation for all mutations.

    Write-Write Conflict Improvements

    When a transaction (T1) tries to mutate a document that’s in another transaction (T2), this is a write-write conflict.

    • TXNJ-246: Write-write conflict has been tuned to backoff-and-retry much less aggressively, which will result in much less 'churn' and overhead.

    • TXNJ-86: T1 previously had to backoff and retry until either T2 completed, or cleanup discovered it (with default settings, that can take up to a minute). Now, T1 can proceed to overwrite the document if it finds T2 has expired or completed.

    • TXNJ-217: Performance improvement — T1 now only fetches the T2-specific data from T2’s Active Transaction Record (ATR), rather than the full document.

    Cleanup Improvements

    Creating the Transactions object creates a background thread performing cleanup, which scans for expired transactions created by any application on any bucket.

    • TXNJ-208: If cleanup is still unable to cleanup a transaction after two hours, it will begin logging that failure at WARN level. If there is not a reasonable explanation for cleanup continuing to fail (such as a downed node that has not been failed-over), then users are encouraged to send these logs to Couchbase support.

    • TXNJ-13: Performance improvement — cleanup has been tuned to remove unnecessary writes.

    • TXNJ-108: Cancel cleanup if a pending transaction is discovered to now be committed. This should not occur, and is simply a safety-check.

    • TXNJ-229: Cleanup already checked that documents are in the expected state before cleaning them up, and now also uses CAS to verify that this is still the case when it gets to the cleanup point.

    • TXNJ-240: Cleanup of documents is now done with durable writes.

    • TXNJ-232: The expiry field inside an attempt now reflects time elapsed inside the transaction.

    API Impacting

    • TXNJ-227: A runtime check is now performed to ensure the minimum required java-client version is used. This will cause the application to fast-fail at the point of creating the Transactions object, rather than failing with a less clear error when creating transactions.

    • TXNJ-219: The TransactionConfigBuilder documentation stated that underlying Key-Value reads and writes would use kvTimeout and kvDurableTimeout from the configured environment if one is not specified. That was not being done, and now is.

    Bug Fixes

    • TXNJ-237: A Scala bug with Class.getSimpleName meant the library could not be called from Scala.

    • TXNJ-250: MAV reads should return post-transaction content on COMPLETED, not just COMMITTED.

    Reliability Improvements

    • TXNJ-249: The documentation advises applications to not catch exceptions inside the lambda (and if they have to, to propagate them). This advice remains, but if it is not followed, now subsequent operations (including commit) will fail. This means we no longer rely on the application propagating errors to ensure that transactions work correctly.

    • TXNJ-231: Rollback previously assumed that documents had not changed between them being staged and rolled back. That should be the case, but now CAS is used to confirm it.

    Distributed Transactions Java 1.0.1 (19 June 2020)

    This is the second release of Couchbase Distributed ACID Transactions.

    It is built on Couchbase java-client version 3.0.5, and requires Couchbase Server 6.5 or above.

    It is strongly recommended that all users read the "API Changes" section, as they may desire to make application changes in order to use this newly exposed information.

    API Changes

    TransactionCommitAmbiguous

    Summary: A TransactionCommitAmbiguous exception has been added, and applications may wish to handle it (TXNJ-205).

    Ambiguity is inevitable in any system, especially a distributed one.

    Each transaction has a 'single point of truth' that is updated atomically to reflect whether it is committed.

    However, it is not always possible for the protocol to become 100% certain that the operation was successful, before the transaction expires. This is important as the transaction may or may not have successfully committed, and in the previous release this was not reported to the application.

    The library will now raise a new error, TransactionCommitAmbiguous, to indicate this state. TransactionCommitAmbiguous derives from the existing TransactionFailed that applications will already be handling.

    On TransactionCommitAmbiguous, the transaction may or may not have reached the commit point. If it has reached it, then the transaction will be fully completed ("unstaged") by the asynchronous cleanup process at some point in the future. With default settings this will usually be within a minute, but whatever underlying fault has caused the TransactionCommitAmbiguous may lead to it taking longer.

    Handling: This error can be challenging for an application to handle. One simple approach is to retry the transaction at some point in the future. Alternatively, if transaction completion time is not a priority, then transaction expiration times can be extended across the board through TransactionConfigBuilder. This will allow the protocol more time to resolve the ambiguity, if possible.

    TransactionResult.unstagingComplete()

    Summary: Failures during the post-commit unstaging phase can now result in the transaction successfully returning. A new TransactionResult.unstagingComplete() method has been added (TXNJ-209).

    As above, there is a 'single point of truth' for a transaction. After this atomic commit point is reached, the documents themselves still need to be committed (we also call this "unstaging"). However, transactionally-aware actors will now be returning the post-transaction versions of the documents, and the transaction is effectively fully committed to those actors.

    So if the application is solely working with transaction-aware actors, then the unstaging process is optional. The previous release would retry any failures during the unstaging process until the transaction expires, but this unnecessarily penalises applications that do not require unstaging to be completed.

    So, many errors during unstaging will now cause the transaction to immediately return success. The asynchronous cleanup process will still complete the unstaging process at a later point.

    A new unstagingComplete() call is added to TransactionResult to indicate whether the unstaging process completed successfully or not. This should be used any time that the application needs all results of the transaction to be immediately available to non-transactional actors (which currently includes N1QL and non-transactional Key-Value reads).

    Putting two API changes above together, the recommendation for error handling of a typical transaction is now this:

    // Imports required by this sample
    import com.couchbase.client.java.kv.GetResult;
    import com.couchbase.transactions.TransactionResult;
    import com.couchbase.transactions.Transactions;
    import com.couchbase.transactions.error.TransactionCommitAmbiguous;
    import com.couchbase.transactions.error.TransactionFailed;
    
    import static com.couchbase.client.java.query.QueryOptions.queryOptions;
    try {
        TransactionResult result = transactions.run((ctx) -> {
            // ... transactional code here ...
        });
    
        // The transaction definitely reached the commit point. Unstaging
        // the individual documents may or may not have completed
    
        if (result.unstagingComplete()) {
            // Operations with non-transactional actors will want
            // unstagingComplete() to be true.
            cluster.query(" ... N1QL ... ",
                    queryOptions()
                            .consistentWith(result.mutationState()));
    
            String documentKey = "a document key involved in the transaction";
            GetResult getResult = collection.get(documentKey);
        }
        else {
            // This step is completely application-dependent.  It may
            // need to throw its own exception, if it is crucial that
            // result.unstagingComplete() is true at this point.
            // (Recall that the asynchronous cleanup process will
            // complete the unstaging later on).
        }
    }
    catch (TransactionCommitAmbiguous err) {
        // The transaction may or may not have reached commit point
        System.err.println("Transaction returned TransactionCommitAmbiguous and" +
                " may have succeeded, logs:");
    
        // Of course, the application will want to use its own logging rather
        // than System.err
        err.result().log().logs().forEach(log -> System.err.println(log.toString()));
    }
    catch (TransactionFailed err) {
        // The transaction definitely did not reach commit point
        System.err.println("Transaction failed with TransactionFailed, logs:");
        err.result().log().logs().forEach(log -> System.err.println(log.toString()));
    }

    Bug Fixes and Stability

    • TXNJ-184: If doc is found to be removed while committing it, create it.

    • TXNJ-187: Errors raised while setting ATR state to Committed are now regarded as pre-commit errors, e.g. they can cause the transaction to fail or retry.

    • TXNJ-188, TXNJ-192: The TransactionResult.attempts() field now always correctly reflects all attempts made.

    • TXNJ-196: Pre-commit expiry now always triggers rollback.

    • TXNJ-197: Expiry during app-rollback now enters expiry-overtime-mode and has one more attempt at completing rollback.

    • TXNJ-202, TXNJ-200: FAIL_ATR_FULL now handled consistently throughout protocol.

    • TXNJ-210: If anything goes wrong during app-rollback, return success. Rollback will be completed by the asynchronous cleanup process, and unrolled-back metadata is handled by the protocol anyway.

    • TXNJ-211: If rollback fails, the original error that caused the rollback is the one raised as the getCause() of any final exception raised to the application.

    • TXNJ-218: Inserting the same document in the same transaction is now caught as an application bug, and will fail the transaction.

    Improvements

    • TXNJ-189: Adds a transactionId field throughout the metadata, to assist with debugging.

    • TXNJ-190: Write elided stacktraces in the logs to improve readability.

    • TXNJ-191, TXNJ-199: Performance improvement so that transactions are only rolled back and/or cleaned up if they reached the point of creating an ATR entry (e.g. if there is anything to cleanup).

    • TXNJ-204: All context methods, such as ctx.insert and ctx.replace, will now consistently only raise ErrorWrapper exceptions. Applications should not be catching any exceptions from these methods so this is not expected to be impacting.

    • TXNJ-215: Adjust exponential backoff retry parameter from 250msecs down to 100msecs.

    Distributed Transactions Java 1.0.0 (17 January 2020)

    This is the first General Availability (GA) release of Distributed ACID Transactions 1.0.0 for Couchbase Server 6.5.

    Built on Couchbase java-client version 3.0.0 (GA). Requires Couchbase Server 6.5 or above.

    Bug Fixes & Stability

    • TXNJ-119: Calling commit multiple times leads to an error mentioning "rollback"; changed to separate error messages for commit and rollback.

    • TXNJ-126: Expire a transaction immediately if it expires while deferred.

    • TXNJ-149: Improve ambiguous error handling during commit.

    • TXNJ-163, TXNJ-156: Ensure that java-client errors are correctly handled after its changes to error handling.

    • TXNJ-165: Fix reactor.core.Exceptions$OverflowException seen during cleanup. Would only affect users setting a cleanup window of 1 second or less.

    Enhancements

    • TXNJ-173: Register for changes to bucket configurations, so if the application opens a bucket after creating Transactions object, background cleanup will be started on those new buckets.

    • TXNJ-69: Provide some future proofing for client-record. Implementation detail, allows for future extensibility of the background cleanup process.

    • TXNJ-111, TXNJ-110: Sends new events to the java-client event-bus with details from the background cleanup process.

    • TXNJ-120: Once Transactions object is closed, prevent futher operations on it as a sanity check.

    • TXNJ-151: Add Java client metadata to help track failures. Implementation detail, adds some additional info to exceptions to help debugging.

    • TXNJ-152: When transaction expires, attempt to roll it back (or complete commit). Completes the work in TXNJ-124 (in 1.0.0-beta.3), to cover all cases.

    • TXNJ-155: Remove OpenTracing dependency. An implementation detail as it has already been removed from the API, this completes the work and removes the dependency on the jar. Once OpenTelemetry (OpenTracing’s direct replacement) is stable, support for it will be added.

    • TXNJ-162: Performance improvements related to String parsing. Internal benchmarking sees a huge reduction in memory allocations and GC pressure with this change.

    • TXNJ-167: Rebuild on java-client 3.0.0 GA.

    • TXNJ-170: Perform ATR cleanup in serial, which prevents errors being sent to Hooks.onErrorDropped (and by default stdout) when shutting down transactions. Along with java-client fix JVMCBC-781, users should no longer see Hooks.onErrorDropped errors.

    API changes

    • TXNJ-171: Rename MAJORITY_AND_PERSIST_ON_MASTER to MAJORITY_AND_PERSIST_ON_ACTIVE, in accordance with new naming conventions.

    • TXNJ-169: Simplify cleanup window logic to remove divide-by-2; previously, specifying TransactionConfigBuilder.cleanupWindow(120 seconds) would lead to ATRs getting checked every 60 seconds (in the general case): now to reduce confusion they will be checked every 120 seconds.

    • TXNJ-164: Remove TransactionConfigBuilder.logDirectlyCleanup. As all cleanup trace is now sent directly to java-client’s event-bus, it can be configured from there and this parameter is useless.

    Distributed Transactions Java 1.0.0-beta.3 (13th November 2019)

    This is the third Beta release of Distributed ACID Transactions 1.0.0 for Couchbase Server 6.5 and the Java 3.0.0 client. There are no major changes. The main purpose of the release is to rebuild on top of the beta.1 release of the java-client, so applications can use both together.

    Built on Couchbase java-client version 3.0.0-beta.1. Requires Couchbase Server 6.5 or above.

    Bug-fixes And Stability

    • TXNJ-127: Fixes an ArrayIndexOutOfBoundsException error when using the reactive API to perform concurrency inside a txn.

    • TXNJ-145: ATR names are changed, to hash to their expected vbucket. It is recommended to flush (remove all documents from) the bucket to remove ATR documents from previous releases.

    • TXNJ-136: Fixes an erroneous IllegalDocumentState alert in edge-cases involving the DurabilityAmbiguous error.

    New Features and Enhancements

    • TXNJ-124: An aborting transaction will now attempt rollback before completing raising TransactionExpired failure, rather than leaving cleanup solely to cleanup process.

    • TXNJ-131: Minor logging improvement.

    Distributed Transactions Java 1.0.0-beta.2 (9th September 2019)

    This is the second Beta release of Distributed ACID Transactions 1.0.0 for Couchbase Server 6.5 and the Java 3.0.0 client.

    Built on Couchbase java-client version 3.0.0-alpha.7. Requires Couchbase Server 6.5 beta or above.

    Bug-fixes and stability

    • JVMCBC-728: This fix in the underlying Couchbase java-client prevented conflict detection from working correctly in all situations with durability enabled (which it is, by default). All users are strongly recommended to upgrade as soon as possible, for this fix.

    Breaking changes

    As a rule, we do not make breaking API changes after GA release, and only consider them rarely during the beta period. This particular breaking change aligns the transaction and Java libraries much better, makes for a more natural and usable API, and requires little porting work.

    • TXNJ-121: Align with the Couchbase Java client, by renaming the get method to getOptional, getOrError to get, and TransactionJsonDocument to TransactionGetResult.

    Performance

    • TXNJ-40: Retry operations on many transient errors, rather than retrying the full transaction.

    New features

    • TXNJ-85: (Experimental) Supported deferred commit for transactions. Please see the documentation for details. This feature should be regarded as alpha, as its API and functionality may change in the future; please feel free to try it out and provide feedback.

    • TXNJ-112: (Experimental) Allow the number of ATRs to be configured. This can potentially be used to improve transactions throughput, but is currently experimental and should be used only with Couchbase’s guidance.

    • TXNJ-107: Make txn metadata documents more consistently named by prefacing them with "_txn:"

    Distributed Transactions Java 1.0.0-beta.1

    This is the first Beta release of Distributed ACID Transactions 1.0.0 for Couchbase Server 6.5 and the Java 3.0.0 client.

    Built on Couchbase java-client version 3.0.0-alpha.6. Requires Couchbase Server 6.5 beta or above.

    New features

    Bug-fixes and stability

    • TXNJ-47: Improved handling for expiry — the transaction will now try to enter Aborted state

    • TXNJ-50: Intermittent reactive asserts about blocking on a thread from the "parallel" scheduler

    • TXNJ-55: Retry transient errors while rolling back ATR entry, rather than retrying transaction

    • TXNJ-57: Add log redaction for document keys

    • TXNJ-59: Fix an issue with the lost cleanup thread aborting early

    • TXNJ-64: Commit documents in the order they were staged

    • TXNJ-79, TXNJ-81: Improved handling of transient server errors while trying to create ATR entries

    • TXNJ-90: Improved handling of conflicts when multiple applications start at once and try to create the client record

    • TXNJ-96: Improved handling of transient errors removing entries from the client record

    Deprecations and removals

    • TXNJ-92: OpenTracing removed from API — will be re-added when OpenTelemetry is ready