Cleanup

      The SDK takes care of failed or lost transactions, using an asynchronous cleanup background task.

      Transactions try to clean up after themselves in the event of failures. However, some situations, such as an application crash, inevitably leave behind failed or 'lost' transactions.

      This requires an asynchronous cleanup task, described in this section.

      Background Cleanup

      The first transaction triggered by an application spawns a background cleanup task, which periodically scans for expired transactions and cleans them up. It does this by scanning a subset of the Active Transaction Record (ATR) transaction metadata documents in each collection used by any transaction.

      The default settings are tuned to find expired transactions reasonably quickly while adding negligible load from the background reads that scanning requires. Specifically, with default settings cleanup will generally find expired transactions within 60 seconds, using fewer than 20 reads per second for each collection of metadata documents being checked. This is unlikely to affect performance on any cluster, but the settings can be tuned as desired.
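
      The figure of under 20 reads per second can be sanity-checked with simple arithmetic. The sketch below assumes 1024 ATR documents per collection (a value not stated in this section) and that every ATR is read once per cleanup window. CleanupReadEstimate and its method are illustrative names, not part of the SDK.

```java
import java.time.Duration;

// Back-of-the-envelope estimate of the background cleanup read load.
class CleanupReadEstimate {
    // Assumption: 1024 ATR documents per collection (not stated in this section).
    static final int ASSUMED_ATRS_PER_COLLECTION = 1024;

    // Reads per second needed to visit every ATR once per cleanup window.
    static double readsPerSecond(int atrCount, Duration cleanupWindow) {
        return atrCount / (double) cleanupWindow.getSeconds();
    }

    public static void main(String[] args) {
        // Compare a few candidate cleanup windows.
        for (int seconds : new int[] {30, 60, 120}) {
            double rate = readsPerSecond(ASSUMED_ATRS_PER_COLLECTION, Duration.ofSeconds(seconds));
            System.out.printf("window=%ds -> %.1f reads/sec per collection%n", seconds, rate);
        }
    }
}
```

      Under these assumptions a 60-second window works out to roughly 17 reads per second per collection, and halving the window doubles the rate.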

      All applications connected to the same cluster and running transactions share in the cleanup. They coordinate via a low-touch communication protocol on the _txn:client-record metadata document, which is created in each collection in the cluster that holds transaction metadata. This document is visible, but it is maintained automatically and should not be modified externally. The ATRs are divided between all cleanup clients, so increasing the number of applications does not increase the reads required for scanning.
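
      One way to picture why adding clients does not add reads is a round-robin split of the ATRs. The following is a minimal sketch of that idea and not the SDK's actual algorithm: the class name, the method, the modulo-based assignment, and the ATR count of 1024 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative partitioning of ATRs between cleanup clients (not the SDK's code).
class AtrPartitionSketch {
    // Returns the ATR indexes that the client at position clientIndex (0-based)
    // would scan, given clientCount active clients and atrCount ATRs.
    static List<Integer> atrsForClient(int clientIndex, int clientCount, int atrCount) {
        if (clientCount <= 0 || clientIndex < 0 || clientIndex >= clientCount) {
            throw new IllegalArgumentException("clientIndex must be in [0, clientCount)");
        }
        List<Integer> assigned = new ArrayList<>();
        // Every clientCount-th ATR, starting at this client's own position.
        for (int atr = clientIndex; atr < atrCount; atr += clientCount) {
            assigned.add(atr);
        }
        return assigned;
    }

    public static void main(String[] args) {
        int atrCount = 1024; // assumed ATR count, for illustration only
        for (int clients : new int[] {1, 2, 3}) {
            int total = 0;
            for (int c = 0; c < clients; c++) {
                total += atrsForClient(c, clients, atrCount).size();
            }
            // The total stays constant; only each client's share shrinks.
            System.out.println(clients + " client(s) -> " + total + " ATRs scanned in total");
        }
    }
}
```

      In this sketch each ATR belongs to exactly one client, so the total number of ATRs scanned per window is the same whether one client or many are running.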

      An application may clean up transactions created by another application.

      It is important to understand that if an application is not running, then cleanup is not running. This is particularly relevant to developers running unit tests or similar.

      Configuring Cleanup

      The cleanup settings can be configured as follows:

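      // 'keyspace' is assumed to be a TransactionKeyspace defined elsewhere, naming a collection to add to the cleanup set.
      var cluster = Cluster.connect("localhost", ClusterOptions.clusterOptions("username", "password")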
              .environment(env -> env.transactionsConfig(TransactionsConfig.cleanupConfig(TransactionsCleanupConfig
                      .cleanupClientAttempts(true)
                      .cleanupLostAttempts(true)
                      .cleanupWindow(Duration.ofSeconds(120))
                      .addCollections(List.of(keyspace))))));

      The settings supported by TransactionsCleanupConfig are:

      Setting: cleanupWindow
      Default: 60 seconds
      Description: Determines the length of a cleanup 'run'; that is, how frequently this client checks its subset of ATR documents. The default is conservative, and it is perfectly valid for the application to change it. Decreasing it causes expired transactions to be found more quickly (generally within the cleanup window), at the cost of more reads per second for the scanning process.

      Setting: cleanupLostAttempts
      Default: true
      Description: Controls the thread that takes part in the distributed cleanup process described above, which cleans up expired transactions created by any client. It is strongly recommended to leave this enabled.

      Setting: cleanupClientAttempts
      Default: true
      Description: Controls the thread that cleans up transactions created by this client only. The client preferentially sends the transactions it creates to this thread, leaving them to the distributed cleanup process only when forced to (for example, on an application crash). It is strongly recommended to leave this enabled.

      Setting: addCollections
      Default: empty
      Description: Adds collections to the 'cleanup set', the set of collections being cleaned up.

      Monitoring Cleanup

      If an application needs to monitor cleanup, it may subscribe to these events:

      cluster.environment().eventBus().subscribe(event -> {
          if (event instanceof TransactionCleanupAttemptEvent || event instanceof TransactionCleanupEndRunEvent) {
              // log event for review
          }
      });

      TransactionCleanupEndRunEvent is raised whenever a cleanup 'run' finishes, and contains statistics from that run. (With the default configuration, a run completes roughly every 60 seconds.)

      TransactionCleanupAttemptEvent is raised when this process finds an expired transaction and attempts to clean it up. It records whether the attempt succeeded, along with any logs relevant to the attempt.

      In addition, if cleanup fails to clean up a transaction that is more than two hours past expiry, it raises the TransactionCleanupAttemptEvent at WARN level (rather than the default DEBUG). With most default configurations of the event bus, this causes the event to be logged somewhere visible to the application. If there is no good reason for the cleanup to fail (such as a downed node that has not yet been failed over), the user is encouraged to report the issue.
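
      The escalation rule described above can be expressed as a small decision function. This is a sketch of the rule as stated in this section, not SDK code: the class, the Severity enum, and the method signature are hypothetical names chosen for illustration.

```java
import java.time.Duration;
import java.time.Instant;

// Sketch of the severity rule for cleanup attempt events (hypothetical names).
class CleanupEventSeveritySketch {
    enum Severity { DEBUG, WARN }

    // From the text: failures more than two hours past expiry are escalated.
    static final Duration ESCALATION_THRESHOLD = Duration.ofHours(2);

    static Severity severityFor(boolean cleanupSucceeded, Instant expiredAt, Instant now) {
        Duration pastExpiry = Duration.between(expiredAt, now);
        boolean longOverdue = pastExpiry.compareTo(ESCALATION_THRESHOLD) > 0;
        // Only a failed attempt on a long-overdue transaction is escalated.
        return (!cleanupSucceeded && longOverdue) ? Severity.WARN : Severity.DEBUG;
    }

    public static void main(String[] args) {
        Instant expiredAt = Instant.parse("2024-01-01T00:00:00Z");
        Instant oneHourLater = expiredAt.plus(Duration.ofHours(1));
        Instant threeHoursLater = expiredAt.plus(Duration.ofHours(3));
        System.out.println(severityFor(false, expiredAt, oneHourLater));    // DEBUG
        System.out.println(severityFor(false, expiredAt, threeHoursLater)); // WARN
        System.out.println(severityFor(true, expiredAt, threeHoursLater));  // DEBUG
    }
}
```

      An application that monitors cleanup could apply a similar check in its own event handler, for example to raise an alert only when a WARN-level attempt event is seen.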