Function: Convert Bucket to Collections

  • Capella Operational

      Goal: Demonstrate Converting "upgraded" Buckets to Collections.

      • The function ConvertBucketToCollections demonstrates a simple technique for reorganizing the documents of a bucket into collections.

      • Requires Eventing Storage (or metadata collection) and a "source" collection to listen to, in this case the sample "beer-sample".

      • The sample "beer-sample" is equivalent to an upgraded bucket where all data resides in beer-sample._default._default.

      • Needs Bindings of type "Constant alias" (as documented in the Scriptlet).

      • Needs Bindings of type "bucket alias" (as documented in the Scriptlet).

      • If "Constant alias" DO_COPY is true:

        • copy all documents with `type` = "brewery" to collection bulk.data.brewery

        • copy all documents with `type` = "beer" to collection bulk.data.beer

      • If "Constant alias" DO_DELETE is true (a document is removed only when its copy exists in the target collection):

        • remove all documents with `type` = "brewery" from the upgraded collection beer-sample._default._default

        • remove all documents with `type` = "beer" from the upgraded collection beer-sample._default._default

      • Throughput should increase as you add workers, up to the number of vCPUs.

      • Example of performance using this technique:

        • Test cluster is a symmetric 3 x AWS r5.2xlarge (64 GiB of memory, 8 vCPUs, 64-bit platform).

        • Processes 93K ops/sec at steady state.

        • 250M small documents: takes 44 minutes to reorganize a bucket with 80 types into a new bucket with 80 collections.

        • 1B small documents: takes 3 hours to reorganize a bucket with 80 types into a new bucket with 80 collections.
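
The two timings above are consistent with the quoted steady-state rate. The following is a rough check, not a benchmark: it assumes the 93K ops/sec figure can be read as documents fully processed per second, which the text does not state explicitly. The helper name `estimateSeconds` is hypothetical.

```javascript
// Rough duration estimate: documents divided by the steady-state rate.
// ASSUMPTION: "93K ops/sec" is treated as documents fully processed per second.
function estimateSeconds(docCount, docsPerSecond) {
    if (docsPerSecond <= 0) throw new RangeError("docsPerSecond must be positive");
    return docCount / docsPerSecond;
}

const RATE = 93000; // steady-state rate quoted in the example above

const minutesFor250M = estimateSeconds(250e6, RATE) / 60;  // roughly 44.8 minutes
const hoursFor1B = estimateSeconds(1e9, RATE) / 3600;      // roughly 2.99 hours

console.log(minutesFor250M.toFixed(1) + " min, " + hoursFor1B.toFixed(2) + " h");
```

Both estimates land close to the reported 44 minutes and 3 hours, which suggests the run time scales roughly linearly with the document count on this cluster.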

      If you alter this function to run within a single bucket, you must disable recursion checks. Refer to Disabling infinite recursion checks. In that case, always test your Eventing Function on a non-production system first to make sure you do not create an infinite recursion loop.
      • ConvertBucketToCollections

      • Input Data/Mutation

      • Output Data/Logged

      // To run, configure the settings for this Function, ConvertBucketToCollections, as follows:
      //
      // Version 7.1+
      //   "Function Scope"
      //     *.* (or try beer-sample.data if non-privileged)
      // Version 7.0+
      //   "Listen to Location"
      //     beer-sample._default._default
      //   "Eventing Storage"
      //     rr100.eventing.metadata
      //   Binding(s)
      //    1. "binding type", "alias name...", "bucket.scope.collection",       "Access"
      //       "bucket alias", "src_col",       "beer-sample._default._default", "read and write"
      //       "bucket alias", "brewery_col",   "bulk.data.brewery",             "read and write"
      //       "bucket alias", "beer_col",      "bulk.data.beer",                "read and write"
      //
      //    2. "binding type",   "alias name...", "value",
      //       "Constant alias", "DO_COPY",       true
      //       "Constant alias", "DO_DELETE",     true
      //
      // Version 6.6.2 (not applicable)
      
      // Upgrades `beer-sample` from a bucket paradigm to a collection/keyspace paradigm.
      
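      function OnUpdate(doc, meta) {
          // Documents of type "beer" go to the beer collection.
          if (doc.type === 'beer') {
              if (DO_COPY) beer_col[meta.id] = doc;
              if (DO_DELETE) {
                  // Safety check: delete from the source only if the copy exists in the target.
                  if (!beer_col[meta.id]) {
                      log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
                  } else {
                      delete src_col[meta.id];
                  }
              }
          }
          // Documents of type "brewery" go to the brewery collection.
          if (doc.type === 'brewery') {
              if (DO_COPY) brewery_col[meta.id] = doc;
              if (DO_DELETE) {
                  // Safety check: delete from the source only if the copy exists in the target.
                  if (!brewery_col[meta.id]) {
                      log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
                  } else {
                      delete src_col[meta.id];
                  }
              }
          }
          // Documents of any other type are left untouched in the source collection.
      }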
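
The interaction of DO_COPY, DO_DELETE, and the safety check in the function above can be summarized as a small truth table. The sketch below is a plain JavaScript model for reasoning about the outcomes, not Eventing code; the function `outcome` and its parameter `copyAlreadyInTarget` are hypothetical names introduced for illustration.

```javascript
// Models what the function above does to ONE matching document.
// copyAlreadyInTarget: whether a copy exists in the target collection
// before this mutation is processed (for example, from an earlier run).
function outcome(doCopy, doDelete, copyAlreadyInTarget) {
    const inTarget = doCopy || copyAlreadyInTarget; // state after the optional copy
    return {
        copied: doCopy,
        deleted: doDelete && inTarget,         // safety check passed
        skippedDelete: doDelete && !inTarget   // safety check failed, message is logged
    };
}

console.log(outcome(true, true, false));   // copy, then delete from the source
console.log(outcome(true, false, false));  // copy only, source kept
console.log(outcome(false, true, false));  // nothing to rely on: delete is skipped
console.log(outcome(false, true, true));   // copy exists already: delete proceeds
```

Under this model, DO_DELETE on its own never removes a document that has not already been copied, so the worst case for a misconfigured run is a logged skip rather than data loss.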
      Input Data/Mutation:
      Load the sample `beer-sample` from the UI Settings page.
      Create two empty keyspaces, `bulk`.`data`.`brewery` and `bulk`.`data`.`beer`.
      Output Data/Logged:
      The keyspace `beer-sample`.`_default`.`_default` is empty (0 documents).
      The keyspace `bulk`.`data`.`brewery` has 1,412 documents of type == "brewery".
      The keyspace `bulk`.`data`.`beer` has 5,891 documents of type == "beer".
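
The performance example mentions reorganizing 80 types into 80 collections, where one `if` block per type becomes unwieldy. One possible approach is a routing table keyed by `type`. The sketch below is an illustration under stated assumptions, not the method used for the benchmark: `makeRouter`, `routes`, and the option names are hypothetical, and plain objects stand in for bucket aliases so that it runs outside Eventing. Whether bucket aliases can be held in an ordinary object in your Eventing version should be verified before relying on this.

```javascript
// Hypothetical generalization: one routing table instead of one `if` block per type.
// routes maps a document type to its target collection; src is the source collection.
function makeRouter(routes, src, options) {
    const { doCopy, doDelete, log } = options;
    // Own-property check so that types such as "constructor" are not routed by accident.
    const hasRoute = (type) => Object.prototype.hasOwnProperty.call(routes, type);
    return function OnUpdate(doc, meta) {
        if (!hasRoute(doc.type)) return;   // no destination for this type: leave it alone
        const target = routes[doc.type];
        if (doCopy) target[meta.id] = doc;
        if (doDelete) {
            if (!target[meta.id]) {        // same safety check as the original function
                log("skip delete copy not found type=" + doc.type + ", meta.id=" + meta.id);
            } else {
                delete src[meta.id];
            }
        }
    };
}

// Stand-in data: plain objects play the role of collections.
const src = {
    b1: { type: 'beer' },
    br1: { type: 'brewery' },
    x1: { type: 'other' }
};
const routes = { beer: {}, brewery: {} };
const logs = [];
const onUpdate = makeRouter(routes, src, {
    doCopy: true,
    doDelete: true,
    log: (msg) => logs.push(msg)
});

// Replay every source document once, as the initial feed would.
for (const id of Object.keys(src)) onUpdate(src[id], { id: id });

console.log(Object.keys(src));            // only the unrouted document remains
console.log(Object.keys(routes.beer), Object.keys(routes.brewery));
```

Adding a type then means adding one binding and one table entry, while the copy, safety check, and delete logic stay in a single place.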