Troubleshooting and Best Practices

What happens when more Workers are allocated for a Function?

For a specific Function, Couchbase Server limits the maximum number of workers to 64 (the default is 3 workers). This upper limit is enforced for system optimization purposes; you cannot create a Function with more than this number of workers.

When you deploy a Function (or resume a paused Function), a threshold is dynamically calculated based on the node's resources. If the number of workers exceeds this threshold, the system generates a warning message but does not prevent the Function from being deployed. An example warning follows:

There are 104 eventing workers configured to run on 24 cores. A sizing exercise is recommended.
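
The worker count is part of the Function's settings. The fragment below is a minimal, illustrative sketch of how these settings typically appear in an exported Function definition; the field names worker_count and execution_timeout are assumptions to verify against your Couchbase Server version, and the // annotations are explanatory only:

{
  "appname": "myFunction",
  "settings": {
    "worker_count": 3,        // number of workers for this Function (maximum 64, default 3)
    "execution_timeout": 60   // script timeout in seconds (see the Timeout question below)
  }
}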

When should developers use the try-catch block in Function handlers?

As a best practice, application developers should use a try-catch block in their Function handler code for basic error handling and debugging operations.

Before deployment, Couchbase Server verifies the Function handler code; only valid Functions get deployed. Using the log() function within try-catch blocks, you can record errors. These error logs are stored in the Eventing function's Application log file. Note that the Application log file on disk is specific to the node that processed the mutation and is not global across the cluster. By default, JavaScript runtime errors are stored in the system logs. Compared to system logs, troubleshooting and debugging are much easier when you use try-catch blocks and Application log() calls.

At runtime, Application logs do not capture handler code exceptions by default. To log exceptions, encapsulate your code in a try-catch block.

A sample try-catch block is provided for reference:

function OnUpdate(doc, meta) {
    log('document', doc);
    try {
        var time_rand = random_gen();
        dst_bucket[meta.id + time_rand] = doc;
    } catch(e) {
        log(e);
    }
}
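
In this sample, if random_gen() is not defined elsewhere in the handler, or if the write through the dst_bucket alias fails, the exception is caught and recorded in the Application log via log(e) instead of surfacing only in the system logs.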

What are bucket alias considerations during a Function definition?

Function handlers can trigger data mutations. To avoid a cyclic generation of data changes, carefully consider the following aspects when specifying source and destination buckets:

  • Avoid infinite recursions. If you are using a series of handlers, ensure that the destination buckets to which event handlers perform write operations do not have other Function handlers configured that could create a loop by triggering cyclic mutations. For example, the following design demonstrates an infinite recursion:

    functionA with source bucketA target bucketB aliased as same.

    function OnUpdate(doc, meta) {
        bucketB[meta.id] = {"status":"updated by functionA"};
    }

    functionB with source bucketB target bucketC aliased as same.

    function OnUpdate(doc, meta) {
        bucketC[meta.id] = {"status":"updated by functionB"};
    }

    functionC with source bucketC target bucketA aliased as same.

    function OnUpdate(doc, meta) {
        bucketA[meta.id] = {"status":"updated by functionC"};
    }

    In the example above, a single mutation in "bucketA" creates an infinite loop: a record is updated in "bucketB", then in "bucketC", then back in "bucketA", over and over.

    One possible solution is to change the design above so that functionC updates bucketD (instead of bucketA); with this change there is no recursion:

    functionC (modified to write to a different bucket) with source bucketC target bucketD aliased as same.

    function OnUpdate(doc, meta) {
        bucketD[meta.id] = {"status":"updated by functionC"};
    }

    Another possible solution is to change the design so that functionA checks whether functionC has already operated on the document and, if so, stops generating new mutations:

    functionA (modified to stop recursion) with source bucketA target bucketB aliased as same.

    function OnUpdate(doc, meta) {
        if (doc["status"] == "updated by functionC") return;
        bucketB[meta.id] = {"status":"updated by functionA"};
    }
  • Although Couchbase Server can flag simple infinite recursions, a complex infinite recursion condition may still occur across a long chain of source and destination buckets with a series of handlers. Developers must carefully consider and avoid these cases.

  • As a best practice, ensure that buckets to which the Function handler performs a write operation do not have other handlers configured for tracking data mutations.

There is a special case of direct self-recursion, which is highly useful: when a handler creates a Read-Write binding to its own source bucket, it can perform document enrichment operations. In this case, the direct self-recursive mutations are detected and suppressed by the Eventing framework. However, this capability is only supported for the aliased JavaScript map and is not supported for mutations generated via N1QL.

In the 6.5 release, the handler code can directly mutate (or write back to) the source bucket, i.e., direct self-recursion.
  • For example, the following design is taken from Data Enrichment, Case 2:

    functionDirectEnrich with source bucketA target bucketA aliased as 'src'

    function OnUpdate(doc, meta) {
      log('document', doc);
      doc["ip_num_start"] = get_numip_first_3_octets(doc["ip_start"]);
      doc["ip_num_end"]   = get_numip_first_3_octets(doc["ip_end"]);
      // !!! write back to the source bucket !!!
      src[meta.id]=doc;
    }
    function get_numip_first_3_octets(ip) {
      var return_val = 0;
      if (ip) {
        var parts = ip.split('.');
        //IP Number = A x (256*256*256) + B x (256*256) + C x 256 + D
        return_val = (parts[0]*(256*256*256)) + (parts[1]*(256*256)) + (parts[2]*256) + parseInt(parts[3]);
        return return_val;
      }
    }

In the cluster, I notice a sharp increase in the Timeout Statistics. What are my next steps?

When the Timeout Statistics show a sharp increase, it may be due to two possible scenarios:

  • Increase in execution time: When the handler execution time increases, Function execution latency is affected, which in turn leads to Function backlog and failure conditions.

  • Script timeout value: When the script timeout attribute value is not correctly configured, you encounter timeout conditions frequently.

As a workaround, increase the script timeout value. Ensure that you configure the script timeout value only after carefully evaluating the Function's execution latency.

As a best practice, use a combination of try-catch blocks and the Application log options. This way, you can monitor, debug, and troubleshoot errors during Function execution.
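
As a rough illustration of both practices, the sketch below wraps the handler work in a try-catch, logs failures together with the document ID for context, and logs executions that exceed an arbitrary latency threshold so slow handlers are visible in the Application log. The dst_bucket alias and the 1000 ms threshold are illustrative assumptions, and the sketch presumes Date.now() is available in the handler's JavaScript runtime; treat it as a starting point rather than a prescribed pattern.

function OnUpdate(doc, meta) {
    var start = Date.now();
    try {
        // Handler work, e.g. writing through a bucket alias.
        dst_bucket[meta.id] = doc;
    } catch (e) {
        // Record the failure with enough context to troubleshoot later.
        log('OnUpdate failed for', meta.id, e);
    }
    // Illustrative assumption: flag executions slower than 1000 ms.
    var elapsed = Date.now() - start;
    if (elapsed > 1000) {
        log('slow execution for', meta.id, 'took ms', elapsed);
    }
}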