Orphaned Requests Logging

  • how-to
    +
    In addition to request tracing and metrics reporting, logging orphaned requests provides additional insight into why an operation might have timed out (or got cancelled for a different reason).

    While tracing and metrics can also be consumed through external interfaces, getting information about orphaned requests only works through the built-in mechanism.

    The way it works is that every time a response is in the process of being completed, when the SDK detects that the original caller is not listening anymore (likely because of a timeout), it will send this "orphaned" response to a reporting utility which aggregates all responses and in regular intervals logs them in a specific format.

    When you spot a TimeoutException in your log file, you can look for the output of the OrphanReporter and correlate the information.

    Output Format

    Since orphans usually indicate a state that is not desirable, the log level for those events is WARN. By default they will be aggregated and logged every 10 seconds, if there are orphans to report. This makes sure that the log line will appear close to the corresponding LCB_ERR_TIMEOUT in the logs, while not spamming the log file if there is nothing to report. See the next section on how to customize this behavior.

    The actual body of the message consists of the text Orphaned requests found, followed by a compact JSON representation of the aggregated orphans. The following code snippet shows a prettified version of such a JSON blob:

    {
      "<service-a>": {
        "total_count": 1234,
        "top_requests": [{<entry>}, {<entry>},...]
      },
      "<service-b>": {
        "total_count": 1234,
        "top_requests": [{<entry>}, {<entry>},...]
      },
    }

    Each individual request has the following format:

    {
      "total_duration_us": 1200,
      "encode_duration_us": 100,
      "last_dispatch_duration_us": 40,
      "total_dispatch_duration_us": 40,
      "last_server_duration_us": 2,
      "timeout_ms": 75000,
      "operation_name": "upsert",
      "last_local_id": "66388CF5BFCF7522/18CC8791579B567C",
      "operation_id": "0x23",
      "last_local_socket": "10.211.55.3:52450",
      "last_remote_socket": "10.112.180.101:11210"
    }
    Table 1. Experimental JSON Output Format Descriptions
    Property Description

    total_duration_us

    The duration of the orphaned request.

    encode_duration_us

    The duration of the encode span, if present.

    last_dispatch_duration_us

    The duration of the last dispatch span if present.

    total_dispatch_duration_us

    The duration of all dispatch spans, summed up.

    last_server_duration_us

    The server duration attribute of the last dispatch span, if present.

    operation_name

    The name of the outer request span, with “cb.” prefix removed.

    last_local_id

    The local_id from the last dispatch span, if present.

    operation_id

    The operation_id from the outer request span, if present.

    last_local_socket

    The local_address from the last dispatch span, if present.

    last_remote_socket

    The remote_address from the last dispatch span, if present.

    timeout_ms

    The operation timeout in milliseconds.

    If a field is not available, it will not be included in the output.

    Configuration

    The orphan logger can be configured through the connection string, or using lcb_cntl calls. See the tracing orphaned logging options for details.

    The following properties can be configured:

    Table 2. OrphanReporterConfig Properties
    Property Default Description

    emitInterval

    10 seconds

    The interval where found orphans are emitted.

    sampleSize

    128

    The number of samples to store per service.