Metrics Reporting

Individual request tracing gives a detailed but isolated view of the system. It is also useful to capture data that aggregates across requests (e.g. requests per second), as well as data that is not tied to any specific request (e.g. resource utilization).

The SDK exposes metrics for operation durations, broken down into p50, p90, p99, p99.9, and p100 percentiles.

These metrics can be handled in one of two ways. By default, they are logged periodically to the application logs by the LoggingMeter.

Alternatively, they can be sent to OpenTelemetry, which can forward them to the user’s metrics infrastructure, such as Prometheus.

The Default LoggingMeter

As of v4.7.0, the LoggingMeter is native to the Node.js SDK. In previous versions, the underlying C++ core was responsible for all metrics-related behaviour.

The default implementation aggregates and logs request and response metrics.

By default, metrics are emitted every 10 minutes, but the emit interval can be customized:

const cluster = await couchbase.connect('couchbase://your-ip', {
  username: 'Administrator',
  password: 'password',
  metricsConfig: {
    emitInterval: 5 * 60 * 1000, // 5 minutes in milliseconds
  }
})

Once enabled, there is no further configuration needed. The LoggingMeter will emit the collected request statistics every interval. A possible report looks like this (prettified for better readability):

{
   "meta":{
      "emit_interval_s":10
   },
   "query":{
      "127.0.0.1":{
         "total_count":9411,
         "percentiles_us":{
            "50.0":544.767,
            "90.0":905.215,
            "99.0":1589.247,
            "99.9":4095.999,
            "100.0":100663.295
         }
      }
   },
   "kv":{
      "127.0.0.1":{
         "total_count":9414,
         "percentiles_us":{
            "50.0":155.647,
            "90.0":274.431,
            "99.0":544.767,
            "99.9":1867.775,
            "100.0":574619.647
         }
      }
   }
}

Each report contains one object for each service that was used, further separated on a per-node basis so that each node can be analyzed in isolation.

For each service / host combination, the report gives the total number of recorded requests, as well as latency percentiles (in microseconds) taken from a histogram. The meta section at the top contains information such as the emit interval in seconds, so that tooling can later calculate figures like requests per second.
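The requests-per-second calculation mentioned above can be sketched as follows. This is a minimal illustration, assuming the report has already been parsed from the log line into a JavaScript object shaped like the example; the function name requestsPerSecond is hypothetical and not part of the SDK.

```javascript
// Sketch: derive requests per second from one LoggingMeter report.
// Assumes `report` is the parsed JSON object shown in the example above.
function requestsPerSecond(report) {
  const intervalSeconds = report.meta.emit_interval_s
  const rates = {}
  for (const [service, hosts] of Object.entries(report)) {
    if (service === 'meta') continue // "meta" is not a service section
    rates[service] = {}
    for (const [host, stats] of Object.entries(hosts)) {
      // total_count covers one emit interval, so divide by its length
      rates[service][host] = stats.total_count / intervalSeconds
    }
  }
  return rates
}

// Trimmed-down version of the example report
const sample = {
  meta: { emit_interval_s: 10 },
  query: { '127.0.0.1': { total_count: 9411 } },
  kv: { '127.0.0.1': { total_count: 9414 } },
}
const rates = requestsPerSecond(sample)
console.log(rates)
```

For the example report, this gives roughly 941 requests per second for each of the query and kv services on 127.0.0.1.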

The LoggingMeter can be configured via MetricsConfig as shown above. The following table shows the currently available properties:

Table 1. MetricsConfig Properties
Property	Default	Description
`enableMetrics`	true	Whether the `LoggingMeter` is enabled.
`emitInterval`	600 seconds	The interval at which metrics are emitted.
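As a small illustration of the two properties, the helper below builds a metricsConfig object from an interval given in minutes. It is only a sketch: metricsConfigFor is a hypothetical name, and it assumes that emitInterval is passed in milliseconds, as in the connection example earlier on this page.

```javascript
// Sketch: build a metricsConfig object for couchbase.connect().
// Assumption: emitInterval is expressed in milliseconds, following the
// earlier connection example. metricsConfigFor is a hypothetical helper.
function metricsConfigFor({ enabled = true, emitIntervalMinutes = 10 } = {}) {
  if (!Number.isFinite(emitIntervalMinutes) || emitIntervalMinutes <= 0) {
    throw new RangeError('emitIntervalMinutes must be a positive number')
  }
  return {
    enableMetrics: enabled,
    emitInterval: emitIntervalMinutes * 60 * 1000, // minutes to milliseconds
  }
}

console.log(metricsConfigFor({ emitIntervalMinutes: 5 }))
console.log(metricsConfigFor({ enabled: false }))
```

The returned object can be passed as the metricsConfig option when connecting, in place of the inline object literal.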

OpenTelemetry Integration

The SDK supports plugging in any OpenTelemetry metrics consumer instead of using the default LoggingMeter.

To do this, install the required Node.js libraries for OpenTelemetry. The Node.js SDK specifies the OpenTelemetry API package as an optional peer dependency, so it is not installed automatically:

$ npm install @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/sdk-metrics

In addition, you’ll need to get the metrics data into your metrics backend. This is often done by having the metrics backend (such as Prometheus) regularly gather, or 'scrape', the metrics data.

There are multiple approaches here. The opentelemetry-exporter-prometheus library makes it possible to open an HTTP server in the application that Prometheus can then scrape.

As that library is in alpha, here we will instead show how to send OpenTelemetry metrics into opentelemetry-collector, from where they can be scraped by Prometheus or another metrics backend.

$ npm install @opentelemetry/exporter-metrics-otlp-grpc

This aligns well with tracing, where a recommended approach is also to send OpenTelemetry spans into opentelemetry-collector, where they can be processed and forwarded elsewhere. See the Request Tracing documentation for more information.

For metrics, add this logic to the application:

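// OpenTelemetry imports. connect and getOTelMeter are provided by the Couchbase SDK;
// SERVICE_NAME, CONNECTION_STRING, USERNAME and PASSWORD are application-defined constants.
import { defaultResource, resourceFromAttributes } from '@opentelemetry/resources'
import { ATTR_SERVICE_NAME } from '@opentelemetry/semantic-conventions'
import { MeterProvider, PeriodicExportingMetricReader } from '@opentelemetry/sdk-metrics'
import { OTLPMetricExporter } from '@opentelemetry/exporter-metrics-otlp-grpc'

// create service resource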
const customResource = resourceFromAttributes({
  [ATTR_SERVICE_NAME]: SERVICE_NAME,
})
const resource = defaultResource().merge(customResource)

// setup an exporter
// This exporter exports metrics on the OTLP protocol over gRPC to localhost:4317.
const metricExporter = new OTLPMetricExporter({
  url: 'http://localhost:4317',
})

// create the meter provider with the resource and periodic reader
const meterProvider = new MeterProvider({
  resource: resource,
  readers: [
    new PeriodicExportingMetricReader({
      exporter: metricExporter,
      exportIntervalMillis: 1000, // Export metrics every 1000 milliseconds
    }),
  ],
})

// Wrap the OTel meter with the Couchbase SDK wrapper
const couchbaseMeter = getOTelMeter(meterProvider)

const cluster = await connect(CONNECTION_STRING, {
  username: USERNAME,
  password: PASSWORD,
  meter: couchbaseMeter, // Inject the SDK meter
})

At this point the SDK is hooked up with the OpenTelemetry metrics and will emit them to the exporter.

A db.client.operation.duration histogram is exported, which will appear in Prometheus as db_client_operation_duration_seconds_bucket.

It carries these tags, among others: db_system_name="couchbase", couchbase_service ("kv", "query", etc.), and db_operation_name ("upsert", "query", etc.).
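To show how these tags can be used, the sketch below builds a PromQL expression that estimates a latency quantile from the exported histogram buckets, using Prometheus's standard histogram_quantile and rate functions. The helper name latencyQuantileQuery and the 5m rate window are illustrative choices, not part of the SDK; the metric and label names are the ones listed above.

```javascript
// Sketch: build a PromQL query for a latency quantile of Couchbase operations.
// latencyQuantileQuery is a hypothetical helper; the default 5m window is an
// arbitrary choice for illustration.
function latencyQuantileQuery({ service, operation, quantile = 0.99, window = '5m' } = {}) {
  const matchers = ['db_system_name="couchbase"']
  if (service) matchers.push(`couchbase_service="${service}"`)
  if (operation) matchers.push(`db_operation_name="${operation}"`)
  const selector = `db_client_operation_duration_seconds_bucket{${matchers.join(',')}}`
  // Aggregate bucket rates by the "le" label before estimating the quantile
  return `histogram_quantile(${quantile}, sum by (le) (rate(${selector}[${window}])))`
}

const query = latencyQuantileQuery({ service: 'kv', operation: 'upsert' })
console.log(query)
```

The resulting string can be pasted into the Prometheus expression browser, or used as a Grafana panel query, to chart the estimated p99 latency in seconds for kv upsert operations.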

Testing

For convenience, here is a simple Docker-based configuration of opentelemetry-collector and Prometheus for localhost testing of an OpenTelemetry setup.

Create file otel-collector-config.yml:

receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317

exporters:
  prometheus:
    endpoint: "0.0.0.0:8889"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]

And file prometheus.yml:

global:
  scrape_interval: 5s

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889']

And file docker-compose.yml:

services:
  otel-collector:
    image: otel/opentelemetry-collector:latest
    command: ["--config=/etc/otel-collector-config.yml"]
    volumes:
      - ./otel-collector-config.yml:/etc/otel-collector-config.yml
    ports:
      - "4317:4317" # OTLP gRPC receiver (Node.js sends here)
      - "8889:8889" # Prometheus exporter (Prometheus scrapes here)

  prometheus:
    image: prom/prometheus:latest
    command: ["--config.file=/etc/prometheus/prometheus.yml"]
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
    ports:
      - "9090:9090" # Prometheus UI (You view this in your browser)
    depends_on:
      - otel-collector

Now start up the containers:

$ docker-compose up -d

Some things to note:

The containers are put on the same network so they can refer to each other by container name.
The app has been told to export metrics over OTLP gRPC to localhost:4317. opentelemetry-collector is listening on this port.
opentelemetry-collector will store the metrics, and exposes port 8889 for Prometheus to periodically scrape.

Now run the application. All being well, Prometheus (the UI is available on http://localhost:9090) should allow querying for db_client_operation_duration_seconds_bucket. (Though a real deployment will generally use another tool, such as Grafana, for visualisation.)

If this fails, check http://localhost:9090/api/v1/targets to see whether Prometheus is able to contact opentelemetry-collector.
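The same check can be scripted. The sketch below reads the targets endpoint and lists any scrape targets that are not healthy. It assumes Node.js 18 or later (for the built-in fetch) and the standard response shape of the Prometheus targets API (data.activeTargets entries with health, scrapeUrl and lastError fields); the function names are hypothetical.

```javascript
// Sketch: list Prometheus scrape targets that are not reporting "up".
// Assumes the standard /api/v1/targets response shape.
function unhealthyTargets(body) {
  const targets = (body.data && body.data.activeTargets) || []
  return targets
    .filter((target) => target.health !== 'up')
    .map((target) => ({
      job: target.labels && target.labels.job,
      scrapeUrl: target.scrapeUrl,
      lastError: target.lastError,
    }))
}

// Fetches the targets from a running Prometheus (Node.js 18+ for global fetch).
// Not called automatically; invoke it once the containers are up.
async function checkPrometheusTargets(baseUrl = 'http://localhost:9090') {
  const response = await fetch(`${baseUrl}/api/v1/targets`)
  if (!response.ok) {
    throw new Error(`Prometheus returned HTTP ${response.status}`)
  }
  return unhealthyTargets(await response.json())
}
```

Calling checkPrometheusTargets().then(console.log) prints an empty array when every target is healthy; otherwise each entry's lastError usually indicates why the scrape of opentelemetry-collector failed.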