Application Telemetry
You can enable application telemetry to have Couchbase Server periodically collect telemetry from your clients that use the Couchbase SDK.
Description
Having Couchbase Server collect telemetry from your applications can help you troubleshoot client issues. This telemetry data is useful for diagnosing issues such as poor performance or timeouts.
When you enable application telemetry, Couchbase Server advertises to SDK clients that it can collect telemetry data. When an SDK client connects to a cluster with application telemetry enabled, it opens a WebSocket connection to a node in the cluster. Couchbase Server uses this connection to periodically gather telemetry data from the client in Prometheus format.
Note: Application telemetry is off by default in Couchbase Server 8.0. Future versions of Couchbase Server may enable it by default.
Couchbase Server reports the telemetry data it collects through the same Prometheus metrics endpoint it uses to publish its own metrics. Couchbase Server reports aggregated application telemetry metrics instead of reporting metrics on a per-client basis. See Configure Prometheus to Collect Couchbase Metrics to learn how to set up Prometheus to collect metrics from your Couchbase Server cluster.
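Because the telemetry arrives in Prometheus text format, you can inspect it with ordinary shell tools. The following is a minimal sketch, not an official tool: sum_metric is a hypothetical helper that totals every sample of one metric read from standard input. It assumes each sample line ends with its value (no trailing timestamp) and that you have already saved the output of the metrics endpoint to a file.

```shell
# Sum every sample of one metric from Prometheus text-format input on stdin.
# Hypothetical helper. Assumes sample lines end with the value (no timestamp).
# Usage: sum_metric METRIC_NAME < metrics.txt
sum_metric() {
  awk -v name="$1" '
    /^#/ { next }                        # skip HELP and TYPE comment lines
    {
      metric = $1
      sub(/[{].*/, "", metric)           # drop any label set after the name
      if (metric == name) total += $NF   # the sample value is the last field
    }
    END { print total + 0 }              # print 0 when nothing matched
  '
}
```

For example, `sum_metric cm_app_telemetry_curr_connections < metrics.txt` prints the total of that metric across all of its label sets in the saved file.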
Prerequisites
Your Couchbase Server cluster and your clients must meet the following requirements to use application telemetry:
-
A Couchbase Server cluster supports application telemetry only when all of its nodes are running version 8.0 or later. Earlier versions of Couchbase Server do not support application telemetry. You cannot enable application telemetry while your cluster is running in mixed mode, where some nodes are running a version of Couchbase Server earlier than 8.0.
-
Your applications must use a recent SDK version that supports application telemetry. The following table lists the SDKs that support application telemetry along with the version where they added support.
SDK    Minimum Version with Application Telemetry Support
3.8
1.2
2.11
3.9
3.9
4.6
4.4
4.5
3.7
3.9
-
Your clients must be able to connect to the node’s management port to create the WebSocket connection for telemetry data collection. The default management port is 8091 for unencrypted connections and 18091 for encrypted connections. Make sure any firewall rules between your clients and the nodes allow traffic on the management port.
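The version requirement above can be sketched as a small shell check. This is an illustrative helper, not part of Couchbase Server: all_nodes_support_telemetry is a hypothetical function name, and it assumes you have already collected each node's version string (for example, 8.0.0) by some other means.

```shell
# Return success only if every version string given is 8.0 or later.
# Hypothetical helper. Gathering the node versions is outside this sketch.
# Usage: all_nodes_support_telemetry 8.0.0 8.0.1 ...
all_nodes_support_telemetry() {
  [ "$#" -gt 0 ] || return 1           # no versions supplied: cannot confirm
  for version in "$@"; do
    major=${version%%.*}               # text before the first dot
    case $major in
      ''|*[!0-9]*) return 1 ;;         # not a number: treat as unsupported
    esac
    [ "$major" -ge 8 ] || return 1     # any pre-8.0 node blocks the feature
  done
  return 0
}
```

A call such as `all_nodes_support_telemetry 8.0.0 7.6.2` fails, which mirrors the mixed-mode restriction: a single node older than 8.0 is enough to prevent enabling application telemetry.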
Get Application Telemetry Status
The following method gets the current application telemetry settings for the cluster.
GET /settings/appTelemetry
curl Syntax
curl -sS -u $USER:$PASSWORD \
-X GET 'http[s]://{host}:{port}/settings/appTelemetry'
Path and curl Parameters
USER-
The name of a user who has one of the roles listed in Required Privileges.
PASSWORD-
The password for the user.
host-
Hostname or IP address of a Couchbase Server node.
port-
Port number for the REST API. Defaults are 8091 for unencrypted and 18091 for encrypted connections.
Required Privileges
Your user account must have at least one of the following roles to get the application telemetry settings:
Responses
200 OK-
Returned when the call is successful. The response body contains a JSON object with the following fields:
-
enabled: whether application telemetry is enabled or not.
-
maxScrapeClientsPerNode: the maximum number of clients a single node can scrape telemetry data from at the same time.
-
scrapeIntervalSeconds: how often Couchbase Server scrapes telemetry data from the clients, in seconds.
403 Forbidden-
Returned if you do not have the proper roles to call this API. See Required Privileges.
Examples
The following example gets the cluster’s current application telemetry setting from the local node and pipes the result through jq.
curl -sX GET -u Administrator:password \
'http://localhost:8091/settings/appTelemetry' | jq
Running the previous command returns a JSON object similar to the following:
{
"enabled": false,
"maxScrapeClientsPerNode": 1024,
"scrapeIntervalSeconds": 60
}
Configure Application Telemetry
By sending a POST request to the /settings/appTelemetry endpoint, you can:
-
Turn application telemetry on or off.
-
Set the limit on the number of clients a single node can scrape telemetry data from at the same time.
-
Set how often the nodes scrape telemetry data from clients.
POST /settings/appTelemetry
curl Syntax
curl -sS -u $USER:$PASSWORD \
-X POST 'http[s]://{host}:{port}/settings/appTelemetry' \
[-d enabled=[true|false]] \
[-d maxScrapeClientsPerNode=<integer>] \
[-d scrapeIntervalSeconds=<integer>]
Path and curl Parameters
USER-
The name of a user who has one of the roles listed in Required Privileges.
PASSWORD-
The password for the user.
host-
Hostname or IP address of a Couchbase Server node.
port-
Port number for the REST API. Defaults are 8091 for unencrypted and 18091 for encrypted connections.
REST Parameters
enabled (Boolean, optional)-
Set to true to enable application telemetry or false to turn it off.
Defaults to false (off). When you enable application telemetry, Couchbase Server advertises to SDK clients that it can collect telemetry data.
Future versions of Couchbase Server may enable application telemetry by default.
maxScrapeClientsPerNode (integer, optional)-
Sets the maximum number of clients a single node can scrape telemetry data from at the same time. If the number of client telemetry connections reaches this threshold, the node rejects new telemetry connections until the number of connected clients drops.
Valid values are from 1 to 1024.
The default value is 1024. You can set maxScrapeClientsPerNode to a lower value to reduce the number of clients that can connect to each node. Reducing the number of clients can reduce the overhead of collecting telemetry data on your nodes if you have a large number of clients.
Clients that a node rejects can attempt to connect to another node in the cluster.
If all nodes in the cluster reach this limit, newly connected clients cannot connect to any node to have their telemetry collected.
You can monitor the number of application telemetry connections by viewing the cm_app_telemetry_curr_connections metric. See Metrics Reference and Configure Prometheus to Collect Couchbase Metrics for more information about metrics.
scrapeIntervalSeconds (integer, optional)-
Sets how often the nodes scrape telemetry data from clients, in seconds.
Valid values are from 60 to 600.
The default value is 60.
You can increase this value to reduce the overhead of collecting telemetry data on your nodes. However, a longer interval means Couchbase Server can lose more telemetry data from clients that disconnect between collections. For example, suppose you set this value to 300. Then Couchbase Server could lose up to 5 minutes of telemetry data from a client that disconnects just before the next scheduled telemetry collection.
If your applications have short-lived client connections to the cluster, consider keeping this value low to increase the chances of collecting telemetry before the clients disconnect.
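The value ranges above can be checked locally before you send a POST request. The following is a minimal sketch under stated assumptions: validate_telemetry_settings is a hypothetical helper, not a Couchbase command, and it only mirrors the ranges documented here (1 to 1024 clients, 60 to 600 seconds). It also reports the worst-case data loss, which equals one full scrape interval.

```shell
# Check proposed settings against the documented ranges before sending a POST.
# Hypothetical helper. The server remains the authority on what it accepts.
# Usage: validate_telemetry_settings MAX_CLIENTS INTERVAL_SECONDS
validate_telemetry_settings() {
  max_clients=$1
  interval=$2
  for value in "$max_clients" "$interval"; do
    case $value in
      ''|*[!0-9]*) echo "not a whole number: '$value'" >&2; return 1 ;;
    esac
  done
  if [ "$max_clients" -lt 1 ] || [ "$max_clients" -gt 1024 ]; then
    echo "maxScrapeClientsPerNode must be from 1 to 1024" >&2
    return 1
  fi
  if [ "$interval" -lt 60 ] || [ "$interval" -gt 600 ]; then
    echo "scrapeIntervalSeconds must be from 60 to 600" >&2
    return 1
  fi
  # A client that disconnects just before a scrape loses up to one interval.
  echo "ok: up to ${interval}s of telemetry may be lost per disconnecting client"
}
```

Calling `validate_telemetry_settings 512 90` succeeds, while `validate_telemetry_settings 512 30` fails because 30 is below the minimum scrape interval.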
Required Privileges
Your user account must have at least one of the following roles to configure application telemetry:
Responses
200 OK-
Returned when the call is successful. A successful call also returns a JSON object with the new application telemetry settings. This object has the same format as the response from the GET method.
400 Bad Request-
Returned if you attempt to enable application telemetry on a cluster that’s running in mixed mode where some nodes are running a version earlier than 8.0. All of the nodes in the cluster must be running version 8.0 or later to enable application telemetry. See Prerequisites for more requirements.
403 Forbidden-
Returned if you do not have the proper roles to call this API. See Required Privileges for a list of the required roles.
404 Not Found-
Returned if you attempt to call the endpoint on a node running a version of Couchbase Server earlier than 8.0.
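In a script, you may want to turn these status codes into readable hints. The following sketch assumes you capture the status with curl's --write-out option. explain_telemetry_status is a hypothetical helper, and its messages only restate the causes documented in this section.

```shell
# Translate an HTTP status from POST /settings/appTelemetry into a hint.
# Hypothetical helper. Messages restate the documented causes only.
explain_telemetry_status() {
  case $1 in
    200) echo "settings applied" ;;
    400) echo "cluster is in mixed mode; all nodes must run 8.0 or later" ;;
    403) echo "user lacks a required role" ;;
    404) echo "node runs a version earlier than 8.0" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Example use (commented out because it needs a reachable node):
# status=$(curl -sS -o /dev/null -w '%{http_code}' -u "$USER:$PASSWORD" \
#   -X POST "http://localhost:8091/settings/appTelemetry" -d enabled=true)
# explain_telemetry_status "$status"
```

The catch-all branch keeps the script from silently ignoring a status that this section does not describe.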
Examples
The following example enables telemetry, limits the node to scraping data from 512 clients, and sets the scrape interval to 90 seconds.
It pipes the result through jq to make it easier to read.
curl -X POST -u Administrator:password \
http://localhost:8091/settings/appTelemetry \
-d enabled=true \
-d maxScrapeClientsPerNode=512 \
-d scrapeIntervalSeconds=90 | jq
If successful, the previous command returns the following JSON object containing the new application telemetry settings for the cluster:
{
"enabled": true,
"maxScrapeClientsPerNode": 512,
"scrapeIntervalSeconds": 90
}