Integrating CMOS with Existing Monitoring Stacks

    • Developer Preview
    March 23, 2025
    You can integrate CMOS with your existing Prometheus, Alertmanager, and Grafana monitoring system. This tutorial describes how to configure each component to achieve this.

    Tutorials are provided to demonstrate how a particular problem may be solved. Tutorials are accurate at the time of writing but rely heavily on third party software. The third party software is not directly supported by Couchbase. For further help in the event of a problem, contact the relevant software maintainer.

    Prerequisites

    The environment you wish to integrate CMOS into must have:

    1. At least one Couchbase Server node;

    2. Prometheus 2.26 or later, with Alertmanager 0.23.0 or later installed;

    3. Grafana 8.3.0 or later, configured to use the Prometheus server as a Data Source;

    4. Docker, to run the Cluster Monitor’s Docker image.

    Installation

    Install Cluster Monitor

    Currently there is no separate image containing just the Cluster Monitor - you can follow issue CMOS-351 for updates on this option. However, it is possible to use the main CMOS container with all other services disabled for this purpose.

    Run the Docker container, using the image from Docker Hub:

    console
    docker run -d --rm \
      -p 7196:7196 \ (1)
      -e CB_MULTI_ALERTMANAGER_URLS=http://<Alertmanager IP>:9093 \ (2)
      -e CB_MULTI_ADMIN_USER=admin -e CB_MULTI_ADMIN_PASSWORD=password \
      --name cluster_monitor \
      -e DISABLE_PROMETHEUS=true -e DISABLE_GRAFANA=true \
      -e DISABLE_ALERT_MANAGER=true -e DISABLE_LOKI=true \
      -e DISABLE_JAEGER=true -e DISABLE_CMOSCFG=true -e DISABLE_WEBSERVER=true \
      couchbase/observability-stack:latest
    1 If you are using TLS, you will need to add -p 7197:7197 to also expose port 7197.
    2 If you are not using Alertmanager, set this to a blank value.
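
    Once the container starts, you can confirm that it is running and check its startup logs:

    console
    docker ps --filter name=cluster_monitor   # the container should be listed as "Up"
    docker logs cluster_monitor               # review the startup logs for errors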

    Docker networking defaults to bridge mode, meaning that this container can communicate with other containers on the same Docker network. If your Grafana, Prometheus, or Alertmanager instances are not containerized, ensure that the Cluster Monitor container can communicate with them.
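
    If everything is containerized, one way to guarantee connectivity is a user-defined Docker network. The sketch below assumes containers named prometheus, alertmanager, and grafana; substitute your own container names:

    console
    docker network create monitoring
    docker network connect monitoring cluster_monitor
    docker network connect monitoring prometheus
    docker network connect monitoring alertmanager
    docker network connect monitoring grafana
    # Containers on the same user-defined network can reach each other by name,
    # e.g. CB_MULTI_ALERTMANAGER_URLS=http://alertmanager:9093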

    Configure Cluster Monitor

    Navigate to the Web GUI, which listens by default on port 7196. Sign in using the credentials you provided as environment variables when you started the Cluster Monitor earlier (by default, admin/password).

    Then click "Add Cluster" and enter the IP address of a node in the cluster, along with the cluster’s configured username and password. Repeat this for every cluster you wish to add to the Cluster Monitor.
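
    You can also verify from the machine running the container that the Cluster Monitor API is reachable. The /api/v1/clusters path below is an assumption based on the /api/v1 prefix used later in this tutorial; adjust it if your version differs:

    console
    # Uses the admin credentials set via CB_MULTI_ADMIN_USER / CB_MULTI_ADMIN_PASSWORD
    curl -s -u admin:password http://localhost:7196/api/v1/clusters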

    Configuration

    Grafana: Plugins

    The instructions for each plugin offer a Grafana Cloud one-click install, a command-line based install for a running instance, or a .zip file which can be unpacked manually into your Grafana plugins directory.

    Currently, we require only JSON API, which has associated installation instructions.

    • Grafana Cloud: search for the plugin marcusolsson-json-datasource and click install

    • Command-line: grafana-cli plugins install marcusolsson-json-datasource

    • Manual download: see linked instructions for latest file

    However, if you:

    • Configure Grafana (v7.1+) through provisioning files: specify the plugins to install in your provisioning configuration file.

    • Are running Grafana in a Docker container: pass an additional environment variable GF_INSTALL_PLUGINS=$PLUGIN_NAME $VERSION, where $PLUGIN_NAME and the latest $VERSION can be found on the plugin’s homepage (linked under "installation instructions" for each). See the sketch after this list.
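
    For example, a minimal sketch of running Grafana in Docker with the plugin pre-installed. The image tag is an assumption; replace <version> with the latest version from the plugin’s homepage:

    console
    docker run -d -p 3000:3000 \
      -e "GF_INSTALL_PLUGINS=marcusolsson-json-datasource <version>" \
      grafana/grafana:8.3.0

    # On a non-containerized installation, confirm the plugin is present with:
    grafana-cli plugins ls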

    Grafana: Dashboards

    The CMOS dashboards can be found in the official GitHub repository under microlith/grafana/provisioning/dashboards/. Download them and copy them to a folder accessible by your Grafana installation.
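
    For example, one way to fetch them is to clone the repository and copy the JSON files. The destination directory below is an assumption; use any path your Grafana installation can read:

    console
    git clone --depth 1 https://github.com/couchbaselabs/observability.git
    sudo mkdir -p /etc/grafana/dashboards
    sudo cp observability/microlith/grafana/provisioning/dashboards/*.json /etc/grafana/dashboards/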

    The relevant setting in the Grafana configuration file, typically named grafana.ini, is:

    console
    [dashboards]
    default_home_dashboard_path = <path to dashboards>/couchbase-inventory.json

    If you are using Grafana provisioning, you will need to update the providers.options.path argument in your provisioning configuration file (typically named grafana.yml) instead. For example:

    console
    providers:
      - name: dashboards
        type: file
        ...
        options:
          path: /etc/grafana/provisioning/dashboards/
          ...

    Grafana: Data Sources

    Now that the required plugins and dashboards are installed, we need to configure Data Sources.

    If you add these data sources through the web UI, the connection test for both the Cluster Monitor and Alertmanager will return a 404. This is expected and not an issue: the dashboards use sub-paths of the configured URLs, and those sub-paths are valid.

    1. Cluster Monitor (JSON API), via the Web UI: click on Add Data Source and select JSON API.

      • This must be named Couchbase Cluster Monitor API.

      • The URL should point to the Cluster Monitor, on port 7196, with a sub-path of /api/v1 (for example, http://<Cluster Monitor IP>:7196/api/v1).

      • Enable basicAuth, with the user and password you configured.

    2. Prometheus - this should already be configured as a Data Source.

    3. Alertmanager - add a new Data Source in the same way as you did for the Cluster Monitor.

      • This should be named Alertmanager API, also of type JSON API, and the URL should point to your Alertmanager instance.

      • Configure any authentication as needed.

    If you are using Grafana provisioning and would rather specify these data sources in a configuration file, your grafana.yml should look something like this:

    The names of these two data sources must match the snippet below exactly; otherwise the dashboards may fail to provision.

    console
    - name: Couchbase Cluster Monitor API
      type: marcusolsson-json-datasource
      uid: PD5070BC1AA9F8304
      url: http://<Cluster Monitor URL>:7196/api/v1
      basicAuth: true
      basicAuthUser: <Cluster Monitor Username>
      basicAuthPassword: <Cluster Monitor Password>
    - name: Alertmanager API
      type: marcusolsson-json-datasource
      uid: PC245499EF542F9C5
      url: http://<Alertmanager URL>/api/v2
      # Configure basicAuth as needed.
    ...
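
    As a rough example of deploying this on a package-based Grafana installation (the file name is hypothetical, and the provisioning path and service name may differ on your system):

    console
    sudo cp couchbase-datasources.yml /etc/grafana/provisioning/datasources/
    sudo systemctl restart grafana-server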

    Prometheus: Scrape config

    You will need to modify your existing Prometheus configuration file (typically named prometheus.yml). Add in a scrape_config job for your Couchbase Server nodes:

    prometheus.yml
    yaml
    global:
      scrape_interval: 30s
    scrape_configs:
      - job_name: couchbase-server
        basic_auth:
          username: Administrator (1)
          password: password (2)
        static_configs:
          - targets:
              - <Couchbase Server Node 1 IP>:8091 (3)
              - <Couchbase Server Node 2 IP>:8091
              ...
            labels:
              cluster: "Couchbase Server Cluster" (4)
    1 username should be set to a Couchbase Server user with at least "External Stats Reader" permission.
    2 password should be the password of the above user.
    3 targets should specify the addresses of all the nodes in your cluster. For a Couchbase Server 7+ cluster, the port number must be the management port (8091 by default, or 18091 if you have TLS enabled). For clusters older than Couchbase Server 7.0, use the port that your Prometheus exporter listens on. There are alternative configuration options that do not require hard-coding the addresses - refer to the Prometheus documentation for more details.
    4 Any labels defined here will be added to all metrics from this cluster - you may want to add labels such as environment or datacenter.

    The cluster label is required, and must match the name the cluster is configured to use in the Web Console.
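
    For Couchbase Server 7.0 and later, you can verify that a node exposes Prometheus metrics by querying it directly with the same credentials:

    console
    curl -s -u Administrator:password http://<Couchbase Server Node 1 IP>:8091/metrics | head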

    Add another scrape_config job for the previously-installed Cluster Monitor:

    prometheus.yml
    yaml
    ...
      - job_name: couchbase-cluster-monitor
        basic_auth:
          username: admin (1)
          password: password (2)
        metrics_path: /api/v1/_prometheus (3)
        static_configs:
          - targets: [<Cluster Monitor IP>:7196] (4)
    ...
    1 username should be the Username you configured for the Cluster Monitor.
    2 password should be the Password you configured for the Cluster Monitor.
    3 metrics_path is the path of the Prometheus metrics endpoint exposed by the Cluster Monitor. This should not be changed.
    4 targets is a list of Cluster Monitor instances for Prometheus to retrieve metrics from. For now this should be a single server IP, listening on port 7196.
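
    Before restarting, you can confirm that this endpoint is reachable and that the edited configuration parses cleanly (the prometheus.yml path is an assumption):

    console
    curl -s -u admin:password http://<Cluster Monitor IP>:7196/api/v1/_prometheus | head
    promtool check config /etc/prometheus/prometheus.yml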

    Restart Prometheus, and navigate to the Web UI. If this was successful, you should see your Couchbase Server nodes and the Cluster Monitor listed on the /targets page.
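
    The same information is also available from the Prometheus HTTP API, which is convenient if the web UI is not easily accessible (assumes jq is installed):

    console
    curl -s http://<Prometheus IP>:9090/api/v1/targets \
      | jq '.data.activeTargets[] | {job: .labels.job, health: .health}'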

    Prometheus: Alerting Rules

    CMOS ships with pre-made alert definitions which can alert you about certain events in your cluster. To take advantage of these, first download the rule definitions to somewhere your Prometheus config can access them:

    console
    wget -O prometheus-rules.yaml https://raw.githubusercontent.com/couchbaselabs/observability/main/microlith/prometheus/alerting/couchbase/couchbase-rules.yaml

    Next, modify your Prometheus configuration file to include this directory. For example, if you downloaded them to /etc/prometheus/alerting/ then you would add:

    prometheus.yml
    yaml
    # Insert this at the top level of your prometheus.yml
    rule_files:
      - /etc/prometheus/alerting/prometheus-rules.yaml

    If you have existing alert definitions, simply add the prometheus-rules.yaml file to the same directory, and specify *.yaml as the target for rule_files.
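
    Before restarting, promtool can also validate the downloaded rule file:

    console
    promtool check rules /etc/prometheus/alerting/prometheus-rules.yaml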

    Restart Prometheus once more, and navigate to the Web UI. You should be able to see the new rules listed on the /rules page. You may need to unhide inactive rules.

    Alertmanager

    This tutorial assumes that you have already configured Prometheus to send alerts to Alertmanager, and that Alertmanager has appropriate receivers created and configured.

    If so, no additional configuration is required.
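
    If you want to double-check this, the commands below query Alertmanager’s v2 API status endpoint and list the Alertmanagers that Prometheus is currently configured to send alerts to:

    console
    curl -s http://<Alertmanager IP>:9093/api/v2/status
    curl -s http://<Prometheus IP>:9090/api/v1/alertmanagers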

    Next Steps

    Head on over to your Grafana Web UI. Navigate to Dashboards, and select Single Cluster Overview.