Cluster Manager Metrics

  • Capella Columnar
    March 23, 2025
    A list of the metrics provided by the Cluster Manager.
    Tip
    • The type / unit badge shows the Prometheus type and unit (if present).
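
As a rough illustration of where the type badge comes from: in the Prometheus text exposition format, a metric family may be preceded by a `# TYPE <name> <type>` comment line. The sketch below collects those declarations from a scrape. The function name `parse_metric_types` and the sample values are hypothetical, and it is an assumption that the metrics are read in this text format.

```python
def parse_metric_types(exposition_text):
    """Map metric family name -> Prometheus type, read from '# TYPE' lines."""
    types = {}
    for line in exposition_text.splitlines():
        parts = line.split()
        # A type declaration has the form: # TYPE <metric_name> <type>
        if len(parts) == 4 and parts[0] == "#" and parts[1] == "TYPE":
            types[parts[2]] = parts[3]
    return types


# Illustrative sample only; the values are made up.
sample = """\
# TYPE audit_queue_length gauge
audit_queue_length 0
# TYPE cm_logs_total counter
cm_logs_total 42
"""
metric_types = parse_metric_types(sample)
```

Sample lines without a `# TYPE` prefix are ignored, so the same function can be run over a full scrape.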

    audit_queue_length    gauge

    Current number of entries in the audit queue

    audit_unsuccessful_retries    counter

    Failed attempts to audit

    cm_auth_cache_current_items    gauge

    Current number of items available in cbauth auth cache

    cm_auth_cache_hit_total    counter

    Total number of cbauth auth cache hits

    cm_auth_cache_max_items    gauge

    Maximum capacity of cbauth auth cache

    cm_auth_cache_miss_total    counter

    Total number of cbauth auth cache misses
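
The hit and miss counters are cumulative, so a hit ratio is usually taken over the change between two samples rather than from the raw totals. A minimal sketch, assuming two scrapes of `cm_auth_cache_hit_total` and `cm_auth_cache_miss_total`; the function name and numbers are hypothetical. The same arithmetic would apply to the other `*_hit_total` / `*_miss_total` pairs in this list.

```python
def cache_hit_ratio(hits_before, misses_before, hits_after, misses_after):
    """Hit ratio over the interval between two samples of the counters.

    Returns None when there were no lookups in the interval or when a
    counter went backwards (for example, after a process restart).
    """
    hits = hits_after - hits_before
    misses = misses_after - misses_before
    if hits < 0 or misses < 0:
        return None  # counter reset; the interval is not comparable
    lookups = hits + misses
    if lookups == 0:
        return None  # no traffic, ratio undefined
    return hits / lookups


ratio = cache_hit_ratio(100, 20, 190, 30)  # 90 hits, 10 misses in the interval
```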

    cm_auto_failover_count    gauge

    Number of auto-failovers

    cm_auto_failover_enabled    gauge

    Indicates if auto-failover is enabled (1 = true, 0 = false)

    cm_auto_failover_max_count    gauge

    Maximum number of auto-failovers before being disabled

    cm_build_streaming_info_total    counter

    Number of streaming requests processed

    cm_client_cert_cache_current_items    gauge

    Current number of items available in cbauth client_cert cache

    cm_client_cert_cache_hit_total    counter

    Total number of cbauth client_cert cache hits

    cm_client_cert_cache_max_items    gauge

    Maximum capacity of cbauth client_cert cache

    cm_client_cert_cache_miss_total    counter

    Total number of cbauth client_cert cache misses

    cm_erlang_port_count    gauge

    The number of ports in use by the Erlang VM

    cm_erlang_port_limit    gauge

    The maximum number of ports that the Erlang VM can use

    cm_erlang_process_count    gauge

    The number of processes in use by the Erlang VM

    cm_erlang_process_limit    gauge

    The maximum number of processes that the Erlang VM can use
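
The count and limit gauges above can be combined into a usage fraction to see how close the VM is to its ceiling. A minimal sketch; `usage_fraction` and the sample numbers are hypothetical.

```python
def usage_fraction(count, limit):
    """Fraction of a limit currently in use (0.0 to 1.0)."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return count / limit


# e.g. cm_erlang_process_count / cm_erlang_process_limit
process_usage = usage_fraction(65536, 262144)
```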

    cm_failover_total    counter

    Number of non-graceful failover results

    cm_gc_duration_seconds    histogram

    Time to perform Erlang garbage collection
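
A Prometheus histogram is exposed as `_bucket`, `_sum`, and `_count` series, so an average observation over an interval can be derived from two samples of the sum and count. The sketch below assumes the series `cm_gc_duration_seconds_sum` and `cm_gc_duration_seconds_count` are available; the function name and numbers are hypothetical.

```python
def mean_observation(sum_before, count_before, sum_after, count_after):
    """Average observed value between two samples of a histogram's
    _sum and _count series. Returns None if nothing was observed."""
    observations = count_after - count_before
    if observations <= 0:
        return None  # no new observations, or the counters were reset
    return (sum_after - sum_before) / observations


# 30 collections taking 3.0 seconds in total over the interval
mean_gc_seconds = mean_observation(2.0, 10, 5.0, 40)
```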

    cm_graceful_failover_total    counter

    Number of graceful failover results

    cm_http_requests_seconds    histogram

    Number of bucket HTTP requests

    cm_http_requests_total    counter

    Total number of HTTP requests categorized

    cm_is_balanced    gauge

    Indicates if the cluster is balanced (1 = true, 0 = false). Only reported by the orchestrator node and only updated once every 30 seconds.

    cm_logs_total    counter

    Total number of logs logged

    cm_memcached_call_time_seconds    histogram

    Amount of time to call memcached

    cm_memcached_cmd_total    gauge

    Total number of memcached commands

    cm_memcached_e2e_call_time_seconds    histogram

    End to end memcached call times

    cm_memcached_q_call_time_seconds    histogram / seconds

    Memcached queue call times

    cm_mru_cache_add_time_seconds    histogram / seconds

    Time to add to MRU cache

    cm_mru_cache_flush_time_seconds    histogram / seconds

    Time to flush MRU cache

    cm_mru_cache_lock_time_seconds    histogram / seconds

    Time to lock MRU cache

    cm_mru_cache_lookup_time_seconds    histogram / seconds

    Time to perform a lookup in the MRU cache

    cm_mru_cache_lookup_total    counter

    Total number of MRU cache lookups

    cm_mru_cache_take_lock_total    counter

    Total number of times MRU cache lock was obtained

    cm_odp_report_failed    counter

    Number of failures to send on-demand pricing report

    cm_outgoing_http_requests_seconds    histogram / seconds

    Time taken for outgoing HTTP requests

    cm_outgoing_http_requests_total    counter

    Total number of outgoing HTTP requests

    cm_rebalance_in_progress    gauge

    Indicates if a rebalance is running (1 = true, 0 = false). Only reported by the orchestrator node.

    cm_rebalance_progress    gauge / ratio

    Estimate of the rebalance progress (0 - 1) for each stage. Only reported by the orchestrator node.

    cm_rebalance_total    counter

    Number of rebalance results

    cm_request_hibernates_total    counter

    Number of times requests were hibernated

    cm_request_unhibernates_total    counter

    Number of times requests were unhibernated

    cm_rest_request_access_forbidden_total    counter

    Number of REST requests failing due to inadequate permissions

    cm_rest_request_auth_failure_total    counter

    Number of REST requests failing authentication

    cm_rest_request_enters_total    counter

    Number of REST requests to enter ns_server

    cm_rest_request_failure_total    counter

    Number of REST requests failing (see specific code)

    cm_rest_request_leaves_total    counter

    Number of REST requests to exit ns_server

    cm_status_latency_seconds    histogram / seconds

    Latency time for status

    cm_up_cache_current_items    gauge

    Current number of items available in cbauth up cache

    cm_up_cache_hit_total    counter

    Total number of cbauth up cache hits

    cm_up_cache_max_items    gauge

    Maximum capacity of cbauth up cache

    cm_up_cache_miss_total    counter

    Total number of cbauth up cache misses

    cm_user_bkts_cache_current_items    gauge

    Current number of items available in cbauth bkts cache

    cm_user_bkts_cache_hit_total    counter

    Total number of cbauth bkts cache hits

    cm_user_bkts_cache_max_items    gauge

    Maximum capacity of cbauth bkts cache

    cm_user_bkts_cache_miss_total    counter

    Total number of cbauth bkts cache misses

    cm_uuid_cache_current_items    gauge

    Current number of items available in cbauth uuid cache

    cm_uuid_cache_hit_total    counter

    Total number of cbauth uuid cache hits

    cm_uuid_cache_max_items    gauge

    Maximum capacity of cbauth uuid cache

    cm_uuid_cache_miss_total    counter

    Total number of cbauth uuid cache misses

    cm_web_cache_hits_total    counter

    Total number of web cache hits

    cm_web_cache_inner_hits_total    counter

    Total number of inner web cache hits

    cm_web_cache_updates_total    counter

    Total number of web cache updates

    couch_docs_actual_disk_size    gauge / bytes

    Amount of disk space used by the Data Service

    couch_views_actual_disk_size    gauge / bytes

    Amount of disk space used by Views data

    sys_allocstall    counter

    Number of alloc stalls

    sys_cpu_burst_rate    Deprecated gauge

    Rate at which CPUs overran their quota

    sys_cpu_cgroup_seconds_total    counter / seconds

    Number of CPU seconds utilized in the cgroup, by mode

    sys_cpu_cgroup_usage_seconds_total    counter / seconds

    Number of 'user' and 'system' CPU seconds utilized in the cgroup

    sys_cpu_cores_available    gauge

    Number of available CPU cores in the control group

    sys_cpu_host_cores_available    gauge

    Number of available CPU cores in the host

    sys_cpu_host_idle_rate    Deprecated gauge

    Idle CPU utilization rate in the host

    sys_cpu_host_other_rate    Deprecated gauge

    Other (not idle/user/sys/irq/stolen) CPU utilization rate in the host

    sys_cpu_host_seconds_total    counter / seconds

    Number of CPU seconds utilized in the host, by mode

    sys_cpu_host_sys_rate    Deprecated gauge

    System CPU utilization rate in the host

    sys_cpu_host_user_rate    Deprecated gauge

    User space CPU utilization rate in the host

    sys_cpu_host_utilization_rate    Deprecated gauge

    CPU utilization rate in the host

    sys_cpu_irq_rate    Deprecated gauge

    IRQ rate

    sys_cpu_stolen_rate    Deprecated gauge

    CPU stolen rate

    sys_cpu_sys_rate    Deprecated gauge

    System CPU utilization rate in the control group

    sys_cpu_throttled_rate    Deprecated gauge

    Rate at which CPUs were throttled

    sys_cpu_user_rate    Deprecated gauge

    User space CPU utilization rate in the control group

    sys_cpu_utilization_rate    Deprecated gauge

    CPU utilization rate in the control group
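
Several of the CPU rate gauges above are marked Deprecated, while `sys_cpu_host_seconds_total` and `sys_cpu_cgroup_seconds_total` are counters of CPU seconds. One way to obtain a utilization percentage from such a counter is to divide the change in CPU seconds by the wall-clock time available across cores. This page does not state that the counters are the intended replacement for the deprecated gauges, so treat that as an assumption; the function name and numbers are hypothetical.

```python
def cpu_utilization_percent(prev_seconds, curr_seconds, interval_seconds, cores):
    """Percent of available CPU time used between two samples of a
    CPU-seconds counter, given the sampling interval and core count."""
    if interval_seconds <= 0 or cores <= 0:
        raise ValueError("interval_seconds and cores must be positive")
    delta = curr_seconds - prev_seconds
    if delta < 0:
        return None  # counter reset between samples
    return 100.0 * delta / (interval_seconds * cores)


# 30 CPU seconds consumed in a 10 second window on 4 cores
utilization = cpu_utilization_percent(1000.0, 1030.0, 10.0, 4)
```

The core count could come from `sys_cpu_host_cores_available` or `sys_cpu_cores_available`, depending on whether the host or the control group counter is sampled.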

    sys_disk_queue    gauge

    Current disk queue length of the disk

    sys_disk_queue_depth    gauge

    Maximum disk queue length of the disk

    sys_disk_read_bytes    counter

    Number of bytes read by the disk

    sys_disk_read_time_seconds    counter

    Amount of time that the disk spent reading

    sys_disk_reads    counter

    Number of reads that the disk performed

    sys_disk_time_seconds    counter

    Amount of time that the disk spent performing IO

    sys_disk_write_bytes    counter

    Number of bytes written by the disk

    sys_disk_write_time_seconds    counter

    Amount of time that the disk spent writing

    sys_disk_writes    counter

    Number of writes that the disk performed

    sys_mem_actual_free    gauge / bytes

    Amount of system memory available, including buffers/cache

    sys_mem_actual_used    gauge / bytes

    Amount of system memory used, excluding buffers/cache

    sys_mem_cgroup_actual_used    gauge / bytes

    Amount of system memory used, excluding buffers/cache, in the control group

    sys_mem_cgroup_limit    gauge / bytes

    System memory limit, in the control group

    sys_mem_cgroup_used    gauge / bytes

    Amount of system memory used, including buffers/cache, in the control group

    sys_mem_free    gauge / bytes

    Amount of system memory free, excluding buffers/cache

    sys_mem_limit    Deprecated gauge / bytes

    System memory limit

    sys_mem_total    gauge / bytes

    Total amount of system memory

    sys_mem_used_sys    gauge / bytes

    Amount of system memory used, including buffers/cache

    sys_pressure_share_time_stalled    gauge

    Percentage of time that tasks were stalled on a given resource

    sys_pressure_total_stall_time_usec    counter / microseconds

    Absolute stall time when tasks were stalled on a given resource
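
Because `sys_pressure_total_stall_time_usec` is a cumulative counter in microseconds, the share of an interval spent stalled can be derived from two samples. A minimal sketch, assuming both samples refer to the same resource; the function name and numbers are hypothetical.

```python
MICROSECONDS_PER_SECOND = 1_000_000


def stall_share_percent(prev_usec, curr_usec, interval_seconds):
    """Percent of the sampling interval during which tasks were stalled,
    computed from two samples of a cumulative stall-time counter."""
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta_usec = curr_usec - prev_usec
    if delta_usec < 0:
        return None  # counter reset between samples
    return 100.0 * delta_usec / (interval_seconds * MICROSECONDS_PER_SECOND)


# 0.5 seconds of stall time in a 10 second window
stall_percent = stall_share_percent(1_000_000, 1_500_000, 10)
```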

    sys_swap_total    gauge / bytes

    Total amount of swap space

    sys_swap_used    gauge / bytes

    Amount of swap space used

    sysproc_cpu_seconds_total    counter

    Amount of user CPU cycles used, by process

    sysproc_cpu_utilization    Deprecated gauge

    CPU utilization rate, by process

    sysproc_major_faults_raw    counter

    Number of major page faults, by process

    sysproc_mem_resident    gauge / bytes

    Amount of resident memory used, by process

    sysproc_mem_share    gauge / bytes

    Amount of shared memory used, by process

    sysproc_mem_size    gauge / bytes

    Amount of memory used, by process

    sysproc_minor_faults_raw    gauge

    Number of minor page faults, by process

    sysproc_page_faults_raw    gauge

    Number of page faults, by process

    sysproc_start_time    counter

    OS specific time when process was started