Prometheus Metrics Reference

    March 23, 2025
    This page lists the metrics that the Couchbase Autonomous Operator supplies to Prometheus, and links to reference pages for additional metrics exported by third-party libraries.

    Operator Metrics

    | Metric | Description | Type | Unit | Labels | Optional Labels | Stability | Added |
    |---|---|---|---|---|---|---|---|
    | backup_jobs_created_total | Total number of backup jobs that have been created by the operator | counter | | namespace,backup_type | cluster_uuid,cluster_name | committed | 2.8.0 |
    | cpu_under_management | Total CPU requests for operator-managed pods, in Kubernetes CPU units | gauge | | namespace,name | cluster_uuid,cluster_name | committed | 2.8.0 |
    | in_place_upgrade_failures | Number of times in-place upgrades have failed | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
    | in_place_upgrades_total | Total number of in-place upgrades performed by the operator | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
    | kubernetes_api_request_failures | Total failed requests to the Kubernetes API by the operator | counter | | method,host,path | | committed | 2.8.0 |
    | kubernetes_api_requests_time_milliseconds | Length of time per request to the Kubernetes API | histogram | milliseconds | method,host,path | | committed | 2.8.0 |
    | kubernetes_api_requests_total | Total requests made to the Kubernetes API by the operator | counter | | method,host,path | | committed | 2.8.0 |
    | memory_under_management_bytes | Total memory requests for operator-managed pods, in bytes | gauge | bytes | namespace,name | cluster_uuid,cluster_name | committed | 2.8.0 |
    | pod_readiness_duration | Time it takes for a pod to enter a ready state | gauge | milliseconds | name,serverClass | cluster_uuid,cluster_name | committed | 2.7.0 |
    | pod_recoveries_total | Total number of times the operator has recovered a pod that was down | counter | | name,podName | cluster_uuid,cluster_name | committed | 2.7.0 |
    | pod_recovery_failures_total | Total number of times the operator has failed to recover a pod | counter | | name,podName | cluster_uuid,cluster_name | committed | 2.7.0 |
    | pod_replacements_failed | Total number of times pods have failed to be recovered by the operator | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
    | pod_replacements_total | Number of times the operator has replaced a Couchbase Server pod due to a change in a Couchbase cluster resource | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
    | reconcile_failures | Total failed reconcile operations performed on a specific cluster | counter | | namespace,name | cluster_uuid,cluster_name | committed | 2.3.0 |
    | reconcile_time_seconds | Length of time per reconcile for a specific cluster | histogram | seconds | namespace,name | cluster_uuid,cluster_name | committed | 2.3.0 |
    | reconcile_total | Total reconcile operations performed on a specific cluster | counter | | namespace,name,result | cluster_uuid,cluster_name | committed | 2.3.0 |
    | server_http_request_codes_total | Total HTTP requests to Couchbase Server for a specific cluster, method, and returned status code | counter | | name,method,code,service,host | name,namespace | committed | 2.3.0 |
    | server_http_request_failures | Total failed HTTP requests to Couchbase Server for a specific cluster | counter | | name,method,service,host | name,namespace | committed | 2.3.0 |
    | server_http_requests_time_milliseconds | Length of time per request for a specific cluster | histogram | milliseconds | name,method,service,host | name,namespace | committed | 2.3.0 |
    | server_http_requests_total | Total HTTP requests to Couchbase Server for a specific cluster | counter | | name,method,service,host | name,namespace | committed | 2.3.0 |
    | swap_rebalance_failures | Total number of times swap rebalances have failed | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
    | swap_rebalances_total | Total number of swap rebalances performed by the operator | counter | | name | cluster_uuid,cluster_name | committed | 2.7.0 |
    | upgrade_duration | Time taken to perform an upgrade | | milliseconds | name | cluster_uuid,cluster_name | committed | 2.7.0 |
    | volume_expansions_total | Total number of times the size of a volume under management has been increased | counter | | name,volumeName | cluster_uuid,cluster_name | committed | 2.7.0 |
    | volume_size_under_management_bytes | Total memory claimed by volumes under management by the operator, in bytes | gauge | bytes | namespace,name | cluster_uuid,cluster_name | committed | 2.8.0 |
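
As an illustration of how the counters and histograms above might be consumed, the following Python sketch parses a small piece of Prometheus text exposition output and derives two figures: the reconcile failure ratio for one cluster, and the mean Kubernetes API request latency. It is a minimal sketch under stated assumptions, not the operator's own tooling. The sample text, label values (`default`, `cb-example`, the `result` values, host and path), and the helper names `parse_samples`, `failure_ratio`, and `mean_latency_ms` are all hypothetical, and the metric names are assumed to be exposed exactly as listed in the table; a real deployment may expose them under a prefix. The `_sum` and `_count` series are the standard companions that Prometheus histograms expose.

```python
import re

# Hypothetical scrape output: every label value and number here is made up.
SAMPLE = """\
# TYPE reconcile_total counter
reconcile_total{namespace="default",name="cb-example",result="success"} 95
reconcile_total{namespace="default",name="cb-example",result="failure"} 5
# TYPE reconcile_failures counter
reconcile_failures{namespace="default",name="cb-example"} 5
# TYPE kubernetes_api_requests_time_milliseconds histogram
kubernetes_api_requests_time_milliseconds_sum{method="GET",host="10.0.0.1",path="/api/v1/pods"} 1200
kubernetes_api_requests_time_milliseconds_count{method="GET",host="10.0.0.1",path="/api/v1/pods"} 40
"""

# Simplified patterns: sample lines with a trailing timestamp are not handled.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)$')
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = LINE_RE.match(line)
        if match is None:
            continue  # ignore anything this simplified parser cannot read
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def failure_ratio(samples, namespace, name):
    """reconcile_failures divided by reconcile_total for one cluster."""
    total = 0.0
    failures = 0.0
    for metric, labels, value in samples:
        if labels.get("namespace") != namespace or labels.get("name") != name:
            continue
        if metric == "reconcile_total":
            total += value  # summed across all values of the result label
        elif metric == "reconcile_failures":
            failures += value
    return failures / total if total else None


def mean_latency_ms(samples, metric):
    """Mean observation of a histogram, computed from its _sum and _count."""
    total_ms = sum(v for n, _, v in samples if n == metric + "_sum")
    count = sum(v for n, _, v in samples if n == metric + "_count")
    return total_ms / count if count else None


if __name__ == "__main__":
    parsed = parse_samples(SAMPLE)
    print(failure_ratio(parsed, "default", "cb-example"))
    print(mean_latency_ms(parsed, "kubernetes_api_requests_time_milliseconds"))
```

Because counters only ever increase, a ratio computed from raw values covers the whole lifetime of the operator process. For alerting, the same calculation is usually applied to the increase over a time window instead.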

    Additional Metrics