Monitoring Reference

    This reference lists the metric graphs displayed in the Capella AI Services UI Monitoring dashboards.

    The Capella AI Services Monitoring dashboards display a set of metric graphs for unstructured data, structured data, and models, enabling users to monitor AI Services performance in real time.

    For more information about Capella’s Monitoring dashboards, see View Monitoring Dashboards.

    This monitoring reference lists:

    • The Graph Name as displayed in the Capella UI.

    • A Description of what the metric graph shows.

    • The Metric expression used to calculate the graph.

    These monitoring dashboards offer the following metrics:

    Unstructured Data

    All the metrics shown in the Unstructured Data dashboard.

    Each entry lists, in order: Graph Name, Description, Metric.

    Documents in Workflow

    Total number of JSON documents written to KV Store in a workflow.

    sum by (workflowId) (uds_workflow_number_of_documents_loaded{databaseId="<databaseId>",tenantId="<tenantId>"})

    Number of Workflow Operations

    Total number of successful files in workflow operations.

    sum by (workflowId) (uds_workflow_number_of_files_succeeded_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})

    Number of Failed Workflow Operations

    Total number of failed files in workflow operations.

    sum by (workflowId) (uds_workflow_number_of_files_failed_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})

    Pages in Workflow

    Total number of pages processed in a workflow.

    sum by (workflowId) (uds_workflow_number_of_pages_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})

    Pending Files in Workflow

    Total number of pending files to be processed in a workflow.

    sum by (workflowId) (uds_workflow_number_of_files_pending_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})

    Total Files in Workflow

    Total number of files in a workflow.

    sum by (workflowId) (uds_workflow_number_of_files_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})
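
    Every metric expression in this reference contains the placeholders <databaseId> and <tenantId>. The following is a minimal sketch, not part of the Capella product, of how those placeholders could be filled in before reusing an expression elsewhere. The helper names and the example IDs are hypothetical, and the escaping assumes the expressions follow PromQL string rules.

```python
def escape_label_value(value: str) -> str:
    """Escape backslashes and double quotes for a double-quoted label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def fill_placeholders(expr: str, database_id: str, tenant_id: str) -> str:
    """Substitute the <databaseId> and <tenantId> placeholders in an expression."""
    return (
        expr.replace("<databaseId>", escape_label_value(database_id))
            .replace("<tenantId>", escape_label_value(tenant_id))
    )


# Expression copied from the "Documents in Workflow" entry above.
DOCUMENTS_IN_WORKFLOW = (
    "sum by (workflowId) (uds_workflow_number_of_documents_loaded"
    '{databaseId="<databaseId>",tenantId="<tenantId>"})'
)

# "db-123" and "tenant-456" are made-up IDs used only for illustration.
query = fill_placeholders(DOCUMENTS_IN_WORKFLOW, "db-123", "tenant-456")
print(query)
```

    The same helper applies unchanged to any other expression in this reference, because they all use the same two placeholders.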

    Structured Data

    All the metrics shown in the Structured Data monitoring dashboard.

    Each entry lists, in order: Graph Name, Description, Metric.

    Average Embedding Response Latency

    Average of embedding service response times, in seconds, for requests made in a workflow.

    avg by (workflowId) (embedding_service_response_duration_seconds{databaseId="<databaseId>",tenantId="<tenantId>"})

    Number of Embeddings Written

    Total number of embedding write attempts to source documents that were successful in a workflow.

    sum by (workflowId) (embedding_writes_total{databaseId="<databaseId>",status="success",tenantId="<tenantId>"})

    Number of Queries Errored Out

    Total number of embedding service request failures and failed embedding writes in a workflow.

    sum by (workflowId) (embedding_service_failures_total{databaseId="<databaseId>",tenantId="<tenantId>",type=~"timeout|4xx|5xx"} + embedding_writes_total{cause=~"timeout|too_big|other",databaseId="<databaseId>",status="failure",tenantId="<tenantId>"})

    Number of Failed Mutations

    Total number of mutations processed unsuccessfully in a workflow.

    sum by (workflowId) (mutations_processed_total{cause=~"too_big|other",databaseId="<databaseId>",tenantId="<tenantId>",type="failure"})

    Number of Requests

    Total number of batch requests sent to the embedding model in a workflow.

    sum by (workflowId) (batch_requests_total{databaseId="<databaseId>",tenantId="<tenantId>",type=~"full|timeout_triggered"})

    Number of Requests per Second

    Total number of batch requests sent to the embedding model per second in a workflow.

    sum by (workflowId) (rate(batch_requests_total{databaseId="<databaseId>",tenantId="<tenantId>",type=~"full|timeout_triggered"}[5m]))

    Workflow Write Success Rate

    Percentage of embedding write attempts to source documents that were successful in a workflow.

    avg by (workflowId) ((embedding_writes_total{databaseId="<databaseId>",status="success",tenantId="<tenantId>"} / embedding_writes_total{databaseId="<databaseId>",tenantId="<tenantId>"}) * 100)

    Number of Successful Mutations

    Total number of mutations processed successfully in a workflow.

    sum by (workflowId) (mutations_processed_total{databaseId="<databaseId>",tenantId="<tenantId>",type="success"})

    Number of Tokens Processed

    Total number of tokens consumed in a workflow.

    sum by (workflowId) (tokens_processed_total{databaseId="<databaseId>",tenantId="<tenantId>"})

    Number of Documents

    Total number of documents processed in a workflow.

    sum by (workflowId) (total_docs{databaseId="<databaseId>",tenantId="<tenantId>"})
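
    The Workflow Write Success Rate graph reports successful embedding writes as a share of all write attempts, grouped by workflow. The sketch below illustrates only that arithmetic on made-up sample data. It is not how the dashboard evaluates the expression, and the (workflow_id, status, count) tuple layout is a hypothetical flattened view of embedding_writes_total.

```python
from collections import defaultdict


def write_success_rate(samples):
    """Return {workflow_id: percentage of embedding writes that succeeded}.

    `samples` is an iterable of (workflow_id, status, count) tuples
    (hypothetical layout, chosen for this example).
    """
    success = defaultdict(float)
    total = defaultdict(float)
    for workflow_id, status, count in samples:
        total[workflow_id] += count
        if status == "success":
            success[workflow_id] += count
    # Skip workflows with no write attempts to avoid dividing by zero.
    return {wf: 100 * success[wf] / total[wf] for wf in total if total[wf] > 0}


# Made-up counts for two workflows.
samples = [
    ("wf-a", "success", 90),
    ("wf-a", "failure", 10),
    ("wf-b", "success", 45),
    ("wf-b", "failure", 15),
]
rates = write_success_rate(samples)
print(rates)
```

    With these sample counts, wf-a has 90 successes out of 100 attempts and wf-b has 45 out of 60.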

    Models

    All the metrics shown in the Model monitoring dashboard.

    Each entry lists, in order: Graph Name, Description, Metric.

    Cache Hit Rate

    Number of cache hits as a percentage of total cache requests per node.

    sum by (couchbaseNode) (ai_model_service_gateway_cache_hits{databaseId="<databaseId>",tenantId="<tenantId>"} / (ai_model_service_gateway_cache_hits{databaseId="<databaseId>",tenantId="<tenantId>"} + ai_model_service_gateway_cache_misses{databaseId="<databaseId>",tenantId="<tenantId>"}) * 100)

    Cache Hits

    Number of cache hits per node.

    sum by (couchbaseNode) (ai_model_service_gateway_cache_hits{databaseId="<databaseId>",tenantId="<tenantId>"})

    Cache Misses

    Number of cache misses per node.

    sum by (couchbaseNode) (ai_model_service_gateway_cache_misses{databaseId="<databaseId>",tenantId="<tenantId>"})

    Cache Completion Tokens

    Number of tokens generated from cache (response tokens) per node.

    sum by (couchbaseNode) (ai_model_service_gateway_cached_completion_tokens{databaseId="<databaseId>",tenantId="<tenantId>"})

    CPU Usage

    CPU utilization percentage per node.

    sum by (couchbaseNode) (node_cpu_util_rate{databaseId="<databaseId>",tenantId="<tenantId>"} * 100)

    Disk Usage

    Total disk space currently consumed on each node.

    sum by (couchbaseNode) (node_disk_used{databaseId="<databaseId>",tenantId="<tenantId>"})

    Error Rate Trends

    Number of queries that resulted in errors per second per node.

    sum by (couchbaseNode) (rate(ai_model_service_gateway_error_count{databaseId="<databaseId>",tenantId="<tenantId>"}[5m]))

    Error Count

    Number of queries that resulted in errors per node.

    sum by (couchbaseNode) (ai_model_service_gateway_error_count{databaseId="<databaseId>",tenantId="<tenantId>"})

    Guardrail Violations

    Number of guardrail violations detected per node.

    sum by (couchbaseNode) (ai_model_service_gateway_guardrail_violations{databaseId="<databaseId>",tenantId="<tenantId>"})

    Processed Prompt Tokens

    Number of tokens processed (prompt tokens) per node.

    sum by (couchbaseNode) (ai_model_service_gateway_prompt_tokens{databaseId="<databaseId>",tenantId="<tenantId>"})

    API Requests

    Total number of API requests processed per node.

    sum by (couchbaseNode) (ai_model_service_gateway_total_requests_count{databaseId="<databaseId>",tenantId="<tenantId>"})

    Token Generation Rate

    Number of tokens generated (response tokens) per second per node.

    sum by (couchbaseNode) (rate(ai_model_service_gateway_completion_tokens{databaseId="<databaseId>",tenantId="<tenantId>"}[5m]))
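
    Two kinds of derived values appear in the Models dashboard: a percentage (Cache Hit Rate) and per-second rates over a 5-minute window (Error Rate Trends, Token Generation Rate). The sketch below shows the underlying arithmetic on made-up numbers. The function names are hypothetical, and simple_rate is a deliberate simplification: assuming the expressions follow PromQL semantics, the real rate() function also handles counter resets and extrapolates to the window boundaries, which this sketch does not.

```python
def cache_hit_rate(hits: float, misses: float) -> float:
    """Cache hits as a percentage of all cache requests (hits + misses)."""
    total = hits + misses
    if total == 0:
        # Returning 0.0 for an idle cache is a choice made for this sketch.
        return 0.0
    return 100 * hits / total


def simple_rate(samples):
    """Average per-second increase of a counter across a window.

    `samples` is a time-sorted list of (timestamp_seconds, counter_value).
    Counter resets and boundary extrapolation are ignored.
    """
    if len(samples) < 2:
        return 0.0
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    return (v1 - v0) / (t1 - t0)


# Made-up values: 80 hits and 20 misses on one node.
hit_rate = cache_hit_rate(80, 20)

# Made-up counter samples spanning a 5-minute (300-second) window.
window = [(0, 1000), (150, 1600), (300, 2500)]
tokens_per_second = simple_rate(window)

print(hit_rate, tokens_per_second)
```

    Here the counter grows by 1500 over 300 seconds, so the sketch reports the average growth per second across the whole window rather than the growth between any two adjacent samples.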