Monitoring Reference

This reference lists the metric graphs displayed in the Capella AI Services UI Monitoring dashboards.

The Capella AI Services Monitoring dashboards display a set of metric graphs for unstructured data, structured data, and models, enabling users to monitor AI Services performance in real time.

For more information about Capella’s Monitoring dashboards, see View Monitoring Dashboards.

This monitoring reference lists:

The Graph Name as displayed in the Capella UI.
A Description of what the metric graph shows.
The Metric expression used to calculate the graph.

These monitoring dashboards offer the following metrics:

Unstructured Data

All the metrics shown in the Unstructured Data dashboard.

Graph Name	Description	Metric
Documents in Workflow	Total number of JSON documents written to KV Store in a workflow.	`sum by (workflowId) (uds_workflow_number_of_documents_loaded{databaseId="<databaseId>",tenantId="<tenantId>"})`
Number of Workflow Operations	Total number of files processed successfully in workflow operations.	`sum by (workflowId) (uds_workflow_number_of_files_succeeded_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})`
Number of Failed Workflow Operations	Total number of files that failed in workflow operations.	`sum by (workflowId) (uds_workflow_number_of_files_failed_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})`
Pages in Workflow	Total number of pages processed in a workflow.	`sum by (workflowId) (uds_workflow_number_of_pages_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})`
Pending Files in Workflow	Total number of pending files to be processed in a workflow.	`sum by (workflowId) (uds_workflow_number_of_files_pending_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})`
Total Files in Workflow	Total number of files in a workflow.	`sum by (workflowId) (uds_workflow_number_of_files_in_workflow{databaseId="<databaseId>",tenantId="<tenantId>"})`
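
The queries in this reference use `<databaseId>` and `<tenantId>` as placeholders. As a minimal sketch, assuming these stand for the IDs of your own database and tenant, a small Python helper could fill them into a query string before it is used elsewhere. The helper name `fill_placeholders` and the IDs shown are hypothetical and are not part of Capella.

```python
# Hypothetical helper: fills the <databaseId> and <tenantId> placeholders
# used by the queries in this reference. The IDs below are made-up examples.
TEMPLATE = (
    'sum by (workflowId) (uds_workflow_number_of_files_pending_in_workflow'
    '{databaseId="<databaseId>",tenantId="<tenantId>"})'
)


def fill_placeholders(template: str, database_id: str, tenant_id: str) -> str:
    """Return the query with both placeholders replaced by concrete IDs."""
    for value in (database_id, tenant_id):
        # Reject characters that would break out of the quoted label value.
        if '"' in value or "\\" in value:
            raise ValueError("IDs must not contain quotes or backslashes")
    return (
        template.replace("<databaseId>", database_id)
        .replace("<tenantId>", tenant_id)
    )


query = fill_placeholders(TEMPLATE, "db-1234", "tenant-5678")
print(query)
```

The same helper can be applied to any other expression in the Metric column, because every query uses the same two placeholders.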

Structured Data

All the metrics shown in the Structured Data monitoring dashboard.

Graph Name	Description	Metric
Average Embedding Response Latency	Histogram of embedding service response times for requests made in a workflow.	`avg by (workflowId) (embedding_service_response_duration_seconds{databaseId="<databaseId>",tenantId="<tenantId>"})`
Number of Embeddings Written	Total number of embedding write attempts to source documents that were successful in a workflow.	`sum by (workflowId) (embedding_writes_total{databaseId="<databaseId>",status="success",tenantId="<tenantId>"})`
Number of Queries Errored Out	Number of queries processed successfully vs number of queries errored out in a workflow.	`sum by (workflowId) (embedding_service_failures_total{databaseId="<databaseId>",tenantId="<tenantId>",type=~"timeout|4xx|5xx"} + embedding_writes_total{cause=~"timeout|too_big|other",databaseId="<databaseId>",status="failure",tenantId="<tenantId>"})`
Number of Failed Mutations	Total number of mutations processed unsuccessfully in a workflow.	`sum by (workflowId) (mutations_processed_total{cause=~"too_big|other",databaseId="<databaseId>",tenantId="<tenantId>",type="failure"})`
Number of Requests	Total number of batch requests sent to the embedding model in a workflow.	`sum by (workflowId) (batch_requests_total{databaseId="<databaseId>",tenantId="<tenantId>",type=~"full|timeout_triggered"})`
Number of Requests per Second	Total number of batch requests sent to the embedding model per second in a workflow.	`sum by (workflowId) (rate(batch_requests_total{databaseId="<databaseId>",tenantId="<tenantId>",type=~"full|timeout_triggered"}[5m]))`
Workflow Write Success Rate	Percentage of embedding write attempts to source documents that were successful in a workflow.	`avg by (workflowId) ((embedding_writes_total{databaseId="<databaseId>",status="success",tenantId="<tenantId>"} / embedding_writes_total{databaseId="<databaseId>",tenantId="<tenantId>"}) * 100)`
Number of Successful Mutations	Total number of mutations processed successfully in a workflow.	`sum by (workflowId) (mutations_processed_total{databaseId="<databaseId>",tenantId="<tenantId>",type="success"})`
Number of Tokens Processed	Total number of tokens consumed in a workflow.	`sum by (workflowId) (tokens_processed_total{databaseId="<databaseId>",tenantId="<tenantId>"})`
Number of Documents	Total number of documents processed in a workflow.	`sum by (workflowId) (total_docs{databaseId="<databaseId>",tenantId="<tenantId>"})`
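
The per-second graphs wrap a counter in `rate(...[5m])`, which reports the average per-second increase of that counter over the last 5 minutes. The Python sketch below approximates that calculation to show what the number means. It is a simplification, not how the dashboard is implemented: it handles counter resets but skips the extrapolation to the window edges that PromQL's `rate()` performs. The function name `approx_rate` and the sample values are hypothetical.

```python
from typing import Sequence, Tuple


def approx_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Approximate per-second increase of a counter.

    samples: (timestamp_seconds, counter_value) pairs in time order.
    A drop in value is treated as a counter reset, as PromQL rate() does.
    Unlike rate(), this sketch does not extrapolate to the window edges.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    increase = 0.0
    for (_, prev), (_, curr) in zip(samples, samples[1:]):
        # After a reset the counter restarts from zero, so the new value
        # itself is the increase since the reset.
        increase += curr - prev if curr >= prev else curr
    elapsed = samples[-1][0] - samples[0][0]
    if elapsed <= 0:
        raise ValueError("last timestamp must be later than the first")
    return increase / elapsed


# Hypothetical samples of batch_requests_total across a 5-minute window,
# including one counter reset between t=120 and t=180.
samples = [(0, 100.0), (60, 130.0), (120, 160.0),
           (180, 10.0), (240, 40.0), (300, 70.0)]
print(approx_rate(samples))
```

In this example the counter grows by 130 requests in 300 seconds, so the graph would show a value a little above 0.43 requests per second for that window.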

Models

All the metrics shown in the Model monitoring dashboard.

Graph Name	Description	Metric
Cache Hit Rate	Number of cache hits as a percentage of total cache requests per node.	`sum by (couchbaseNode) ((ai_model_service_gateway_cache_hits{databaseId="<databaseId>",tenantId="<tenantId>"} / (ai_model_service_gateway_cache_hits{databaseId="<databaseId>",tenantId="<tenantId>"} + ai_model_service_gateway_cache_misses{databaseId="<databaseId>",tenantId="<tenantId>"})) * 100)`
Cache Hits	Number of cache hits per node.	`sum by (couchbaseNode) (ai_model_service_gateway_cache_hits{databaseId="<databaseId>",tenantId="<tenantId>"})`
Cache Misses	Number of cache misses per node.	`sum by (couchbaseNode) (ai_model_service_gateway_cache_misses{databaseId="<databaseId>",tenantId="<tenantId>"})`
Cache Completion Tokens	Number of tokens generated from cache (response tokens) per node.	`sum by (couchbaseNode) (ai_model_service_gateway_cached_completion_tokens{databaseId="<databaseId>",tenantId="<tenantId>"})`
CPU Usage	CPU utilization percentage per node.	`sum by (couchbaseNode) (node_cpu_util_rate{databaseId="<databaseId>",tenantId="<tenantId>"} * 100)`
Disk Usage	Total disk space currently consumed on each node.	`sum by (couchbaseNode) (node_disk_used{databaseId="<databaseId>",tenantId="<tenantId>"})`
Error Rate Trends	Number of queries that resulted in errors per second per node.	`sum by (couchbaseNode) (rate(ai_model_service_gateway_error_count{databaseId="<databaseId>",tenantId="<tenantId>"}[5m]))`
Error Count	Number of queries that resulted in errors per node.	`sum by (couchbaseNode) (ai_model_service_gateway_error_count{databaseId="<databaseId>",tenantId="<tenantId>"})`
Guardrail Violations	Number of guardrail violations detected per node.	`sum by (couchbaseNode) (ai_model_service_gateway_guardrail_violations{databaseId="<databaseId>",tenantId="<tenantId>"})`
Processed Prompt Tokens	Number of tokens processed (prompt tokens) per node.	`sum by (couchbaseNode) (ai_model_service_gateway_prompt_tokens{databaseId="<databaseId>",tenantId="<tenantId>"})`
API Requests	Total number of API requests processed per node.	`sum by (couchbaseNode) (ai_model_service_gateway_total_requests_count{databaseId="<databaseId>",tenantId="<tenantId>"})`
Token Generation Rate	Number of tokens generated (response tokens) per second per node.	`sum by (couchbaseNode) (rate(ai_model_service_gateway_completion_tokens{databaseId="<databaseId>",tenantId="<tenantId>"}[5m]))`
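
Cache Hit Rate is described as cache hits as a percentage of total cache requests, where total requests are hits plus misses. The Python sketch below works through that arithmetic for two nodes. The function name `cache_hit_rate`, the node names, and the counter values are hypothetical, and returning 0 when a node has seen no cache requests is an assumption made for this example.

```python
def cache_hit_rate(hits: float, misses: float) -> float:
    """Cache hits as a percentage of all cache requests (hits + misses)."""
    total = hits + misses
    if total == 0:
        return 0.0  # assumption: report 0% when no cache requests were seen
    return hits * 100 / total


# Hypothetical per-node (hits, misses) counter values.
per_node = {"node-a": (80, 20), "node-b": (45, 55)}
rates = {node: cache_hit_rate(h, m) for node, (h, m) in per_node.items()}
print(rates)  # {'node-a': 80.0, 'node-b': 45.0}
```

Note that the multiplication by 100 applies to the whole ratio. Multiplying only the miss count would inflate the denominator and give a value that is no longer a percentage.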
