Monitoring Reference
This reference lists the metric graphs displayed in the Capella AI Services UI Monitoring dashboards.
The Capella AI Services Monitoring dashboards display a set of metric graphs for unstructured data, structured data, and models, enabling users to monitor AI Services performance in real time.
For more information about Capella’s Monitoring dashboards, see View Monitoring Dashboards.
This monitoring reference lists:
- The Graph Name as displayed in the Capella UI.
- A Description of what the metric graph shows.
- The Metric calculation method for the graph.
These monitoring dashboards offer the following metrics:
Unstructured Data
All the metrics shown in the Unstructured Data dashboard.
| Graph Name | Description | Metric |
|---|---|---|
| Documents in Workflow | Total number of JSON documents written to KV Store in a workflow. | |
| Number of Workflow Operations | Total number of files processed successfully in workflow operations. | |
| Number of Failed Workflow Operations | Total number of files that failed in workflow operations. | |
| Pages in Workflow | Total number of pages processed in a workflow. | |
| Pending Files in Workflow | Total number of files waiting to be processed in a workflow. | |
| Total Files in Workflow | Total number of files in a workflow. | |
Structured Data
All the metrics shown in the Structured Data monitoring dashboard.
| Graph Name | Description | Metric |
|---|---|---|
| Average Embedding Response Latency | Histogram of embedding service response times for requests made in a workflow. | |
| Number of Embeddings Written | Total number of embedding write attempts to source documents that were successful in a workflow. | |
| Number of Queries Errored Out | Number of queries processed successfully vs number of queries errored out in a workflow. | |
| Number of Failed Mutations | Total number of mutations processed unsuccessfully in a workflow. | |
| Number of Requests | Total number of batch requests sent to the embedding model in a workflow. | |
| Number of Requests per Second | Total number of batch requests sent to the embedding model per second in a workflow. | |
| Workflow Write Success Rate | Total number of embedding write attempts to source documents that were successful in a workflow. | |
| Number of Successful Mutations | Total number of mutations processed successfully in a workflow. | |
| Number of Tokens Processed | Total number of tokens consumed in a workflow. | |
| Number of Documents | Total number of documents processed in a workflow. | |
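
Number of Requests per Second relates to Number of Requests as a rate relates to a running total. This reference does not state how the rate is calculated, so the following is only a minimal sketch of one common way to derive a per-second rate from two samples of a cumulative count. The function name `per_second_rate`, its parameters, and the reset handling are illustrative assumptions, not the Capella implementation.

```python
def per_second_rate(prev_total: float, curr_total: float, interval_seconds: float) -> float:
    """Estimate a per-second rate from two samples of a cumulative count.

    Hypothetical helper: assumes the count only increases between samples,
    except when it is reset to zero (for example, after a restart).
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    delta = curr_total - prev_total
    if delta < 0:
        # Assumed reset: treat the current value as the growth since the reset.
        delta = curr_total
    return delta / interval_seconds


# Example: 100 batch requests at the first sample, 160 thirty seconds later.
rate = per_second_rate(100, 160, 30)
print(rate)
```

With these sample values, 60 additional requests over 30 seconds gives a rate of 2 requests per second.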
Models
All the metrics shown in the Model monitoring dashboard.
| Graph Name | Description | Metric |
|---|---|---|
| Cache Hit Rate | Number of cache hits as a percentage of total cache requests per node. | |
| Cache Hits | Number of cache hits per node. | |
| Cache Misses | Number of cache misses per node. | |
| Cache Completion Tokens | Number of tokens generated from cache (response tokens) per node. | |
| CPU Usage | CPU utilization percentage per node. | |
| Disk Usage | Total disk space currently consumed on each node. | |
| Error Rate Trends | Number of queries that resulted in errors per second per node. | |
| Error Count | Number of queries that resulted in errors per node. | |
| Guardrail Violations | Number of guardrail violations detected per node. | |
| Processed Prompt Tokens | Number of tokens processed (prompt tokens) per node. | |
| API Requests | Total number of API requests processed per node. | |
| Token Generation Rate | Number of tokens generated (response tokens) per second per node. | |
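
Cache Hit Rate is described above as cache hits as a percentage of total cache requests. As a minimal sketch of that arithmetic, the snippet below assumes that total cache requests equal Cache Hits plus Cache Misses for a node; this reference does not confirm that assumption, and the function name `cache_hit_rate` is hypothetical.

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Return cache hits as a percentage of total cache requests.

    Assumption (not confirmed by the reference): total requests = hits + misses.
    """
    total = hits + misses
    if total == 0:
        # No cache requests yet; report 0.0 instead of dividing by zero.
        return 0.0
    return 100.0 * hits / total


# Example: a node that served 75 hits and 25 misses.
print(cache_hit_rate(75, 25))
```

Under that assumption, 75 hits out of 100 cache requests corresponds to a 75% hit rate for the node.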