Monitoring Reference
This reference lists the metric graphs displayed in the Capella UI Monitoring dashboards.
In the Capella UI, the Monitoring dashboards display a set of metric graphs, enabling users to monitor system performance in real time.
For more information about Capella’s monitoring dashboards, see View Monitoring Dashboards. For more information about App Service’s monitoring dashboards, see Monitor through the UI.
This monitoring reference lists:
- The Graph Name as displayed in the Capella UI.
- A Description of what the metric graph shows.
- The Metric calculation method for the graph. For more information about the metrics used, see Metrics Reference.
The monitoring dashboards show the following metrics:
| Graph Name | Description | Metric |
|---|---|---|
| App Endpoint: Total Auth Failures | Total Auth Failures is the total number of authentication failures per app endpoint. | |
| App Endpoint: Total Auth Successes | Total Auth Successes is the total number of successful authentications per app endpoint. | |
| App Endpoint: Total Requested Deltas | Total Requested Deltas is the total number of deltas requested per app endpoint. | |
| App Endpoint: Total Deltas Sent | Total Deltas Sent is the total number of deltas sent per app endpoint. | |
| App Endpoint: Total Documents Imported | Total Documents Imported is the total number of documents imported per app endpoint. | |
| App Endpoint: Total Documents Read | Total Documents Read is the total number of documents read per app endpoint. | |
| App Endpoint: Total Documents Rejected | Total Documents Rejected is the total number of documents rejected per app endpoint. | |
| App Endpoint: Total Documents Written | Total Documents Written is the total number of documents written per app endpoint. | |
| App Endpoint: Active Pull Only Replications | Active Pull Only Replications is the total number of active pull-only replication operations per app endpoint. | |
| App Service: Bytes Received by Node | Total bytes received on the primary network interface by node. | |
| App Service: Bytes Sent by Node | Total bytes sent on the primary network interface by node. | |
| App Service: CPU Utilization by Node | CPU utilization percentage of the Sync Gateway process by node. | |
| App Service: Memory Utilization by Node | Memory utilization percentage of the Sync Gateway node. | |
| Data: Current Active Items by Bucket | Number of current active items by bucket. | |
| Data: Disk Reads per Second by Bucket | Average disk reads per second by bucket. | |
| Data: Disk Used by Bucket | Total disk space consumed by each bucket. | |
| Data: Total Disk Write Queue Size by Bucket | Total disk write queue size by bucket. | |
| Data: GSI Items Remaining to Index | Count of items remaining to be indexed by Global Secondary Indexes (GSI). | |
| Data: Bucket GET Ops per Second | Bucket GET Ops per Second is the average number of GET operations per second over the last 5 minutes by bucket. | |
| Data: Bucket Ops per Second | Bucket Ops per Second is the average number of operations per second over the last 5 minutes by bucket. | |
| Data: Bucket SET Ops per Second | Bucket SET Ops per Second is the average number of SET operations per second over the last 5 minutes by bucket. | |
| Data: Quota Memory Used Percent By Bucket/Node | Percentage of the memory quota used, by bucket and node. | |
| Data: Memory Used by Bucket | Memory usage per bucket. | |
| Data: Out Of Memory (OOM) Errors by Bucket | Count of Out of Memory (OOM) errors by bucket. | |
| Data: Temporary Out Of Memory Errors by Bucket | Count of temporary Out of Memory errors by bucket. | |
| Data: vBuckets (Active) | Count of active vBuckets used to distribute data across nodes. | |
| Data: Active Item Resident Ratio by Bucket | Ratio of unique items in memory compared to on disk, per bucket. | |
| Data: vBuckets (Replica) | Count of replica vBuckets used to distribute data across nodes. | |
| Data: Replica Item Resident Ratio by Bucket | Ratio of replica items in memory compared to on disk, per bucket. | |
| Analytics: Storage Used | The total size of remote S3 storage used. | |
| Analytics: Total Requests per Second | Total number of received Analytics requests over the last 5 minutes for the entire cluster. | |
| Data: Connections | Count of connections to the cluster. | |
| Data: Cluster GET Ops per Second | Cluster GET Ops per Second is the average total number of GET operations per second over the last 5 minutes across all buckets. | |
| Data: Cluster Ops per Second | Cluster Ops per Second is the average total number of operations per second over the last 5 minutes across all buckets. | |
| Data: Cluster SET Ops per Second | Cluster SET Ops per Second is the average total number of SET operations per second over the last 5 minutes across all buckets. | |
| Data: Cluster Total Memory Used | Total memory used by the cluster. | |
| Query: Total Requests per Second | The per-second rate of SQL++ query requests over the last 5 minutes for the entire cluster. | |
| Data: Data Encryption Status by Node | Status of encryption-at-rest functionality, for monitoring encryption coverage and operational health. | |
| Data: Data Encryption Keys In Use by Node | Count of active Data Encryption Keys currently in use, providing insight into encryption key distribution and resource consumption. | |
| Columnar: Connections | Count of connections to the cluster. | |
| Columnar: CPU Utilization by Node | Real-time CPU utilization percentage per Columnar node. | |
| Node: Disk Read IOPS by Node | Disk read input/output operations per second (IOPS) for each node, providing insight into storage read activity patterns. | |
| Node: Disk Total IOPS by Node | Combined disk input/output operations per second (read + write), providing a comprehensive view of total storage I/O activity. | |
| Node: Disk Write IOPS by Node | Disk write input/output operations per second (IOPS) for each node, measuring storage write activity intensity. | |
| Node: Disk Read Throughput by Node | Rate of data read from disk storage by each node, measured in bytes per second, for monitoring bandwidth utilization. | |
| Node: Disk Total Throughput by Node | Combined disk throughput (read + write), providing a comprehensive view of total storage bandwidth utilization. | |
| Node: Disk Write Throughput by Node | Rate of data written to disk storage by each node, including document persistence and compaction operations. | |
| Query: Query Request Time Average in Milliseconds | Total end-to-end time to process all queries. | |
| Query: Query Execution Time Average in Milliseconds | Time to execute all queries. | |
| Data: Active Items by Node | Number of active items (documents) stored on each node. | |
| Analytics: Parse Failure Rate by Link/Bucket | The per-second rate of record parsing failures from linked items, averaged over the last 5 minutes by link and bucket. | |
| Analytics: Total Parse Failures by Link/Bucket | The total number of record parsing failures from linked items, averaged over the last 5 minutes by link and bucket. | |
| Analytics: Ingested Bytes Rate by Link/Bucket | The per-second rate of incoming bytes ingested by Analytics, averaged over the last 5 minutes by link and bucket. | |
| Analytics: Total Ingested Bytes by Link/Bucket | The total incoming bytes ingested by Analytics by link and bucket. | |
| Analytics: Linked Ops Rate by Link/Bucket | The per-second rate of linked record operations processed by Analytics, averaged over the last 5 minutes by link and bucket. | |
| Analytics: Total Linked Ops by Link/Bucket | The total number of linked record operations processed by Analytics by link and bucket. | |
| Analytics: Read Rate by Node | The per-second rate at which disk bytes are read for the Analytics Service, averaged over the last 5 minutes by node. | |
| Analytics: Write Rate by Node | The per-second rate at which disk bytes are written for the Analytics Service, averaged over the last 5 minutes by node. | |
| Analytics: Total Requests per Second by Node | Total number of received Analytics requests over the last 5 minutes by node. | |
| Analytics: System Load per Node | The Analytics system load average for the last minute by node. | |
| Data: Data Encryption Key Fetch Failures by Node | Percentage of failed attempts to retrieve Data Encryption Keys from the key management service due to network or authentication issues. | |
| Data: Data Encryption Key Fetch Frequency by Node | How frequently Data Encryption Keys are retrieved from the key management service, reflecting encryption activity levels. | |
| Data: Data Encryption Key Rotation Failures by Node | Encryption key rotation failures per node, which can pose security risks and compliance violations. | |
| Data: Data Encryption Key Rotation Frequency by Node | How frequently encryption keys are rotated per node, providing insight into security policy compliance and key management activity. | |
| Data: Data Encryption Service Failures by Node | Failures within the data encryption service infrastructure that can impact data protection capabilities and compliance. | |
| Node: CPU Utilization by Node | Real-time CPU utilization percentage per node. | |
| Node: Disk Used Percent by Node | The used percentage of each node's disk space. | |
| Data: Disk Used Bytes by Node | Total disk space currently consumed on each node. | |
| Index: Average Item Size | Average size of the indexed keys by node. | |
| Index: Cache Hits per Second | The per-second rate of cache hits, averaged over the last 5 minutes by node. | |
| Index: Cache Misses per Second | The per-second rate of cache misses, averaged over the last 5 minutes by node. | |
| Index: Indexer Codebook Memory Usage by Node | Memory usage of the vector index codebook by node. | |
| Index: Indexer Codebook Train Duration By Node | Training duration of the vector index codebook by node, in seconds. | |
| Index: Process CPU (System) Usage by Node | The system-space process CPU utilization of the Index service by node. | |
| Index: Process CPU Total Usage by Node | The total (user and system) process CPU utilization of the Index service by node. | |
| Index: Process CPU (User) Usage by Node | The user-space process CPU utilization of the Index service by node. | |
| Index: Indexable Data Size | The size of indexable data that is maintained by the indexer by node. | |
| Index: Disk Size | Total disk file size consumed by all indexes by node. | |
| Index: Documents Indexed per Second | The per-second rate of documents indexed, averaged over the last 5 minutes by node. | |
| Index: Documents Pending | Number of documents pending to be indexed by node. | |
| Index: Documents Queued | Number of documents queued to be indexed by node. | |
| Index: Item Count | The number of items currently indexed by node. | |
| Index: Total Process Memory Usage by Node | The total process memory usage of the Index service by node. | |
| Index: Diverging Replica Indexes | Number of diverging replica indexes. | |
| Index: Requests per Second | The per-second rate of requests to the indexer, averaged over the last 5 minutes by node. | |
| Index: Resident Percent | Percentage of the data held in memory by the indexer by node. | |
| Data: Process CPU (System) Usage by Node | The system-space process CPU utilization of the Data service by node. | |
| Data: Process CPU Total Usage by Node | The total (user and system) process CPU utilization of the Data service by node. | |
| Data: Process CPU (User) Usage by Node | The user-space process CPU utilization of the Data service by node. | |
| Data: Total Process Memory Usage by Node | The total process memory usage of the Data service by node. | |
| Node: Memory Used by Node | Memory usage per node. | |
| Query: Process CPU (System) Usage by Node | The system-space process CPU utilization of the Query service by node. | |
| Query: Process CPU Total Usage by Node | The total (user and system) process CPU utilization of the Query service by node. | |
| Query: Process CPU (User) Usage by Node | The user-space process CPU utilization of the Query service by node. | |
| Query: Total Process Memory Usage by Node | The total process memory usage of the Query service by node. | |
| Query: Requests per Second by Node | The per-second rate of SQL++ query requests over the last 5 minutes by node. | |
| Data: Replica Items by Node | Number of replica items (documents) stored on each node. | |
| Data: SDK Data Service Mutation Durable Duration Count by Node | Count of durable Key-Value mutations, which include full persistence and replication guarantees for data safety. | |
| Data: SDK Data Service Mutation Durable Duration Sum by Node | Total time spent on durable Key-Value mutations, including persistence and replication overhead. | |
| Data: SDK Data Service Mutation NonDurable Duration Count by Node | Count of non-durable Key-Value mutations, which prioritize performance by completing once the data reaches memory. | |
| Data: SDK Data Service Mutation NonDurable Duration Sum by Node | Total time spent on non-durable Key-Value mutations, which prioritize performance over durability guarantees. | |
| Data: SDK Data Service Cancelled Requests by Node | Key-Value operations cancelled before completion due to client-side request cancellation or application shutdown. | |
| Data: SDK KV Service Timed Out Requests by Node | Key-Value operations that exceeded their timeout threshold before completion due to network latency or server overload. | |
| Data: SDK Data Service Total Requests by Node | Count of all Key-Value operations, including GET, SET, DELETE, and other CRUD operations. | |
| Data: SDK Data Service Retrieval Duration Count by Node | Count of completed Key-Value retrieval operations, providing data for throughput and performance analysis. | |
| Data: SDK Data Service Retrieval Duration Sum by Node | Total time spent on all Key-Value retrieval operations, enabling calculation of average retrieval times. | |
| Query: SDK Query Service Duration Count by Node | Count of completed SQL++ query operations, providing data for calculating average query durations. | |
| Query: SDK Query Service Duration Sum by Node | Total time spent executing all SQL++ queries, enabling calculation of average query durations. | |
| Query: SDK Query Service Cancelled Requests by Node | SQL++ query operations cancelled before completion due to client-side cancellation or application timeouts. | |
| Query: SDK Query Service Timed Out Requests by Node | SQL++ query operations that exceeded their timeout threshold due to complexity, resource constraints, or network issues. | |
| Query: SDK Query Service Total Requests by Node | Total number of SQL++ query operations executed through the SDK, encompassing all query types. | |
| Search: SDK Search Service Duration Count by Node | Count of completed Full Text Search operations, providing data for calculating average search durations. | |
| Search: SDK Search Service Duration Sum by Node | Total time spent executing all Full Text Search operations, enabling calculation of average search durations. | |
| Search: SDK Search Service Cancelled Requests by Node | Full Text Search operations cancelled before completion due to client-side request cancellation or timeout handling. | |
| Search: SDK Search Service Timed Out Requests by Node | Full Text Search operations that exceeded their timeout threshold due to complex queries or resource constraints. | |
| Search: SDK Search Service Total Requests by Node | Total number of Full Text Search operations performed through the SDK, including all search query types. | |
| XDCR: Process CPU (System) Usage by Node | The system-space process CPU utilization of the XDCR service by node. | |
| XDCR: Process CPU Total Usage by Node | The total (user and system) process CPU utilization of the XDCR service by node. | |
| XDCR: Process CPU (User) Usage by Node | The user-space process CPU utilization of the XDCR service by node. | |
| XDCR: Total Process Memory Usage by Node | The total process memory usage of the XDCR service by node. | |
| Data: SDK Data Service Mutation Durable Duration by Upper Bound of Bucket(le) | Histogram of durable Key-Value mutation durations that ensure data persistence and replication before completion. | |
| Data: SDK Data Service Mutation NonDurable Duration by Upper Bound of Bucket(le) | Histogram of non-durable Key-Value mutation durations that prioritize performance by completing when data reaches memory. | |
| Data: SDK Data Service Retrieval Duration by Upper Bound of Bucket(le) | Histogram of Key-Value retrieval operation durations organized by latency buckets for performance analysis. | |
| Query: SDK Query Service Duration by Upper Bound of Bucket(le) | Histogram of SQL++ query operation durations organized by latency buckets for performance analysis. | |
| Search: SDK Search Service Duration by Upper Bound of Bucket(le) | Histogram of Full Text Search operation durations organized by latency buckets for performance analysis. | |
| Data: Compute Units by Bucket | Number of compute units by bucket. | |
| Data: Current Active Items by Bucket | Number of current active items by bucket. | |
| Data: Disk Used by Bucket | Total disk storage consumed by each serverless bucket. | |
| Data: Bucket GET Ops per Second | Bucket GET Ops per Second is the average number of GET operations per second over the last 5 minutes by bucket. | |
| Data: Bucket SET Ops per Second | Bucket SET Ops per Second is the average number of SET operations per second over the last 5 minutes by bucket. | |
| Data: Read Units by Bucket | Number of units read by bucket. | |
| Data: Write Units by Bucket | Number of units written by bucket. | |
| XDCR: xdcr_resp_wait_time_seconds (avg) | The rolling average time from when a MemcachedRequest is created and ready to route to an outnozzle until the response is received from the target node after a successful write. | |
| XDCR: xdcr_resp_wait_time_seconds (max) | The rolling average time from when a MemcachedRequest is created and ready to route to an outnozzle until the response is received from the target node after a successful write. | |
| XDCR: xdcr_wtavg_docs_latency_seconds | The rolling average time it takes for the source cluster to receive the acknowledgement of a SET_WITH_META response after the Memcached request has been composed for processing by the XDCR Target Nozzle. | |
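
Several descriptions in the table rely on the same arithmetic: a per-second rate averaged over the last 5 minutes, an average duration derived from a Duration Sum and a Duration Count graph, and a latency estimate read from a histogram keyed by bucket upper bound (`le`). The following Python sketch illustrates that arithmetic only. It is not the calculation Capella performs: the function names, the sample layout, and the assumption that histogram buckets are cumulative (as the `le` label suggests) are all hypothetical choices made for this example.

```python
import math
from typing import Sequence, Tuple


def per_second_rate(samples: Sequence[Tuple[float, float]]) -> float:
    """Average per-second increase of a counter across a time window.

    `samples` holds (timestamp_seconds, counter_value) pairs ordered by time,
    for example the readings collected during the last 5 minutes.
    Assumes the counter did not reset inside the window.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, v0), (t1, v1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("timestamps must increase")
    return (v1 - v0) / (t1 - t0)


def average_duration(duration_sum: float, duration_count: float) -> float:
    """Mean duration per operation from a Duration Sum and Duration Count pair."""
    if duration_count == 0:
        return 0.0  # no completed operations, so there is nothing to average
    return duration_sum / duration_count


def histogram_quantile(q: float, buckets: Sequence[Tuple[float, float]]) -> float:
    """Estimate the q-quantile from cumulative (upper_bound, count) buckets.

    Assumes each count includes every observation less than or equal to its
    upper bound, that buckets are sorted by upper bound, and that values are
    spread evenly inside a bucket (linear interpolation).
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be between 0 and 1")
    if not buckets:
        raise ValueError("need at least one bucket")
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    prev_bound, prev_count = 0.0, 0.0
    for bound, count in buckets:
        if count >= rank:
            if math.isinf(bound):
                # The open-ended bucket has no upper edge to interpolate towards.
                return prev_bound
            in_bucket = count - prev_count
            if in_bucket == 0:
                return bound
            return prev_bound + (bound - prev_bound) * (rank - prev_count) / in_bucket
        prev_bound, prev_count = bound, count
    return prev_bound
```

For example, a counter that rises from 100 to 1600 over 300 seconds gives `per_second_rate([(0, 100), (300, 1600)])` equal to 5 operations per second, and a Duration Sum of 12.5 seconds with a Duration Count of 50 gives an average of 0.25 seconds per operation. The quantile estimate is only as precise as the bucket boundaries allow, so treat it as an approximation.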