Alert Reference

    This reference lists the alerts that Capella can emit, the conditions in which they occur, and a description for each.

    Metric-Based Alerts

    For alerts caused by measurable changes to Capella resource use, notification messages include information about potential causes and remedial actions to investigate.

    Capella delivers notifications for these alerts to you by:

    Display Name | Resource | Conditions | Description | Related Documentation

    High CPU Usage Warning

    Database

    Critical: during a one-minute interval, the five-minute average CPU usage of one or more database nodes exceeded 90%.

    Warning: during a five-minute interval, the five-minute average CPU usage of one or more database nodes exceeded 85%.

    High CPU usage events can impact the throughput of your database. This issue could be due to recent changes in the downstream application or dataset, such as changes to the data type sent, the amount of data sent, or natural data/transaction growth. Consider scaling your service nodes to address the issue. If new queries were recently introduced, validate and add the required indexes.
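
As a rough illustration of how the two thresholds relate, the sketch below classifies a series of CPU readings. It assumes one sample per minute per node and compares only the most recent five-minute average against the 90% and 85% thresholds. The function names and the sampling model are hypothetical, and the sketch does not model the one-minute versus five-minute evaluation intervals, so it is not Capella's actual evaluation logic.

```python
from statistics import mean

CRITICAL_PCT = 90.0  # threshold from the Critical condition
WARNING_PCT = 85.0   # threshold from the Warning condition


def five_minute_average(samples):
    """Average the five most recent per-minute CPU samples (percent).

    Assumption: `samples` holds one reading per minute, oldest first.
    """
    if len(samples) < 5:
        raise ValueError("need at least five one-minute samples")
    return mean(samples[-5:])


def classify_cpu(samples):
    """Return 'critical', 'warning', or 'ok' for one node's samples."""
    avg = five_minute_average(samples)
    if avg > CRITICAL_PCT:
        return "critical"
    if avg > WARNING_PCT:
        return "warning"
    return "ok"
```

Both comparisons are strict, matching the word "exceeded" in the conditions: a five-minute average of exactly 90% is classified as a warning, not as critical.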

    Low Node Disk Storage

    Database

    Critical: disk usage is more than 90% for the last 5 minutes.

    Warning: disk usage is more than 80% for the last 5 minutes.

    This issue could be due to spikes in data usage or natural data growth. Consider expanding your database storage immediately to resolve the issue. Inaction could result in service disruption.

    Runaway Disk Queue

    Database

    Critical: the disk queue has exceeded 800,000 requests during a five-minute period.

    Warning: the disk queue has exceeded 500,000 requests during a five-minute period.

    A bucket is experiencing a runaway disk queue. This is when data is added to the write queue faster than the node can write to the bucket. This issue can be caused by a sudden spike of incoming transactions or an undersized database configuration that cannot keep up with its workload. Consider validating incoming data before scaling node capacity.
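
The arithmetic behind a runaway queue can be sketched as follows. The queue grows at the difference between the incoming write rate and the rate at which the node drains writes to disk, so the time to reach a threshold is the remaining headroom divided by that net rate. The function name and the example rates are hypothetical and chosen only for illustration.

```python
def seconds_until_threshold(incoming_per_sec, drained_per_sec, threshold, queued=0):
    """Estimate seconds until the disk queue grows past `threshold` items.

    Assumes constant rates (a simplification). Returns 0.0 if the queue is
    already over the threshold, and None if the queue is not growing.
    """
    if queued > threshold:
        return 0.0
    net_growth = incoming_per_sec - drained_per_sec
    if net_growth <= 0:
        return None  # the node keeps up, so the queue never reaches the threshold
    return (threshold - queued) / net_growth


# Hypothetical workload: 12,000 writes/s arriving, 10,000 writes/s persisted.
warning_eta = seconds_until_threshold(12_000, 10_000, 500_000)   # 250.0 seconds
critical_eta = seconds_until_threshold(12_000, 10_000, 800_000)  # 400.0 seconds
```

Under these assumed rates, a net surplus of only 2,000 writes per second fills an empty queue to the warning level in a little over four minutes, which is why a sustained spike matters more than a brief burst.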

    Bucket Hard Out of Memory

    Database

    Critical: one or more out-of-memory errors occurred in the past five minutes.

    A bucket exceeded its available memory and requires immediate attention. This issue can be caused by changes to the incoming data, undersized service nodes, undersized service quotas, or a long time to live (TTL) setting on documents. Consider immediately adding memory or nodes to resolve the issue.

    App Service High Data Sync Errors Warning

    App Services

    Warning: In a 5-minute interval, more than 10 documents were rejected by the App Endpoint’s sync function.

    These documents will not be accessible via the App Endpoint. If this is unexpected, troubleshoot the Sync Function or contact Customer Support with details of the intended use of the Sync Function to help troubleshoot the errors. This alert counts only Sync Function rejections, not Sync Function exceptions. Rejections are logged in Sync Gateway logs at info and debug levels.

    App Service High Import Errors Warning

    App Services

    Warning: In a 5-minute interval, more than 10 documents failed to import because of an error.

    These documents will not be accessible through the App Endpoint. The documents may have been rejected by the App Endpoint’s Sync Function, or an error may have occurred while processing them. The failure may be caused by an error in the Sync Function’s logic, a write error, or a timeout. This alert is not caused by a CAS failure, a canceled import, or an already imported document. Import errors are logged in the Sync Gateway at the info level.

    App Service High Access Errors Warning

    App Services

    Warning: In the last 5 minutes, more than 50 requests made to the App Endpoint failed to authenticate.

    A high volume of unsuccessful authentications may indicate malicious clients attempting to access the system. It can be caused by:

    • Failed OIDC auth

    • A failure to update an OIDC user after signing in

    • A failure to get a user due to an internal error

    • Failed basic auth

    • An invalid session, such as an expired or bad cookie

    • No login provided when guest access is disabled

    • Trying to authenticate with a disabled user
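
The warning condition for this alert (more than 50 failed authentications in the last 5 minutes) is a sliding-window count. The sketch below shows one common way to keep such a count on the client or application side, for example to correlate your own logs with the alert. The class and its method names are hypothetical, and this is not a description of how Capella computes the alert.

```python
from collections import deque

WINDOW_SECONDS = 5 * 60  # "in the last 5 minutes"
MAX_FAILURES = 50        # "more than 50 requests"


class AuthFailureWindow:
    """Count authentication failures inside a sliding time window.

    Assumption: timestamps are in seconds and arrive in non-decreasing order.
    """

    def __init__(self, window=WINDOW_SECONDS, limit=MAX_FAILURES):
        self.window = window
        self.limit = limit
        self._times = deque()

    def _evict(self, now):
        # Drop failures that are no longer inside the window ending at `now`.
        while self._times and self._times[0] <= now - self.window:
            self._times.popleft()

    def record_failure(self, timestamp):
        self._times.append(timestamp)
        self._evict(timestamp)

    def exceeded(self, now):
        """Return True when more than `limit` failures fall inside the window."""
        self._evict(now)
        return len(self._times) > self.limit
```

A deque keeps eviction cheap because expired entries are always at the left end, so each failure is appended once and removed at most once.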

    App Service High CPU Usage

    App Services

    Warning: During a 5-minute interval, the 5-minute average CPU usage of one or more App Service nodes exceeded 90%.

    Critical: During a 1-minute interval, the 5-minute average CPU usage of one or more App Service nodes exceeded 95%.

    High CPU usage events can affect the throughput and latency of App Endpoints. This issue could be due to changes in the downstream application or dataset, such as changes to the number of requests or connections, the amount of data sent, or natural data/request growth. It may also be related to changes to the endpoint’s access control function or dataset. If these changes are expected, you may consider scaling your App Service deployment to address the issue.

    App Service High Memory Usage

    App Services

    Warning: During a 5-minute interval, the 5-minute average memory usage of one or more app services exceeded 85%.

    Critical: During a 1-minute interval, the 5-minute average memory usage of one or more app services exceeded 90%.

    High memory usage events can impact the throughput of your service. This issue could be due to recent changes in the downstream application or dataset, such as changes to the data type sent, the amount of data sent, or natural data/transaction growth. Consider scaling your service nodes to address the issue.

    Operational Alerts

    When an alert results from operational disruption, Capella proactively notifies Couchbase Support.

    Capella delivers notifications for these alerts to you by:

    Display Name | Resource | Conditions | Description | Related Documentation

    Backup Failed

    Database

    Warning: a database backup did not complete.

    A backup for a bucket in this database failed to complete. Retry the backup. If you continue to experience issues, please contact Capella Support.

    Backup Deletion Failed

    Database

    Warning: a database backup deletion did not complete.

    A backup for a bucket in this database has failed to delete. Retry the backup deletion. If you continue to experience issues, please contact Capella Support.

    Backup Create Download Failed

    Database

    Warning: the creation of a downloadable backup file has failed.

    The process to create a downloadable backup file from a backup cycle has failed for a bucket in this database. Please retry the operation. If you continue to experience issues, please contact Couchbase Capella Support.

    Restore Failed

    Database

    Warning: a bucket restoration operation did not complete.

    The process to restore data from a backup has failed for a bucket in this database. Some contents from the backup may have been successfully restored. Please retry the operation. If you continue to experience issues, please contact Couchbase Capella Support.

    Database Deployment Failed

    Database

    Warning: a database deployment did not complete.

    A database failed to deploy. Databases that fail deployment cannot guarantee service functionality. This issue could be due to an underlying service or limit issue. Retry the deployment or contact Capella Support for assistance.

    Database Peering Failed

    Database

    Warning: a database peering operation did not complete.

    The process to peer a database has failed. Please contact Couchbase Capella Support for assistance.

    App Services Cert Expiration

    App Services

    Warning: The Public CA-signed certificate for one or more App Services will soon expire.

    The certs will automatically be updated on expiration. If you have pinned the cert within your application, you should plan to upgrade your application with the new certificate to avoid an outage. Contact support if you would prefer to upgrade your App Service certificate ahead of the expiration date. Otherwise, no action is required from you.
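
If your application pins the certificate, it can help to know how many days remain before the currently served certificate expires. The sketch below uses only the Python standard library. It reads the certificate's `notAfter` field from a TLS connection and converts it to a number of days. The host name you pass in is your own endpoint address, and both function names are hypothetical helpers, not part of any Capella tooling.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host, port=443, timeout=10):
    """Return the 'notAfter' string of the certificate served by host:port."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after, now=None):
    """Convert a 'notAfter' string such as 'Jan  5 09:34:43 2030 GMT'
    into the number of days remaining (negative if already expired)."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400
```

For example, `days_until_expiry(fetch_not_after(host))` with your App Endpoint's host name gives the remaining lifetime of the served certificate, which you could compare against your own release lead time for shipping an updated pin.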

    App Services Cert Expired

    App Services

    Warning: The Public CA-signed certificate has been updated for one or more App Services.

    If you have pinned the certificate within your application, please download the updated certificate and update your application. Otherwise, no action is required.