Couchbase Cluster Autoscaling
The Autonomous Operator is capable of autoscaling a cluster based on observable metrics. Couchbase autoscaling works in conjunction with the Kubernetes HorizontalPodAutoscaler. You will need to be aware of how metrics are monitored by the HorizontalPodAutoscaler and applied to the CouchbaseAutoscaler to ensure stable operation.
Autoscaling is only supported by Couchbase clusters with stateless server configurations. A server configuration is considered stateless when the following conditions are met:
All buckets are defined as ephemeral buckets.
At least one group of servers is configured to run only the query service.
The following defines a set of query services that can be autoscaled:

```yaml
---
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  servers:
  - size: 2
    name: data
    services:
    - data
    - index
  - size: 3
    name: query
    services:
    - query (1)
    autoscaleEnabled: true (2)
```
|1||Configurations with only the stateless query service can be autoscaled.|
|2||Setting autoscaleEnabled to true enables autoscaling for this server configuration.|
All buckets associated with the cluster must be defined as ephemeral buckets.
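For reference, an ephemeral bucket can be declared with the CouchbaseEphemeralBucket custom resource. This is a minimal sketch, assuming default bucket settings; the bucket name is illustrative:

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseEphemeralBucket
metadata:
  name: default
```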
Autoscaling can be enabled in a preview mode for services other than query, along with buckets other than ephemeral. Set enablePreviewScaling to true within the CouchbaseCluster resource to enable preview scaling mode.
```yaml
---
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  enablePreviewScaling: true (1)
  servers:
  - size: 3
    name: data
    services:
    - data
    - index
    autoscaleEnabled: true (2)
```
|1||Enabling preview scaling mode to allow autoscaling of stateful services.|
|2||Autoscaling is now allowed for the stateful data and index services.|
Enabling preview autoscaling is unsupported and for experimental purposes only. While preview autoscaling is powerful, it should be used with caution, as it may increase resource usage at exactly the time Couchbase is under pressure for resources, impacting service levels. Furthermore, in order to maintain service levels, Couchbase cannot be stopped while rebalance operations are being performed. Therefore, when enabling preview autoscaling, careful consideration should be given to the way an observed metric behaves during cluster topology changes.
The Autonomous Operator will create a CouchbaseAutoscaler resource for each server configuration with autoscaleEnabled set to true. Once created, the Operator keeps the size of the CouchbaseAutoscaler resource in sync with the size of its associated server configuration.
The size of a CouchbaseAutoscaler resource is then adjusted by a Kubernetes Horizontal Pod Autoscaler when target metric values are reached. Couchbase is then autoscaled as changes to a HorizontalPodAutoscaler are propagated through the associated CouchbaseAutoscaler resource.
The Operator names each CouchbaseAutoscaler resource based on the name of its associated server configuration along with the name of the Couchbase cluster. For example, a server configuration named query in a cluster named cb-example will result in a CouchbaseAutoscaler resource named query.cb-example.
As with all custom resources, CouchbaseAutoscaler resources can be listed with kubectl:
$ kubectl get couchbaseautoscalers
CouchbaseAutoscaler implements the Kubernetes scale subresource, which allows the CouchbaseAutoscaler to be used as a target reference for a HorizontalPodAutoscaler. The /scale subresource allows CouchbaseAutoscaler resources to perform resizing operations similar to those of native Kubernetes Deployments. Therefore, manual scaling is also possible using kubectl scale:
$ kubectl scale couchbaseautoscalers/query.cb-example --replicas=6
The above command scales the server configuration named query to a size of 6.
The Autonomous Operator monitors the value of CouchbaseAutoscaler.spec.size and applies the value as spec.servers.size to the associated server configuration.
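For example, the CouchbaseAutoscaler resource for the query configuration shown earlier might look like the following sketch (only the size field is shown):

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseAutoscaler
metadata:
  name: query.cb-example
spec:
  size: 3
```

When a Horizontal Pod Autoscaler (or kubectl scale) changes spec.size, the Operator applies the new value to the query server configuration of the cb-example cluster.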
Couchbase autoscaling can be performed based on target values of observed metrics. Metrics can be collected from either the resource metrics API or the custom metrics API.
The resource metrics API exposes resource metrics, such as CPU and memory values, from Pods and Nodes. The resource metrics are provided by the metrics-server, which may need to be installed as a cluster add-on.
Run the following command to verify that your cluster is capable of performing autoscaling based on resource metrics:
$ kubectl get --raw /apis/metrics.k8s.io/v1beta1
The response should contain an APIResourceList with the types of resources that can be fetched.
If you receive a NotFound error, then you will need to install the metrics-server if you plan on performing autoscaling based on values from the resource metrics API.
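Once resource metrics are available, they can be targeted with the standard Resource metric type of the HorizontalPodAutoscaler. The utilization target in this sketch is illustrative, not a recommendation:

```yaml
metrics:
- type: Resource
  resource:
    name: cpu
    target:
      type: Utilization
      averageUtilization: 70
```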
Couchbase metrics can be exposed through the custom metrics API. The custom metrics API exposes third-party metrics to the Kubernetes API server. For Couchbase autoscaling, this means that Couchbase metrics related to memory quota and query latency can be used as targets to determine when autoscaling should occur. Run the following command to verify that your cluster is capable of performing autoscaling based on custom metrics:
$ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
The Couchbase cluster needs to be running with Prometheus metrics export enabled in order for Couchbase metrics to be available through the custom metrics API.
If you receive a NotFound error, then you will need to install a custom metrics service.
The recommended custom metrics service to use with the Operator is the Prometheus Adapter, since Couchbase metrics are exported in Prometheus format.
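The Prometheus Adapter decides which metrics to expose through its rules configuration. A minimal sketch of a rule exposing the cbquery_requests_1000ms metric might look like the following; the averaging window and label names are assumptions that must match your exporter's output:

```yaml
rules:
- seriesQuery: 'cbquery_requests_1000ms'
  resources:
    overrides:
      namespace: {resource: "namespace"}
      pod: {resource: "pod"}
  metricsQuery: 'avg_over_time(<<.Series>>{<<.LabelMatchers>>}[2m])'
```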
The Horizontal Pod Autoscaler is a central component of autoscaling, as it observes target metrics and sends sizing requests to the CouchbaseAutoscaler.
As mentioned previously, sizing requests from the Horizontal Pod Autoscaler propagate to Couchbase Server configurations.
To connect a HorizontalPodAutoscaler resource to a CouchbaseAutoscaler resource, the scaleTargetRef value must be set within the HorizontalPodAutoscaler spec:
```yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
  name: query-hpa
spec:
  scaleTargetRef:
    apiVersion: couchbase.com/v2
    kind: CouchbaseAutoscaler (1)
    name: query.cb-example (2)
```
|1||The target ref kind is set to CouchbaseAutoscaler.|
|2||The name of the CouchbaseAutoscaler resource created by the Operator for the query server configuration of the cb-example cluster.|
The Horizontal Pod Autoscaler is capable of targeting any metric exposed to the Kubernetes API.
As discussed previously, these metrics can originate from the resource metrics API or the custom metrics API.
When using custom metrics for performing autoscaling based on Couchbase Server metrics, the discovery of available metrics can be performed through Prometheus queries and is beyond the scope of this document.
An alternative method of discovery is to check the couchbase exporter repository for the names of the metrics being exported (remember to include the cb prefix in the name of the metric).
The following example shows how to define target values around the cbquery_requests_1000ms metric as an objective for performing autoscale operations:

```yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
  name: query-hpa
spec:
  metrics:
  - type: Pods
    pods:
      metric:
        name: cbquery_requests_1000ms (1)
      target:
        type: AverageValue (2)
        averageValue: 7 (3)
```
|1||Targeting the Couchbase metric for the number of requests which exceed 1000ms.|
|2||Using AverageValue to average the metric value across all Pods in the server configuration.|
|3||Setting 7 queries at 1000ms as an operational baseline.|
In the above example, the autoscaler will take action when queries with a latency over 1000ms exceed 7 (per second). Likewise, if the number of such queries falls below this value, then the autoscaler will consider scaling down to reduce overhead. Details about how sizing decisions are made are discussed in the following section.
The Horizontal Pod Autoscaler applies constraints to the sizing values that are allowed to be propagated to the CouchbaseAutoscaler. Setting minReplicas and maxReplicas within the HorizontalPodAutoscalerSpec sets lower and upper boundaries for autoscaling.
```yaml
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
  name: query-hpa
spec:
  ...
  minReplicas: 1
  maxReplicas: 6
```
When the Horizontal Pod Autoscaler detects that a metric is above or below a target value, the requested number of replicas will never fall outside the configured minimum and maximum boundaries. Refer to the algorithm details of the Horizontal Pod Autoscaler for information about how scaling decisions are determined.
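As a rough sketch of the documented Horizontal Pod Autoscaler algorithm, the desired replica count is the current count scaled by the ratio of the observed metric to its target, clamped to the configured boundaries. This is simplified; the real controller also applies tolerances and stabilization windows:

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas):
    """Simplified HPA sizing: scale by the metric ratio, then clamp
    to the minReplicas/maxReplicas boundaries."""
    desired = math.ceil(current_replicas * (current_value / target_value))
    return max(min_replicas, min(desired, max_replicas))

# 3 query pods averaging 14 slow requests against a target of 7
# doubles the requested size to 6 replicas (within the 1..6 boundary).
print(desired_replicas(3, 14, 7, 1, 6))  # 6
```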
Additional control over autoscaling decisions is available in Kubernetes v1.18 and later through the behavior field of the HorizontalPodAutoscalerSpec. There are several supported policies, such as stabilization windows, disabling of scaling up or down, and rate-of-change control.
For example, to prevent scaling down:

```yaml
behavior:
  scaleDown:
    selectPolicy: Disabled
```
Refer to Kubernetes documentation for additional examples and information related to scaling policies.
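For instance, a behavior sketch that smooths scale-up by waiting out short metric spikes and limiting the rate of change; the window and policy values here are illustrative, not recommendations:

```yaml
behavior:
  scaleUp:
    stabilizationWindowSeconds: 120
    policies:
    - type: Pods
      value: 2
      periodSeconds: 60
```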
The following is a complete example of a HorizontalPodAutoscaler which incorporates all of the ideas of the previous sections. You will need to edit this according to the name of your server configuration and cluster.
```yaml
---
kind: HorizontalPodAutoscaler
apiVersion: autoscaling/v2beta2
metadata:
  name: query-hpa
spec:
  scaleTargetRef:
    apiVersion: couchbase.com/v2
    kind: CouchbaseAutoscaler
    name: query.cb-example
  # autoscale between 1 and 6 replicas
  minReplicas: 1
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metric:
        name: cbquery_requests_1000ms
      target:
        type: AverageValue
        averageValue: 7000m
```
The Kubernetes Cluster Autoscaler provides a means to autoscale the underlying Kubernetes nodes. Cluster autoscaling is highly recommended for production deployments, as it adds an additional dimension of scalability when adding and removing Pods, since the underlying physical hardware is scaled alongside the Couchbase cluster. Also, since production deployments tend to schedule Pods with specific resource limits and requests to meet performance expectations, cluster autoscaling allows for one-to-one matching of Pods to Nodes without concern that a Node is sharing resources among several Pods.
Several cloud offerings, such as EKS and AKS, provide cluster autoscaling with their Kubernetes offerings. Couchbase autoscaling will work with cluster autoscaling without any additional configuration. As the Horizontal Pod Autoscaler requests additional Couchbase Pods, resource pressure will be applied to (or removed from) Kubernetes, and Kubernetes will automatically add (or remove) the required number of physical nodes.
Refer to the following documentation for additional information about Cloud Providers which offer Cluster Autoscaling and how to configure it for your environment.