Configure Couchbase Cluster Auto-scaling
Configure Couchbase clusters to automatically scale based on observed usage metrics.
The Autonomous Operator supports Multi-Dimensional Scaling through independently-configurable server classes, which are manually scalable by default. However, the Autonomous Operator optionally supports the automatic scaling of Couchbase clusters through an integration with the Horizontal Pod Autoscaler (HPA).
The sections on this page describe how to enable and configure auto-scaling for Couchbase clusters managed by the Autonomous Operator. For a conceptual description of this feature, please refer to Couchbase Cluster Auto-scaling.
Preparing for Auto-scaling
Metrics play the most important role in Couchbase cluster auto-scaling. Metrics provide the means for the Horizontal Pod Autoscaler to measure cluster performance and respond accordingly when target thresholds are crossed.
Auto-scaling can be configured to use resource metrics or Couchbase metrics. Resource metrics include pod CPU and memory, whereas Couchbase metrics can be stats like bucket memory utilization and query latency. Refer to Couchbase Cluster Auto-scaling Best Practices for additional guidance on how various metrics can be used as a measure of cluster performance.
The Horizontal Pod Autoscaler can only monitor metrics through the Kubernetes API; therefore, metrics relevant to the Couchbase cluster must be exposed within the Kubernetes cluster before auto-scaling can be configured. Refer to About Exposed Metrics in the concept documentation for more information about how metrics can be exposed for the purposes of auto-scaling.
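As one illustration, Couchbase metrics can be exposed via a Prometheus exporter sidecar, which a metrics adapter (such as prometheus-adapter) can then surface through the Kubernetes custom metrics API. This is a hedged sketch only; the monitoring field shown below may vary by Operator version, so verify it against your Operator's reference documentation:

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  monitoring:
    prometheus:
      enabled: true   # runs the metrics exporter sidecar in each Couchbase pod (field name assumed; check your Operator version)
```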
Enabling Auto-scaling
Enabling auto-scaling for a particular Couchbase cluster starts with modifying the relevant CouchbaseCluster resource.
The required configuration parameters for enabling auto-scaling are described in the example below. (The Autonomous Operator will set the default values for any fields that are not specified by the user.)
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  servers:
  - name: data
    size: 3
    services:
    - data
  - name: index
    autoscaleEnabled: true (1)
    size: 2
    services:
    - index
  - name: query
    autoscaleEnabled: true (2)
    size: 2
    services:
    - query
  autoscaleStabilizationPeriod: 600s (3)
(1) (2) In this example, auto-scaling is enabled for the index and query server classes by setting autoscaleEnabled to true. The data server class remains manually scalable.
(3) In this example, the stabilization period has been set to 600s, giving the cluster 600 seconds to stabilize after a scaling event before further scaling recommendations are acted upon.
After deploying the CouchbaseCluster resource specification, the Autonomous Operator will create a CouchbaseAutoscaler resource for each server class configuration that has couchbaseclusters.spec.servers.autoscaleEnabled set to true.
Note that enabling auto-scaling for a particular server class configuration does not immediately subject the cluster to being auto-scaled; scaling only occurs once a HorizontalPodAutoscaler resource has been created that targets the relevant CouchbaseAutoscaler resource.
Verifying Creation of CouchbaseAutoscaler Resources
The following command can be used to verify that the CouchbaseAutoscaler custom resources exist and match the size of their associated server class configurations:
$ kubectl get couchbaseautoscalers
NAME               SIZE   SERVERS
index.cb-example   2      index    (1) (2)
query.cb-example   2      query
Creating a HorizontalPodAutoscaler Resource
The Autonomous Operator relies on the Kubernetes Horizontal Pod Autoscaler (HPA) to provide auto-scaling capabilities.
The Horizontal Pod Autoscaler is configured via a HorizontalPodAutoscaler resource, which is the primary interface by which auto-scaling is configured.
Unlike the CouchbaseAutoscaler custom resource, which is created by the Autonomous Operator, the HorizontalPodAutoscaler resource is created and managed by the user.
The following configuration is an example that scales the query server class from Enabling Auto-scaling.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: query-hpa
spec:
  scaleTargetRef: (1)
    apiVersion: couchbase.com/v2
    kind: CouchbaseAutoscaler
    name: query.cb-example
  behavior: (2)
    scaleUp:
      policies: (3)
      - type: Pods
        value: 1
        periodSeconds: 15
      stabilizationWindowSeconds: 30 (4)
    scaleDown:
      stabilizationWindowSeconds: 300
  minReplicas: 2 (5)
  maxReplicas: 6
  metrics: (6)
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
(1) The scaleTargetRef field specifies the CouchbaseAutoscaler resource that the Horizontal Pod Autoscaler will monitor and scale. Refer to Referencing the Couchbase Autoscaler in the concept documentation for more detailed information about these fields.
(2) (3) (4) Fine-grained scaling behavior is configured via policies specified in the behavior field. Refer to Scaling Policies and Scaling Increments in the concept documentation for more detailed information about these fields.
(5) The minReplicas and maxReplicas fields constrain the size of the server class. Refer to Sizing Constraints in the concept documentation for more detailed information about these fields.
(6) Refer to Target Metrics and Thresholds in the concept documentation for more detailed information about targeting metrics in the HorizontalPodAutoscaler resource.
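For comparison, a Couchbase metric that has been exposed through the Kubernetes custom metrics API would be targeted with a pods-type metric rather than a resource metric. The following is a hedged sketch: the metric name cb_memory_used_percent is hypothetical, and the names actually available depend on how metrics are exposed in your environment:

```yaml
metrics:
- type: Pods
  pods:
    metric:
      name: cb_memory_used_percent   # hypothetical metric name; depends on the metrics exposed in your cluster
    target:
      type: AverageValue             # pods-type metrics are averaged across the targeted pods
      averageValue: "70"
```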
The HorizontalPodAutoscaler resource can be created like any other Kubernetes resource by submitting the specification in a file using kubectl:
$ kubectl create -f query-hpa.yaml
As soon as the HorizontalPodAutoscaler resource has been successfully created, the Horizontal Pod Autoscaler will begin to monitor the target metric, and the cluster will be subject to auto-scaling whenever the targeted metric is above or below the configured threshold.
When the Horizontal Pod Autoscaler begins to monitor the target metric, it will begin reporting the value of the metric along with the current vs. desired size of the server class.
Run the following command to print these details to the console output:
$ kubectl describe hpa query-hpa
Metrics:                                               ( current / target )
  resource cpu on pods (as a percentage of request):   1% (50m) / 70%   (1)
Min replicas:                                          2
Max replicas:                                          6
CouchbaseAutoscaler pods:                              2 current / 2 desired   (2)
Disabling Scale Down
In production environments, it may be desirable to only allow a cluster to automatically scale up while requiring manual intervention to scale down.
This can be accomplished by modifying the HorizontalPodAutoscaler resource as follows:
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  behavior:
    scaleDown:
      selectPolicy: Disabled (1)
(1) When selectPolicy is set to Disabled, the Horizontal Pod Autoscaler will never recommend scaling down the associated server class.
Specifying this setting for all HorizontalPodAutoscaler resources associated with a particular Couchbase cluster ensures that the cluster will never automatically scale down.
Clusters that have automatic down-scaling disabled can be manually scaled down by editing the CouchbaseAutoscaler resource directly:
$ kubectl scale couchbaseautoscalers/query.cb-example --replicas=2
The above command edits the scale subresource, resulting in the Autonomous Operator scaling the server class named query to a size of 2.
Disabling Auto-scaling
Once auto-scaling has been enabled and configured for a Couchbase cluster, it can subsequently be disabled.
The recommended method for disabling auto-scaling is to set couchbaseclusters.spec.servers.autoscaleEnabled back to false for each of the desired server classes.
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  servers:
  - name: index
    autoscaleEnabled: false (1)
    size: 2
    services:
    - index
  - name: query
    autoscaleEnabled: false
    size: 2
    services:
    - query
Setting autoscaleEnabled to false causes the Autonomous Operator to delete the corresponding CouchbaseAutoscaler resource. Upon deleting the CouchbaseAutoscaler resource, the Autonomous Operator will no longer reconcile the current size of the server class with the recommendations of the Horizontal Pod Autoscaler; instead, the value specified in couchbaseclusters.spec.servers.size will become the new source of truth.
For example, if the above configuration were to be submitted, it would result in the index and query server classes each being scaled to size: 2 from whatever size they had previously been auto-scaled to.
It’s important to note, however, that the HorizontalPodAutoscaler resource is not managed by the Autonomous Operator, and therefore does not get deleted along with the CouchbaseAutoscaler resource. It will continue to exist in the current namespace until it is manually deleted by the user.
Because the existing HorizontalPodAutoscaler resource can continue to be used if auto-scaling is subsequently re-enabled, it is important to verify the status of the HorizontalPodAutoscaler resource to ensure that it persists as expected.
If the desire is to only temporarily disable auto-scaling, the HorizontalPodAutoscaler resource can be left to persist until auto-scaling is eventually re-enabled.
This only works if the names of both the server class and the Couchbase cluster remain the same, because when couchbaseclusters.spec.servers.autoscaleEnabled is set back to true, the Autonomous Operator will create a CouchbaseAutoscaler resource with the same name as the one already referenced by the existing HorizontalPodAutoscaler resource. In this case, the cluster will immediately become subject to the recommendations of the Horizontal Pod Autoscaler.
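Continuing the earlier example, re-enabling auto-scaling for the query server class is then just a matter of flipping the flag back. Because the server class and cluster names are unchanged, the re-created CouchbaseAutoscaler resource is named query.cb-example, which the existing HorizontalPodAutoscaler resource already targets:

```yaml
spec:
  servers:
  - name: query
    autoscaleEnabled: true   # re-creates the CouchbaseAutoscaler resource named query.cb-example
    size: 2
    services:
    - query
```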
Deleting just the HorizontalPodAutoscaler resource also stops automatic scaling, since the Autonomous Operator only resizes an auto-scaled server class in response to recommendations from the Horizontal Pod Autoscaler.
Tutorial: Auto-scaling the Couchbase Query Service
Tutorial: Auto-scaling the Couchbase Data Service
Tutorial: Auto-scaling the Couchbase Index Service
Reference: CouchbaseAutoscaler Resource
Reference: Auto-scaling Lifecycle Events