Configure Couchbase Cluster Auto-scaling
Configure Couchbase clusters to automatically scale based on observed usage metrics.
Overview
The Kubernetes Operator supports Multi-Dimensional Scaling through independently-configurable server classes, which are manually scalable by default. However, the Kubernetes Operator optionally supports the automatic scaling of Couchbase clusters through an integration with the Horizontal Pod Autoscaler (HPA).
The sections on this page describe how to enable and configure auto-scaling for Couchbase clusters managed by the Kubernetes Operator. For a conceptual description of this feature, please refer to Couchbase Cluster Auto-scaling.
Preparing for Auto-scaling
Metrics play the most important role in Couchbase cluster auto-scaling. Metrics provide the means for the Horizontal Pod Autoscaler to measure cluster performance and respond accordingly when target thresholds are crossed.
Auto-scaling can be configured to use resource metrics or Couchbase metrics. Resource metrics include pod CPU and memory, whereas Couchbase metrics can be stats like bucket memory utilization and query latency. Refer to Couchbase Cluster Auto-scaling Best Practices for additional guidance on how various metrics can be used as a measure of cluster performance.
The Horizontal Pod Autoscaler can only monitor metrics through the Kubernetes API; therefore, metrics affecting the Couchbase cluster must be exposed within the Kubernetes cluster before auto-scaling can be configured. Refer to About Exposed Metrics in the concept documentation for more information about how metrics can be exposed for the purposes of auto-scaling.
Enabling Auto-scaling
Enabling auto-scaling for a particular Couchbase cluster starts with modifying the relevant CouchbaseCluster resource.
The required configuration parameters for enabling auto-scaling are described in the example below. (The Kubernetes Operator will set the default values for any fields that are not specified by the user.)
CouchbaseCluster Auto-Scaling Parameters
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  servers:
  - name: data
    size: 3
    services:
    - data
  - name: index
    autoscaleEnabled: true (1)
    size: 2
    services:
    - index
  - name: query
    autoscaleEnabled: true (2)
    size: 2
    services:
    - query
  autoscaleStabilizationPeriod: 600s (3)
(1) couchbaseclusters.spec.servers.autoscaleEnabled: Setting this field to true triggers the Kubernetes Operator to create a CouchbaseAutoscaler custom resource for the relevant server class. In this example, a CouchbaseAutoscaler resource will be created for the index server class. Refer to About the Couchbase Autoscaler for a conceptual overview of the role the CouchbaseAutoscaler custom resource plays in auto-scaling.
(2) In this example, a CouchbaseAutoscaler resource will also be created for the query server class.
(3) couchbaseclusters.spec.autoscaleStabilizationPeriod: This field defines the Couchbase Stabilization Period, which is an internal safety mechanism provided by the Kubernetes Operator that is meant to help prevent over-scaling caused by metrics instability during rebalance. The value specified in this field determines how long HorizontalPodAutoscaler resources will remain in maintenance mode after the cluster finishes rebalancing. In this example, the stabilization period has been set to 600s.
After deploying the CouchbaseCluster resource specification, the Kubernetes Operator will create a CouchbaseAutoscaler resource for each server class configuration that has couchbaseclusters.spec.servers.autoscaleEnabled set to true.
Note: Enabling auto-scaling for a particular server class configuration does not immediately subject the cluster to being auto-scaled. The CouchbaseAutoscaler resource simply acts as an endpoint for the HorizontalPodAutoscaler resource to access the pods that are selected for auto-scaling. The CouchbaseAutoscaler resource is only activated when referenced by a HorizontalPodAutoscaler resource.
Verifying Creation of CouchbaseAutoscaler Resources
The following command can be used to verify that the CouchbaseAutoscaler custom resources exist and match the size of their associated server class configurations:
$ kubectl get couchbaseautoscalers
NAME               SIZE   SERVERS
index.cb-example   2      index     (1) (2)
query.cb-example   2      query
(1) NAME: Each CouchbaseAutoscaler resource is named using the format <server-class>.<couchbase-cluster>. The name is important because it must be referenced when creating the HorizontalPodAutoscaler resource in order to link the two resources together.
(2) SIZE: This is the current number of Couchbase nodes that the Kubernetes Operator is maintaining for the index server class. The Kubernetes Operator keeps the size of a CouchbaseAutoscaler resource in sync with the size of its associated server class configuration.
Creating a HorizontalPodAutoscaler Resource
The Kubernetes Operator relies on the Kubernetes Horizontal Pod Autoscaler (HPA) to provide auto-scaling capabilities.
The Horizontal Pod Autoscaler is configured via a HorizontalPodAutoscaler resource, which is the primary interface by which auto-scaling is configured.
Unlike the CouchbaseAutoscaler custom resource created by the Kubernetes Operator, the HorizontalPodAutoscaler resource is created and managed by the user.
The following configuration is an example that scales the query server class from Enabling Auto-scaling.
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: query-hpa
spec:
  scaleTargetRef: (1)
    apiVersion: couchbase.com/v2
    kind: CouchbaseAutoscaler
    name: query.cb-example
  behavior: (2)
    scaleUp:
      policies: (3)
      - type: Pods
        value: 1
        periodSeconds: 15
      stabilizationWindowSeconds: 30 (4)
    scaleDown:
      stabilizationWindowSeconds: 300
  minReplicas: 2 (5)
  maxReplicas: 6
  metrics: (6)
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
(1) The spec.scaleTargetRef section must be configured to reference the relevant CouchbaseAutoscaler resource. Refer to Referencing the Couchbase Autoscaler in the concept documentation for more detailed information about these fields.
(2) Fine-grained scaling behavior is configured via policies specified in the spec.behavior section. Different policies can be specified for scaling up (behavior.scaleUp) and scaling down (behavior.scaleDown). If no user-supplied values are specified in behavior fields, then the default values are used.
(3) spec.behavior.[].policies: Scaling policies are made up of the following fields: type, value, and periodSeconds. The recommended settings are type: Pods and value: 1, while leaving periodSeconds with the default value. Refer to Scaling Policies and Scaling Increments in the concept documentation for more detailed information about these fields.
(4) behavior.[].stabilizationWindowSeconds: A stabilization window can be configured as a means to control undesirable scaling caused by fluctuating metrics. A minimum scaleUp stabilization window of 30 seconds is generally recommended, unless indicated otherwise in Couchbase Cluster Auto-scaling Best Practices.
(5) The spec.minReplicas and spec.maxReplicas fields set the minimum and maximum number of Couchbase nodes for the associated server class. Refer to Sizing Constraints in the concept documentation for more detailed information about these fields.
(6) The spec.metrics section must target a specific metric, along with an associated threshold for that metric. In this example, a Kubernetes resource metric (cpu) is being targeted with a threshold set to 70 percent utilization. Refer to Target Metrics and Thresholds in the concept documentation for more detailed information about targeting metrics.
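The example above targets a Kubernetes resource metric, but as described in Preparing for Auto-scaling, Couchbase metrics can also be targeted once they have been exposed through the Kubernetes API. The fragment below is a minimal sketch of what the spec.metrics section might look like in that case, assuming the metric is served as a per-pod metric by the custom metrics API. The metric name example_couchbase_metric and the threshold of 100 are hypothetical placeholders, not values taken from this page; substitute a metric that is actually exposed in your Kubernetes cluster.

```yaml
# Sketch only: an alternative spec.metrics section for query-hpa.
# "example_couchbase_metric" is a hypothetical placeholder name.
spec:
  metrics:
  - type: Pods
    pods:
      metric:
        name: example_couchbase_metric
      target:
        type: AverageValue    # the metric is averaged across the targeted pods
        averageValue: "100"   # illustrative threshold, not a recommendation
```

With a Pods metric, the Horizontal Pod Autoscaler compares the average value across the pods of the server class against averageValue, rather than comparing a percentage of the pod resource request.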
The HorizontalPodAutoscaler resource can be created like any other resource by submitting the specifications in a file using kubectl:
$ kubectl apply -f query-hpa.yaml
As soon as the HorizontalPodAutoscaler resource has been successfully created, the Horizontal Pod Autoscaler will begin to monitor the target metric and the cluster will be subject to auto-scaling if the targeted metric is above or below the configured threshold.
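The index server class from Enabling Auto-scaling also has a CouchbaseAutoscaler resource, which stays inactive until a HorizontalPodAutoscaler resource references it. The following is a minimal sketch of a second resource for that server class, assuming a memory-based resource metric is appropriate for the workload. The name index-hpa, the maxReplicas value, and the 80 percent threshold are illustrative assumptions; consult Couchbase Cluster Auto-scaling Best Practices before choosing a metric for the Index Service.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: index-hpa              # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: couchbase.com/v2
    kind: CouchbaseAutoscaler
    name: index.cb-example     # <server-class>.<couchbase-cluster>
  minReplicas: 2               # matches the configured size of the index server class
  maxReplicas: 4               # illustrative upper bound
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80 # illustrative threshold
```

Each server class is scaled independently, so each one needs its own HorizontalPodAutoscaler resource that references its own CouchbaseAutoscaler resource by name.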
Verifying HorizontalPodAutoscaler Status
When the Horizontal Pod Autoscaler begins to monitor the target metric, it will begin reporting the value of the metric along with the current versus desired size of the server class.
Run the following command to print these details to the console output:
$ kubectl describe hpa query-hpa
Metrics:                                               ( current / target )
  resource cpu on pods (as a percentage of request):   1% (50m) / 70% (1)
Min replicas:                                          2
Max replicas:                                          6
CouchbaseAutoscaler pods:                              2 current / 2 desired (2)
(1) The current observed value of the metric is displayed vs the target threshold.
(2) The current size of the server class is displayed vs the desired size currently being recommended by the Horizontal Pod Autoscaler.
Disabling Scale Down
In production environments, it may be desirable to only allow a cluster to automatically scale up while requiring manual intervention to scale down.
This can be accomplished by modifying the HorizontalPodAutoscaler resource as follows:
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: example-hpa
spec:
  behavior:
    scaleDown:
      selectPolicy: Disabled (1)
(1) spec.behavior.[].selectPolicy: This field controls which policy is chosen by the Horizontal Pod Autoscaler if more than one policy is defined. When set to Disabled, no policy is chosen, and therefore auto-scaling is disabled in that direction.
By setting spec.behavior.scaleDown.selectPolicy to Disabled, the Horizontal Pod Autoscaler will never recommend scaling down the associated server class.
Specifying this setting for all HorizontalPodAutoscaler resources associated with a particular Couchbase cluster ensures that the cluster will never automatically scale down.
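The snippet in this section shows only the field that changes. As a sketch of how it might be combined with the rest of a resource, the following manifest reuses the query-hpa configuration from Creating a HorizontalPodAutoscaler Resource and replaces its scaleDown settings with selectPolicy: Disabled. It assumes the same cb-example cluster and query server class; all other values are carried over unchanged from that example.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: query-hpa
spec:
  scaleTargetRef:
    apiVersion: couchbase.com/v2
    kind: CouchbaseAutoscaler
    name: query.cb-example
  behavior:
    scaleUp:
      policies:
      - type: Pods
        value: 1
        periodSeconds: 15
      stabilizationWindowSeconds: 30
    scaleDown:
      selectPolicy: Disabled   # no scale-down policy is selected
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
```

Scale-up behavior is unaffected by this change, so the server class can still grow up to maxReplicas when the CPU threshold is exceeded.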
Clusters that have automatic down-scaling disabled can be manually scaled down by editing the CouchbaseAutoscaler resource directly:
$ kubectl scale couchbaseautoscalers query.cb-example --replicas=2
The above command edits the scale subresource and results in the Kubernetes Operator scaling the server class named query to a size of 2.
Disabling Auto-scaling
Auto-scaling that has been enabled and configured for a Couchbase cluster can later be disabled.
The recommended method for disabling auto-scaling is to set couchbaseclusters.spec.servers.autoscaleEnabled back to false for each of the desired server classes.
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  servers:
  - name: index
    autoscaleEnabled: false (1)
    size: 2
    services:
    - index
  - name: query
    autoscaleEnabled: false
    size: 2
    services:
    - query
(1) couchbaseclusters.spec.servers.autoscaleEnabled: Setting this field to false triggers the Kubernetes Operator to delete the CouchbaseAutoscaler custom resource that had previously been created for the relevant server class. In this example, the CouchbaseAutoscaler resource associated with the index server class will be deleted by the Kubernetes Operator upon submitting the configuration.
Upon deleting the CouchbaseAutoscaler resource, the Kubernetes Operator will no longer reconcile the current size of the server class with the recommendations of the Horizontal Pod Autoscaler, and instead the value specified in couchbaseclusters.spec.servers.size will become the new source of truth.
For example, if the above configuration were to be submitted, it would result in the index and query server classes each being scaled to size: 2 from whatever size they had previously been auto-scaled to.
It’s important to note, however, that the HorizontalPodAutoscaler resource is not managed by the Kubernetes Operator, and therefore does not get deleted along with the CouchbaseAutoscaler resource.
It will continue to exist in the current namespace until it is manually deleted by the user.
Since the HorizontalPodAutoscaler resource can continue to be used if auto-scaling is subsequently re-enabled, it is important to verify the status of the HorizontalPodAutoscaler resource to ensure that it is persisted as expected.
If the desire is to only temporarily disable auto-scaling, the HorizontalPodAutoscaler resource can be left to persist until auto-scaling is eventually re-enabled.
This only works if the names of both the server class and the Couchbase cluster remain the same, because when couchbaseclusters.spec.servers.autoscaleEnabled is set back to true, the Kubernetes Operator will create a CouchbaseAutoscaler resource that is already referenced by the existing HorizontalPodAutoscaler resource.
In this case, the cluster will immediately become subject to the recommendations of the Horizontal Pod Autoscaler.
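As a sketch of the re-enabling step described above, the fragment below sets autoscaleEnabled back to true for the query server class while keeping the cluster and server class names unchanged. It shows only the relevant fields and assumes that a HorizontalPodAutoscaler resource referencing query.cb-example still exists in the namespace; it is not a complete CouchbaseCluster specification.

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example             # unchanged cluster name
spec:
  servers:
  - name: query                # unchanged server class name
    autoscaleEnabled: true     # the Operator recreates query.cb-example
    size: 2
    services:
    - query
```

Because the recreated CouchbaseAutoscaler resource has the same name as before, the existing HorizontalPodAutoscaler resource needs no changes in order to resume scaling the server class.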
| Deleting just the  | 
Related Links
- Tutorial: Auto-scaling the Couchbase Query Service
- Tutorial: Auto-scaling the Couchbase Data Service
- Tutorial: Auto-scaling the Couchbase Index Service
- Reference: CouchbaseAutoscaler Resource
- Reference: Auto-scaling Lifecycle Events