Configure Couchbase Cluster Auto-scaling

      Configure Couchbase clusters to automatically scale based on observed usage metrics.

      Overview

      The Autonomous Operator supports Multi-Dimensional Scaling through independently-configurable server classes, which are manually scalable by default. However, the Autonomous Operator optionally supports the automatic scaling of Couchbase clusters through an integration with the Horizontal Pod Autoscaler (HPA).

      The sections on this page describe how to enable and configure auto-scaling for Couchbase clusters managed by the Autonomous Operator. For a conceptual description of this feature, please refer to Couchbase Cluster Auto-scaling.

      Preparing for Auto-scaling

      Metrics play the most important role in Couchbase cluster auto-scaling. Metrics provide the means for the Horizontal Pod Autoscaler to measure cluster performance and respond accordingly when target thresholds are crossed.

      Auto-scaling can be configured to use resource metrics or Couchbase metrics. Resource metrics include pod CPU and memory, whereas Couchbase metrics can be stats like bucket memory utilization and query latency. Refer to Couchbase Cluster Auto-scaling Best Practices for additional guidance on how various metrics can be used as a measure of cluster performance.

      The Horizontal Pod Autoscaler can only monitor metrics through the Kubernetes API. Metrics affecting the Couchbase cluster must therefore be exposed within the Kubernetes cluster before auto-scaling can be configured. Refer to About Exposed Metrics in the concept documentation for more information about how metrics can be exposed for the purposes of auto-scaling.

      Enabling Auto-scaling

      Enabling auto-scaling for a particular Couchbase cluster starts with modifying the relevant CouchbaseCluster resource.

      The required configuration parameters for enabling auto-scaling are described in the example below. (The Autonomous Operator will set the default values for any fields that are not specified by the user.)

      Basic CouchbaseCluster Auto-Scaling Parameters
      apiVersion: couchbase.com/v2
      kind: CouchbaseCluster
      metadata:
        name: cb-example
      spec:
        servers:
        - name: data
          size: 3
          services:
          - data
        - name: index
          autoscaleEnabled: true (1)
          size: 2
          services:
          - index
        - name: query
          autoscaleEnabled: true (2)
          size: 2
          services:
          - query
        autoscaleStabilizationPeriod: 600s (3)
      1 couchbaseclusters.spec.servers.autoscaleEnabled: Setting this field to true triggers the Autonomous Operator to create a CouchbaseAutoscaler custom resource for the relevant server class. In this example, a CouchbaseAutoscaler resource will be created for the index server class. Refer to About the Couchbase Autoscaler for a conceptual overview of the role the CouchbaseAutoscaler custom resource plays in auto-scaling.
      2 In this example, a CouchbaseAutoscaler resource will also be created for the query server class.
      3 couchbaseclusters.spec.autoscaleStabilizationPeriod: This field defines the Couchbase Stabilization Period, which is an internal safety mechanism provided by the Autonomous Operator that is meant to help prevent over-scaling caused by metrics instability during rebalance. The value specified in this field determines how long HorizontalPodAutoscaler resources will remain in maintenance mode after the cluster finishes rebalancing.

      In this example, the stabilization period has been set to 600s, which means that the Horizontal Pod Autoscaler will not restart monitoring until 10 minutes after the previous rebalance has completed. Refer to Couchbase Cluster Auto-scaling Best Practices for additional guidance on setting this value in production environments.

      After deploying the CouchbaseCluster resource specification, the Autonomous Operator will create a CouchbaseAutoscaler resource for each server class configuration that has couchbaseclusters.spec.servers.autoscaleEnabled set to true.

      Enabling auto-scaling for a particular server class configuration does not immediately subject the cluster to being auto-scaled. The CouchbaseAutoscaler resource simply acts as an endpoint for the HorizontalPodAutoscaler resource to access the pods that are selected for auto-scaling. The CouchbaseAutoscaler resource is only activated when referenced by a HorizontalPodAutoscaler resource.

      Verifying Creation of CouchbaseAutoscaler Resources

      The following command can be used to verify that the CouchbaseAutoscaler custom resources exist and match the size of their associated server class configurations:

      $ kubectl get couchbaseautoscalers
      NAME                               SIZE   SERVERS
      index.cb-example                   2      index (1) (2)
      query.cb-example                   2      query
      1 NAME: Each CouchbaseAutoscaler resource is named using the format <server-class>.<couchbase-cluster>. The name is important because it must be referenced when creating the HorizontalPodAutoscaler resource in order to link the two resources together.
      2 SIZE: This is the current number of Couchbase nodes that the Autonomous Operator is maintaining for the index server class. The Autonomous Operator keeps the size of a CouchbaseAutoscaler resource in sync with the size of its associated server class configuration.

      CouchbaseAutoscaler custom resources are fully managed by the Autonomous Operator and should not be manually created, modified, or deleted by the user. If one is manually deleted, the Autonomous Operator will re-create it. The one exception is manual scaling: the size of a CouchbaseAutoscaler resource can be changed through its scale subresource (refer to Disabling Scale Down below).

      A CouchbaseAutoscaler resource only gets deleted by the Autonomous Operator when auto-scaling is disabled for the associated server class, or if the associated CouchbaseCluster resource is deleted altogether.

      Creating a HorizontalPodAutoscaler Resource

      The Autonomous Operator relies on the Kubernetes Horizontal Pod Autoscaler (HPA) to provide auto-scaling capabilities. The Horizontal Pod Autoscaler is configured via a HorizontalPodAutoscaler resource, which is the primary interface by which auto-scaling is configured. Unlike the CouchbaseAutoscaler custom resource created by the Autonomous Operator, the HorizontalPodAutoscaler resource is created and managed by the user.

      The following configuration is an example of a HorizontalPodAutoscaler resource that scales the query server class from Enabling Auto-scaling.

      apiVersion: autoscaling/v2
      kind: HorizontalPodAutoscaler
      metadata:
        name: query-hpa
      spec:
        scaleTargetRef: (1)
          apiVersion: couchbase.com/v2
          kind: CouchbaseAutoscaler
          name: query.cb-example
        behavior: (2)
          scaleUp:
            policies: (3)
            - type: Pods
              value: 1
              periodSeconds: 15
            stabilizationWindowSeconds: 30 (4)
          scaleDown:
            stabilizationWindowSeconds: 300
        minReplicas: 2 (5)
        maxReplicas: 6
        metrics: (6)
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
      1 The spec.scaleTargetRef section must be configured to reference the relevant CouchbaseAutoscaler resource.
      • apiVersion: This field must be set to couchbase.com/v2.

      • kind: This field must be set to CouchbaseAutoscaler.

      • name: This field must reference the unique name of the CouchbaseAutoscaler custom resource. As discussed in the previous section, CouchbaseAutoscaler resources are automatically created by the Autonomous Operator using the name format <server-class>.<couchbase-cluster>.

      Refer to Referencing the Couchbase Autoscaler in the concept documentation for more detailed information about these fields.

      2 Fine-grained scaling behavior is configured via policies specified in the spec.behavior section. Different policies can be specified for scaling up (behavior.scaleUp) and scaling down (behavior.scaleDown). If no user-supplied values are specified in behavior fields, then the default values are used.
      3 spec.behavior.[].policies: Scaling policies are made up of the following fields: type, value, and periodSeconds. The recommended settings are type: Pods and value: 1, while leaving periodSeconds with the default value.

      Refer to Scaling Policies and Scaling Increments in the concept documentation for more detailed information about these fields.

      4 spec.behavior.[].stabilizationWindowSeconds: A stabilization window can be configured as a means to control undesirable scaling caused by fluctuating metrics. A minimum scaleUp stabilization window of 30 seconds is generally recommended, unless indicated otherwise in Couchbase Cluster Auto-scaling Best Practices.
      5 The spec.minReplicas and spec.maxReplicas fields set the minimum and maximum number of Couchbase nodes for the associated server class.
      • minReplicas sets the lower boundary for the number of Couchbase nodes that the associated server class can ever be down-scaled to, and defaults to 1. This field is important for maintaining service availability.

      • maxReplicas sets the upper boundary for the number of Couchbase nodes that the associated server class can ever be up-scaled to, and cannot be set to a value lower than what is defined for minReplicas. This field is required, as it provides important protection against runaway scaling events.

      Refer to Sizing Constraints in the concept documentation for more detailed information about these fields.

      6 The spec.metrics section must target a specific metric, along with an associated threshold for that metric. In this example, a Kubernetes resource metric (cpu) is being targeted with a threshold set to 70 percent utilization.

      Refer to Target Metrics and Thresholds in the concept documentation for more detailed information about targeting metrics in the HorizontalPodAutoscaler resource.
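
      As a complement to the resource metric used in the example, the following is a minimal sketch of a spec.metrics section that targets a Couchbase metric instead. It assumes the metric has already been exposed through the Kubernetes API as described in Preparing for Auto-scaling. The metric name cbquery_requests_1000ms and the target value of 7 are illustrative assumptions, not values taken from this page.

      ```yaml
      # Sketch only: an alternative metrics section for the query-hpa resource.
      # The metric name and threshold below are assumptions for illustration;
      # substitute a metric that is actually exposed in your Kubernetes cluster.
      spec:
        metrics:
        - type: Pods
          pods:
            metric:
              name: cbquery_requests_1000ms   # hypothetical Couchbase metric name
            target:
              type: AverageValue
              averageValue: "7"               # hypothetical per-pod average threshold
      ```

      With a Pods metric, the threshold is compared against the average of the metric across the pods in the server class, so the value should be chosen per pod rather than per cluster.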

      The HorizontalPodAutoscaler resource can be created like any other resource by submitting the specifications in a file using kubectl:

      $ kubectl apply -f query-hpa.yaml

      As soon as the HorizontalPodAutoscaler resource has been successfully created, the Horizontal Pod Autoscaler will begin to monitor the target metric and the cluster will be subject to auto-scaling if the targeted metric is above or below the configured threshold.

      Verifying HorizontalPodAutoscaler Status

      When the Horizontal Pod Autoscaler begins to monitor the target metric, it will begin reporting the value of the metric along with the current vs desired size of the server class.

      Run the following command to print these details to the console output:

      $ kubectl describe hpa query-hpa
      Metrics:                                               ( current / target )
        resource cpu on pods  (as a percentage of request):  1% (50m) / 70% (1)
      Min replicas:                         2
      Max replicas:                         6
      CouchbaseAutoscaler pods:             2 current / 2 desired  (2)
      1 The current observed value of the metric is displayed vs the target threshold.
      2 The current size of the server class is displayed vs the desired size currently being recommended by the Horizontal Pod Autoscaler.
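
      To make the reported values easier to interpret, the following is a worked sketch of how the desired size is derived. It relies on the replica calculation described in the upstream Kubernetes Horizontal Pod Autoscaler documentation, which is an assumption brought in from outside this page, and plugs in the values from the output above (2 current nodes, 1% observed, 70% target, minReplicas of 2).

      ```latex
      \text{desiredReplicas}
        = \left\lceil \text{currentReplicas} \times
          \frac{\text{currentMetricValue}}{\text{desiredMetricValue}} \right\rceil
        = \left\lceil 2 \times \frac{1}{70} \right\rceil
        = 1

      \text{reported desired size}
        = \max(\text{desiredReplicas},\ \text{minReplicas})
        = \max(1,\ 2)
        = 2
      ```

      Because the raw recommendation of 1 falls below minReplicas, the desired size is held at 2, which matches the "2 current / 2 desired" line in the output.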

      Disabling Scale Down

      In production environments, it may be desirable to only allow a cluster to automatically scale up while requiring manual intervention to scale down. This can be accomplished by modifying the HorizontalPodAutoscaler resource as follows:

      apiVersion: autoscaling/v2
      kind: HorizontalPodAutoscaler
      metadata:
        name: example-hpa
      spec:
        behavior:
          scaleDown:
            selectPolicy: Disabled (1)
      1 spec.behavior.[].selectPolicy: This field controls which policy is chosen by the Horizontal Pod Autoscaler if more than one policy is defined. When set to Disabled no policy is chosen, and therefore auto-scaling is disabled in that direction.

      By setting spec.behavior.scaleDown.selectPolicy to Disabled, the Horizontal Pod Autoscaler will never recommend scaling down the associated server class. Specifying this setting for all HorizontalPodAutoscaler resources associated with a particular Couchbase cluster ensures that the cluster will never automatically scale down.
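
      The fragment shown earlier in this section omits the fields that a complete HorizontalPodAutoscaler resource requires. As a sketch of how the setting might look in context, the following combines it with the query-hpa example from Creating a HorizontalPodAutoscaler Resource. Reusing that example's names and thresholds is an assumption made for illustration.

      ```yaml
      # Sketch: the query-hpa example with automatic scale-down disabled.
      apiVersion: autoscaling/v2
      kind: HorizontalPodAutoscaler
      metadata:
        name: query-hpa
      spec:
        scaleTargetRef:
          apiVersion: couchbase.com/v2
          kind: CouchbaseAutoscaler
          name: query.cb-example
        behavior:
          scaleUp:
            policies:
            - type: Pods
              value: 1
              periodSeconds: 15
            stabilizationWindowSeconds: 30
          scaleDown:
            selectPolicy: Disabled   # no scale-down policy is ever chosen
        minReplicas: 2
        maxReplicas: 6
        metrics:
        - type: Resource
          resource:
            name: cpu
            target:
              type: Utilization
              averageUtilization: 70
      ```

      The scaleUp settings are left unchanged, so the server class can still grow up to maxReplicas while scale-down remains a manual operation.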

      Clusters that have automatic down-scaling disabled can be manually scaled down by editing the CouchbaseAutoscaler resource directly:

      $ kubectl scale couchbaseautoscalers query.cb-example --replicas=2

      The above command edits the scale subresource and results in the Autonomous Operator scaling the server class named query to a size of 2.

      Disabling Auto-scaling

      Auto-scaling can be disabled at any time after it has been enabled and configured for a Couchbase cluster.

      The recommended method for disabling auto-scaling is to set couchbaseclusters.spec.servers.autoscaleEnabled back to false for each of the desired server classes.

      apiVersion: couchbase.com/v2
      kind: CouchbaseCluster
      metadata:
        name: cb-example
      spec:
        servers:
        - name: index
          autoscaleEnabled: false (1)
          size: 2
          services:
          - index
        - name: query
          autoscaleEnabled: false
          size: 2
          services:
          - query
      1 couchbaseclusters.spec.servers.autoscaleEnabled: Setting this field to false triggers the Autonomous Operator to delete the CouchbaseAutoscaler custom resource that had previously been created for the relevant server class. In this example, the CouchbaseAutoscaler resource associated with the index server class will be deleted by the Autonomous Operator upon submitting the configuration.

      Upon deleting the CouchbaseAutoscaler resource, the Autonomous Operator will no longer reconcile the current size of the server class with the recommendations of the Horizontal Pod Autoscaler, and instead the value specified in couchbaseclusters.spec.servers.size will become the new source of truth. For example, if the above configuration were to be submitted, it would result in the index and query server classes each being scaled to size: 2 from whatever size they had previously been auto-scaled to.
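
      Because couchbaseclusters.spec.servers.size takes over as soon as auto-scaling is disabled, one way to avoid an unintended resize is to set size to the current auto-scaled size in the same update. The sketch below assumes, purely for illustration, that the query server class had been auto-scaled to 4 nodes. The current size can be read from the SIZE column of kubectl get couchbaseautoscalers before making the change.

      ```yaml
      # Sketch: disable auto-scaling for the query server class without resizing it.
      # The size of 4 is a hypothetical value; use the size reported for
      # query.cb-example at the time auto-scaling is disabled.
      apiVersion: couchbase.com/v2
      kind: CouchbaseCluster
      metadata:
        name: cb-example
      spec:
        servers:
        # Other server classes (data, index) are unchanged and omitted for brevity.
        - name: query
          autoscaleEnabled: false
          size: 4   # hypothetical: matches the size last set by auto-scaling
          services:
          - query
      ```

      The server class can then be scaled manually at a later time by editing size as usual.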

      It’s important to note, however, that the HorizontalPodAutoscaler resource is not managed by the Autonomous Operator, and therefore does not get deleted along with the CouchbaseAutoscaler resource. It will continue to exist in the current namespace until it is manually deleted by the user. Since the HorizontalPodAutoscaler resource can continue to be used if auto-scaling is subsequently re-enabled, it is important to verify the status of the HorizontalPodAutoscaler resource to ensure that it is persisted as expected.

      If the desire is to only temporarily disable auto-scaling, the HorizontalPodAutoscaler resource can be left to persist until auto-scaling is eventually re-enabled. This only works if the names of both the server class and the Couchbase cluster remain the same, because when couchbaseclusters.spec.servers.autoscaleEnabled is set back to true, the Autonomous Operator will create a CouchbaseAutoscaler resource that is already referenced by the existing HorizontalPodAutoscaler resource. In this case, the cluster will immediately become subject to the recommendations of the Horizontal Pod Autoscaler.

      Deleting just the HorizontalPodAutoscaler resource will also have the effect of "disabling" auto-scaling. In this scenario, the Autonomous Operator continues to maintain the CouchbaseAutoscaler resource, but it will remain at the same size that was last recommended by the Horizontal Pod Autoscaler before it was deleted.