Couchbase Cluster Autoscaling

    The Autonomous Operator is capable of autoscaling a cluster based on observable metrics. Couchbase autoscaling works in conjunction with the Kubernetes HorizontalPodAutoscaler. You will need to be aware of how metrics are monitored by the HorizontalPodAutoscaler and applied to the CouchbaseAutoscaler to ensure stable operation.
    Figure 1. Autoscaling Overview

    Enabling Couchbase Autoscaling

    Autoscaling is only supported by Couchbase clusters with stateless server configurations. A server configuration is considered stateless when the following conditions are met:

    1. All buckets are defined as ephemeral buckets using the CouchbaseEphemeralBucket resource.

    2. At least one group of servers is configured to run only the query service.

    The following example defines a server configuration whose query service can be autoscaled:

    ---
    apiVersion: couchbase.com/v2
    kind: CouchbaseCluster
    metadata:
      name: cb-example
    spec:
      servers:
        - size: 2
          name: data
          services:
            - data
            - index
        - size: 3
          name: query
          services:
            - query (1)
          autoscaleEnabled: true (2)
    1 Configurations with only the stateless query service are permitted to autoscale.
    2 autoscaleEnabled must be set to true to enable autoscaling of query service nodes.
    All buckets associated with the cluster must be defined as CouchbaseEphemeralBucket resources.
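    As a sketch, a minimal ephemeral bucket for this cluster might be declared as follows (the bucket name test-bucket is illustrative; other settings take their defaults):

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseEphemeralBucket
metadata:
  name: test-bucket   # hypothetical bucket name; must match the cluster's bucket label selector if one is set
```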

    Preview Autoscaling Mode

    Autoscaling can be enabled in a preview mode for services other than query along with buckets other than CouchbaseEphemeralBucket. Set enablePreviewScaling to true within the CouchbaseCluster resource to enable preview scaling mode.

    ---
    apiVersion: couchbase.com/v2
    kind: CouchbaseCluster
    metadata:
      name: cb-example
    spec:
      enablePreviewScaling: true (1)
      servers:
        - size: 3
          name: data
          services:
            - data
            - index
          autoscaleEnabled: true (2)
    1 Enabling preview scaling mode to allow autoscaling of stateful services
    2 Autoscaling is now allowed for data and index services

    Enabling preview autoscaling is unsupported and intended for experimental purposes only. While preview autoscaling is powerful, it should be used with caution: it may increase resource usage at exactly the time Couchbase is under pressure for resources, impacting service levels. Furthermore, in order to maintain service levels, Couchbase cannot be stopped while rebalance operations are in progress. Therefore, when enabling preview autoscaling, careful consideration should be given to how an observed metric behaves during cluster topology changes.

    How Autoscaling Works

    The Autonomous Operator will create a CouchbaseAutoscaler resource for each server configuration with autoscaleEnabled set to true. Once created, the Operator will keep the size of the CouchbaseAutoscaler resource in sync with the size of its associated server configuration.
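    As an illustration, the generated resource for the query configuration in the earlier example would look roughly like this (a sketch only; the Operator creates and manages this resource, so you never author it yourself):

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseAutoscaler
metadata:
  name: query.cb-example   # named <server-configuration>.<cluster-name> by the Operator
spec:
  size: 3                  # kept in sync with the server configuration's spec.servers[].size
```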

    Figure 2. Autoscale Resource

    The size of a CouchbaseAutoscaler resource is then adjusted by a Kubernetes Horizontal Pod Autoscaler when target metric values are reached. Couchbase is then autoscaled as changes to a HorizontalPodAutoscaler are propagated through the associated CouchbaseAutoscaler.

    Figure 3. Autoscale Apply

    Managed Autoscale Resources

    The Operator names each CouchbaseAutoscaler resource after its associated server configuration along with the name of the Couchbase cluster. For example, a server configuration named query in a cluster named cb-example will result in a CouchbaseAutoscaler resource named query.cb-example.

    As with all custom resources, CouchbaseAutoscaler resources can be listed with kubectl:

    $ kubectl get couchbaseautoscalers

    The CouchbaseAutoscaler resources are fully managed by the Autonomous Operator and should not be manually created or deleted.

    Scale Subresource

    The CouchbaseAutoscaler implements the Kubernetes Scale subresource which allows the CouchbaseAutoscaler to be used as a target reference for HorizontalPodAutoscaler resources. Implementing the /scale subresource allows CouchbaseAutoscaler resources to perform similar resizing operations as native Kubernetes deployments. Therefore, manual scaling is also possible using kubectl:

    $ kubectl scale couchbaseautoscaler query.cb-example --replicas=6

    The above command results in scaling the server configuration named query to a size of 6. The Autonomous Operator monitors the value of CouchbaseAutoscaler.spec.size and applies the value as spec.servers[].size to the associated server configuration.

    Exporting Metrics

    Couchbase autoscaling can be performed based on target values of observed metrics. Metrics can be collected from either the resource metrics API or the custom metrics API.

    Exposing Resource Metrics

    The resource metrics API exposes resource metrics such as CPU and memory values from Pods and Nodes. These metrics are provided by the metrics-server, which may need to be installed as a cluster add-on. Run the following command to verify that your cluster is capable of performing autoscaling based on resource metrics:

    $ kubectl get --raw /apis/metrics.k8s.io/v1beta1

    The response should contain an APIResourceList with the types of resources that can be fetched. If you receive a NotFound error, and you plan on performing autoscaling based on values from the resource metrics API, then you will need to install the metrics-server.

    Exposing Couchbase Metrics

    Couchbase Metrics can be exposed through the custom metrics API. The custom metrics API exposes 3rd-party metrics to the Kubernetes API server. For Couchbase Autoscaling, this means that Couchbase metrics related to memory quota and query latency can be used as targets to determine when autoscaling should occur. Run the following command to verify that your cluster is capable of performing autoscaling based on custom metrics:

    $ kubectl get --raw /apis/custom.metrics.k8s.io/v1beta1
    The Couchbase cluster needs to be running with monitoring.prometheus.enabled set to true in order for Couchbase metrics to be collected by the custom metrics API service.

    If you receive a NotFound error then you will need to install a custom metrics service. The recommended custom metrics service to use with the Operator is the Prometheus Adapter, since the metrics are exported in Prometheus format.
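    For example, metrics export can be enabled with the following fragment of the CouchbaseCluster resource:

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  monitoring:
    prometheus:
      enabled: true   # exposes Couchbase metrics for collection by the custom metrics service
```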

    Connecting Horizontal Pod Autoscaler

    The Horizontal Pod Autoscaler is a central component of autoscaling, as it observes target metrics and sends sizing requests to the CouchbaseAutoscaler. As mentioned previously, sizing requests from the Horizontal Pod Autoscaler propagate to Couchbase server configurations.

    Figure 4. HPA Workflow

    To connect a HorizontalPodAutoscaler resource to a CouchbaseAutoscaler resource, the scaleTargetRef value must be set within the HorizontalPodAutoscalerSpec.

    kind: HorizontalPodAutoscaler
    apiVersion: autoscaling/v2beta2
    metadata:
      name: query-hpa
    spec:
      scaleTargetRef:
        apiVersion: couchbase.com/v2
        kind: CouchbaseAutoscaler (1)
        name: query.cb-example (2)
    1 The target ref kind is set to CouchbaseAutoscaler which implements the /scale API.
    2 The name of the CouchbaseAutoscaler resource being referenced.

    Target Metrics

    The Horizontal Pod Autoscaler is capable of targeting any metric exposed to the Kubernetes API. As discussed previously, these metrics can originate from the resource metrics server or the custom metrics server. When using custom metrics to perform autoscaling based on Couchbase Server metrics, the discovery of available metrics can be performed through Prometheus queries, which is beyond the scope of this document. An alternative method of discovery is to check the couchbase exporter repository for the names of the metrics being exported (remember to include the cb prefix in the name of the metric).

    The following example shows how to define target values around the cbquery_requests_1000ms metric as an objective for performing Autoscale operations:

    kind: HorizontalPodAutoscaler
    apiVersion: autoscaling/v2beta2
    metadata:
      name: query-hpa
    spec:
      metrics:
      - type: Pods
        pods:
          metric:
            name: cbquery_requests_1000ms (1)
          target:
            type: AverageValue (2)
            averageValue: 7 (3)
    1 Targeting Couchbase metric for number of requests which exceed 1000ms.
    2 AverageValue type means that the metric will be averaged across all of the Pods.
    3 Setting 7 queries at 1000ms as an operational baseline.

    In the above example, the autoscaler will take action when queries with a latency over 1000ms exceed 7 (per second). Likewise, if the number of such queries falls below this value, the autoscaler will consider scaling down to reduce overhead. Details about how sizing decisions are made are discussed in the following section.

    Sizing Constraints

    The Horizontal Pod Autoscaler applies constraints to the sizing values that are allowed to be propagated to the CouchbaseAutoscaler. Specifically, defining minReplicas and maxReplicas within the HorizontalPodAutoscalerSpec sets lower and upper boundaries for autoscaling.

    kind: HorizontalPodAutoscaler
    apiVersion: autoscaling/v2beta2
    metadata:
      name: query-hpa
    spec:
      ...
      minReplicas: 1
      maxReplicas: 6

    When the Horizontal Pod Autoscaler detects that a metric is above or below its target value, the requested number of replicas will never fall outside the configured minimum and maximum. Refer to the algorithm details of the Horizontal Pod Autoscaler for information about how scaling decisions are determined.
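    The core of that algorithm can be sketched as follows. This is a simplified model, assuming the Horizontal Pod Autoscaler's default 10% tolerance, and is not the controller's actual implementation:

```python
import math

def desired_replicas(current_replicas, current_value, target_value,
                     min_replicas, max_replicas, tolerance=0.1):
    """Simplified sketch of the HPA sizing rule for a single averaged metric."""
    ratio = current_value / target_value
    # Within tolerance of the target: keep the current size (avoids thrashing).
    if abs(1.0 - ratio) <= tolerance:
        return current_replicas
    # Scale proportionally to how far the metric is from its target.
    desired = math.ceil(current_replicas * ratio)
    # Clamp to the boundaries set by minReplicas and maxReplicas.
    return max(min_replicas, min(max_replicas, desired))

# e.g. 3 query Pods averaging 14 slow requests against a target of 7
# requests ceil(3 * 14/7) = 6 replicas
```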

    Sizing Policies

    Additional control over autoscaling decisions is available in Kubernetes v1.18 and later through the behavior field of the HorizontalPodAutoscalerSpec. Several policies are supported, such as stabilization windows, disabling scale-up or scale-down entirely, and limiting the rate of change.

    For example, to prevent scaling down:

    behavior:
      scaleDown:
        selectPolicy: Disabled

    Refer to Kubernetes documentation for additional examples and information related to scaling policies.
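    As another sketch, a stabilization window combined with a rate-of-change policy might look like this (the values are illustrative, not recommendations):

```yaml
behavior:
  scaleDown:
    stabilizationWindowSeconds: 300   # act on the highest recommendation seen over the last 5 minutes
    policies:
    - type: Pods
      value: 1
      periodSeconds: 60               # remove at most one Pod per minute
```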

    Example Specification

    The following is a complete example of a HorizontalPodAutoscaler which incorporates all the ideas of the previous sections. You will need to edit this according to the names of your server configuration and cluster.

    ---
    kind: HorizontalPodAutoscaler
    apiVersion: autoscaling/v2beta2
    metadata:
      name: query-hpa
    spec:
      scaleTargetRef:
        apiVersion: couchbase.com/v2
        kind: CouchbaseAutoscaler
        name: query.cb-example
      # autoscale between 1 and 6 replicas
      minReplicas: 1
      maxReplicas: 6
      metrics:
      - type: Pods
        pods:
          metric:
            name: cbquery_requests_1000ms
          target:
            type: AverageValue
            averageValue: 7000m

    Including Cluster Autoscaler

    Kubernetes Cluster Autoscaler provides a means to autoscale the underlying Kubernetes nodes. Cluster autoscaling is highly recommended for production deployments, as it adds an additional dimension of scalability: the underlying physical hardware is scaled alongside the Couchbase cluster as Pods are added and removed. Also, since production deployments tend to schedule Pods with specific resource requests and limits to meet performance expectations, cluster autoscaling allows for one-to-one matching of Pods to nodes without concern that a node is sharing its resources among several Pods.

    Several cloud offerings, such as EKS and AKS, provide Cluster Autoscaling with their Kubernetes offerings. Couchbase autoscaling works with Cluster Autoscaling without any additional configuration. As the Horizontal Pod Autoscaler requests additional Couchbase Pods, resource pressure is applied to (or removed from) Kubernetes, and Kubernetes automatically adds (or removes) the required number of physical nodes.

    Refer to the documentation of cloud providers that offer Cluster Autoscaling for additional information about how to configure it for your environment.