Cloud Environment Setup

This section contains setup guides for the supported cloud platforms:

  • Amazon Elastic Kubernetes Service (EKS)

  • Google Kubernetes Engine (GKE)

  • Microsoft Azure Kubernetes Service (AKS)

Common Best Practices

Each supported cloud platform has its own subtle differences; however, there are some common behaviors to be aware of.

Zoning and Persistent Volumes

By default, when persistent volumes are provisioned in a cloud provider, they are scheduled across all availability zones available to the Kubernetes cluster. Persistent volumes are scheduled before they are attached to a pod, and they cannot be manually scheduled. Once attached to a pod, a persistent volume must be located in the same availability zone as that pod. This poses a challenge when using multiple persistent volumes per pod, as they may be scheduled in different availability zones. When using the server group feature of the Operator, the explicit pod scheduling may clash with that of the persistent volume.

The first solution is to run the Kubernetes cluster in a single zone. In this scenario, all pods and persistent volumes are scheduled in a single availability zone, so no scheduling conflicts can occur. The major downside of this approach is that the server group feature of the Operator cannot be used, so a data center failure cannot be tolerated.

The second solution is only available on Kubernetes versions 1.12 and higher. Storage classes, from which persistent volumes are dynamically created, now support a volumeBindingMode parameter with a WaitForFirstConsumer value. When set to WaitForFirstConsumer, persistent volumes are not bound or provisioned until a pod that consumes them has been scheduled. This allows the Operator to schedule pods explicitly in support of server groups, and allows multiple volumes to be attached to a single pod. This is the recommended deployment method.

The WaitForFirstConsumer feature first appeared in Kubernetes 1.12; however, it is regarded as production-ready only in versions 1.13 and higher.
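As a minimal sketch, the storage class below enables delayed volume binding. The name, provisioner, and disk type are illustrative only; substitute the provisioner and parameters appropriate for your cloud platform.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: couchbase-storage          # illustrative name
provisioner: kubernetes.io/gce-pd  # substitute your platform's provisioner
parameters:
  type: pd-ssd
volumeBindingMode: WaitForFirstConsumer

Persistent volume claims that reference a storage class like this one are not bound until the pod that consumes them has been scheduled.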

Zoning and Server Groups

The Operator allows you to define your own Couchbase server groups by labeling nodes with failure-domain.beta.kubernetes.io/zone and configuring your clusters to use them with the spec.serverGroups attribute. For example, you could label physical nodes based on their rack location within a data center so that your Couchbase deployment can tolerate switch and PDU failures.
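In an on-premise deployment, such labeling could be applied with kubectl; the node name and rack value below are hypothetical:

kubectl label node node-1 failure-domain.beta.kubernetes.io/zone=rack-1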

With cloud-based deployments, the Kubernetes service automatically configures these failure-domain labels for you, and they should not be modified by the end user. When persistent volumes are provisioned and attached to a pod, their failure-domain labels are populated by Kubernetes and must match those of the node that the pod is scheduled onto.

When configuring your cluster to use server groups, you can query the Kubernetes API for valid failure domains with the following command:

kubectl get nodes -o yaml | grep failure-domain.beta.kubernetes.io/zone | sort | uniq
      failure-domain.beta.kubernetes.io/zone: us-east1-b
      failure-domain.beta.kubernetes.io/zone: us-east1-c
      failure-domain.beta.kubernetes.io/zone: us-east1-d

In this example, valid server groups to use with your Couchbase clusters would be us-east1-b, us-east1-c, and us-east1-d.
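Those zone names can then be listed under spec.serverGroups in the cluster configuration. The fragment below is a sketch only; the apiVersion and the surrounding fields depend on your Operator release, and the rest of the cluster configuration is omitted:

apiVersion: couchbase.com/v1   # check the apiVersion for your Operator release
kind: CouchbaseCluster
metadata:
  name: cb-example
spec:
  serverGroups:                # must match the zone labels reported above
    - us-east1-b
    - us-east1-c
    - us-east1-d
  # ... remaining cluster configuration omitted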