Using Persistent Volumes

Kubernetes persistent volumes offer a way to create Couchbase pods whose data resides outside of the pod itself.

Decoupling the data from the pod provides a higher degree of resilience within the Couchbase cluster in the event that a node goes down or its associated pod is terminated. The Operator can then attach persistent volumes to a new pod during node recovery.

Benefits of Using Persistent Volumes

Data Recoverability

Persistent volumes allow the data associated with a pod to be recovered if the pod is terminated. This helps prevent data loss and avoids time-consuming index rebuilding when using the Data or Index services.

Pod Relocation

Kubernetes may decide to evict pods that reach resource thresholds such as CPU and memory limits. Pods that are backed with persistent volumes can be terminated and restarted on different nodes without incurring any downtime or loss of data.

Dynamic Provisioning

The Operator will create persistent volumes on demand as your cluster scales, alleviating the need to pre-provision your cluster storage prior to deployment.

Cloud Integration

Kubernetes integrates with native storage provisioners available on major cloud vendors such as AWS, GCP, and Azure.

Configuring the Operator to Use Persistent Volumes

Prerequisites:

Before configuring persistent volumes, ensure that a suitable StorageClass is available in your Kubernetes cluster, or that persistent volumes have been manually provisioned as described in persistent-volume-setup.adoc.

To create a Couchbase cluster that uses persistent volumes, you need to add volumeMounts under pod within a server group in spec.servers of the CouchbaseCluster configuration. The volumeMounts defined in a server group apply only to the pods in that group.

The following example shows how to mount a volume to persist data for the Data service:

apiVersion: couchbase.com/v1
kind: CouchbaseCluster
spec:
  servers:
    - size: 1
      name: data_services
      services:
        - data
      pod:
        volumeMounts:
          default: couchbase
          data: couchbase
  ...
  volumeClaimTemplates:
    - metadata:
        name: couchbase
      spec:
        storageClassName: "gp2"
        resources:
          requests:
            storage: 1Gi

The above spec uses a StorageClass named gp2 to request persistent volumes of size 1Gi for both the default and data mounts specified in volumeMounts. Claims can request volumes from various types of storage systems, as identified by the storage class name. It’s also possible to claim manually created volumes, as outlined in persistent-volume-setup.adoc.
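For reference, the gp2 storage class referenced above might look like the following. This is only a minimal sketch assuming the in-tree AWS EBS provisioner; on EKS, a gp2 StorageClass is typically created by default, so you normally don’t need to define it yourself.

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2                        # must match storageClassName in the claim template
provisioner: kubernetes.io/aws-ebs # in-tree AWS EBS provisioner
parameters:
  type: gp2                        # EBS general-purpose SSD

After the cluster is created, the persistent volume claims generated from the template can be inspected with kubectl get pvc.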

Best Practices for Persistent Volumes

  • When persisting data for any Couchbase service, all other pods running the same service should also use persistent volumes. For example, configurations that spread the Data service across multiple server configs (spec.servers) should also specify spec.servers[*].pod.volumeMounts for each of those server configs (see the sketch after this list).

    Also, ensure that consistent storage sizes are used across all pods for a given service. For example, it’s not recommended to use a 10 GB template for one set of data nodes, and a 100 GB template for another.

  • It’s easier to scale out a service’s storage when services aren’t all grouped into the same config (spec.servers[*].services). This is because when multiple services are defined within the same server group, scaling out storage for the service that needs it also scales it for every other service in that group. Note that volumes cannot be resized.

  • Configure storage classes based on service needs. For example, you may want to use faster storage for the Data service if write throughput will be high. In that case, you could use the AWS provisioner to produce EBS volumes of type io1 (good for large database workloads), which are faster than gp2 (the EKS default).

  • For deployments in clouds with multiple availability zones, refer to the Persistent Volume Zoning Guide.
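The following sketch illustrates the first recommendation above: the Data service is spread across two server groups (the group names here are hypothetical), and both groups mount volumes from the same claim template so that storage sizes stay consistent across all data pods.

spec:
  servers:
    - size: 2
      name: data_group_a          # hypothetical group name
      services:
        - data
      pod:
        volumeMounts:
          default: couchbase
          data: couchbase
    - size: 2
      name: data_group_b          # hypothetical group name
      services:
        - data
      pod:
        volumeMounts:
          default: couchbase
          data: couchbase
  volumeClaimTemplates:
    - metadata:
        name: couchbase
      spec:
        storageClassName: "gp2"
        resources:
          requests:
            storage: 10Gi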