Using Persistent Volumes

Persistent Volumes let you create Couchbase Pods whose data resides outside of the Pod itself.

This decoupling makes the data within the Couchbase Cluster more resilient if a node goes down or its associated Pod is terminated. Refer to Node Recovery for additional information about the recovery of Pods with Persistent Volumes.

To create a Couchbase Cluster with Persistent Volumes, add volumeMounts to the pod definition of a server config within spec.servers. These volumeMounts are used only by the Pods that belong to that server config. See the CouchbaseCluster Configuration Guide to understand how to define a cluster with Persistent Volumes. It is also recommended to have an overall understanding of Kubernetes Persistent Volumes before creating a cluster that uses them.
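
As a rough illustration of the configuration described above, the fragment below sketches a single server config whose Pods mount Persistent Volumes. It is a sketch under assumptions, not a complete or authoritative definition: it assumes that claims are declared in a volumeClaimTemplates list and referenced by name from the default and data mount keys, as described in the CouchbaseCluster Configuration Guide. The names data_nodes and couchbase, the storage class standard, and the 10Gi size are placeholders chosen for the example. All other cluster settings are omitted.

```yaml
# Fragment of a CouchbaseCluster resource; other required fields are omitted.
spec:
  servers:
    - name: data_nodes              # hypothetical server config name
      size: 3
      services:
        - data
      pod:
        volumeMounts:
          default: couchbase        # template used for the default mount
          data: couchbase           # template used for data service files
  volumeClaimTemplates:             # assumed field; verify against the Configuration Guide
    - metadata:
        name: couchbase             # hypothetical template name referenced above
      spec:
        storageClassName: standard  # placeholder; must match a class in your cluster
        resources:
          requests:
            storage: 10Gi           # illustrative size only
```

Only the Pods created from the data_nodes server config would use these mounts; any other server config needs its own volumeMounts entry.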

Benefits of using Persistent Volumes

  • Data Recoverability: Persistent Volumes allow the data associated with Pods to be recovered if a Pod is terminated. This helps to prevent data loss and to avoid time-consuming index rebuilding when using the data or index services.

  • Pod Relocation: Kubernetes may decide to evict Pods that reach resource thresholds such as CPU and memory limits. Pods that are backed by Persistent Volumes can be terminated and restarted on different nodes without incurring any downtime or data loss.

  • Dynamic Provisioning: The Operator will create Persistent Volumes on demand as your cluster scales, removing the need to pre-provision your cluster storage prior to deployment.

  • Cloud Integration: Kubernetes integrates with native storage provisioners available on major cloud vendors such as AWS and GCE.

Best Practices when using Persistent Volumes

  • When persisting data for any service, all other Pods running the same service should also have Persistent Volumes. For example, a configuration that spreads the data service across multiple server configs (spec.servers) should specify volumeMounts (spec.servers[*].pod.volumeMounts) for each of those server configs.

    • Also, ensure that consistent storage sizes are used across all the Pods for a service. For example, it is not recommended to use a 10GB template for one set of data nodes and a 100GB template for another.

  • It is easier to scale out the storage for a service when services are not all grouped into the same server config (spec.servers[*].services). When multiple services are defined within the same group of servers, you have to scale out storage for every service in that group along with the one that actually needs it. Note: volume resizing is not possible.

  • The following storage provisioners are supported for creating Persistent Volumes:

    • AWS

    • GCE

    • GlusterFS

    • Azure Disk

    • Ceph RBD

    • Portworx Volume

    Refer to the documentation about Storage Provisioners for more information about storage classes.

  • Configure storage classes based on service needs. For example, the AWS provisioner can produce EBS volumes of type 'io1', which are faster than 'gp2', so you may want to use the faster storage for the data service if write throughput will be high.

  • For deployments in clouds with multiple availability zones, see the Persistent Volume Zoning Guide.
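
The practices above can be sketched as a pair of configuration fragments. The first is a Kubernetes StorageClass that requests 'io1' EBS volumes through the in-tree kubernetes.io/aws-ebs provisioner. The second is a CouchbaseCluster fragment that keeps the data and index services in separate server configs, gives every data server config a volume mount, and reuses one template so that storage sizes stay consistent within a service. This is an illustration under assumptions rather than a recommended deployment: every name (fast-io1, data_a, data_b, index_nodes, data-storage, index-storage), the sizes, and the iopsPerGB value are hypothetical, and the volumeClaimTemplates layout should be checked against the CouchbaseCluster Configuration Guide.

```yaml
# StorageClass for faster EBS volumes, intended here for the data service.
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-io1                    # hypothetical name
provisioner: kubernetes.io/aws-ebs
parameters:
  type: io1
  iopsPerGB: "10"                   # illustrative value only
---
# Fragment of a CouchbaseCluster resource; other required fields are omitted.
spec:
  servers:
    - name: data_a                  # first data server config
      size: 2
      services:
        - data
      pod:
        volumeMounts:
          default: data-storage
          data: data-storage
    - name: data_b                  # second data server config, same template
      size: 2
      services:
        - data
      pod:
        volumeMounts:
          default: data-storage
          data: data-storage
    - name: index_nodes             # index kept separate so it scales on its own
      size: 2
      services:
        - index
      pod:
        volumeMounts:
          default: index-storage
          index: index-storage
  volumeClaimTemplates:             # assumed field; verify against the Configuration Guide
    - metadata:
        name: data-storage
      spec:
        storageClassName: fast-io1
        resources:
          requests:
            storage: 100Gi          # one size for all data Pods
    - metadata:
        name: index-storage
      spec:
        storageClassName: fast-io1
        resources:
          requests:
            storage: 50Gi
```

Because both data server configs reference the same data-storage template, every data Pod receives a volume of the same size, and the index storage can be scaled out by changing only the index_nodes server config.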