Backing up and Restoring a Couchbase Deployment

The Couchbase Autonomous Operator automatically repairs and rebalances Couchbase clusters to maintain high availability. It is still considered best practice to back up your data regularly, and to test that it can be restored as expected before disaster recovery is ever required.

This functionality is not provided by the Operator; it is left to the cluster administrator to define backup policies and to test data restoration. This section describes some common patterns that may be employed to perform these tasks.

Backing Up a Couchbase Deployment

The Kubernetes resource definitions below illustrate a typical backup arrangement that saves the state of the entire cluster.

# Define backup storage volume
kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: couchbase-cluster-backup
spec:
  resources:
    requests:
      storage: 100Gi
  storageClassName: standard
  accessModes:
    - ReadWriteOnce

A persistent volume is claimed in order to keep backup data safe in the event of an outage. Plan the claim size based on your expected data set size, the number of days of data retention required, and whether incremental backups are used.
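
Retention requirements tend to grow over time. If your storage provisioner supports it, consider provisioning the backup volume from a storage class that allows the claim to be expanded later. The definition below is a minimal sketch; the standard-expandable name and the gce-pd provisioner are illustrative and should be replaced with whatever your platform provides.

# Illustrative storage class that allows the backup claim to be resized later.
# The name and provisioner are examples only; substitute your platform's provisioner.
kind: StorageClass
apiVersion: storage.k8s.io/v1
metadata:
  name: standard-expandable
provisioner: kubernetes.io/gce-pd
allowVolumeExpansion: true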

# Create a backup repository
kind: Job
apiVersion: batch/v1
metadata:
  name: couchbase-cluster-backup-create
spec:
  template:
    spec:
      containers:
        - name: couchbase-cluster-backup-create
          image: couchbase/server:enterprise-5.5.2
          command: ["cbbackupmgr", "config", "--archive", "/backups", "--repo", "couchbase"]
          volumeMounts:
            - name: "couchbase-cluster-backup-volume"
              mountPath: "/backups"
      volumes:
        - name: couchbase-cluster-backup-volume
          persistentVolumeClaim:
            claimName: couchbase-cluster-backup
      restartPolicy: Never

A job is created to mount the persistent volume and initialize a backup repository. The repository is named couchbase, which maps to the cluster name used in the later specifications.

kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: couchbase-cluster-backup
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: couchbase-cluster-backup-full
              image: couchbase/server:enterprise-5.5.2
              command: ["cbbackupmgr", "backup", "--archive", "/backups", "--repo", "couchbase", "--cluster", "couchbase://couchbase.default.svc", "--username", "Administrator", "--password", "password"]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
          volumes:
            - name: couchbase-cluster-backup-volume
              persistentVolumeClaim:
                claimName: couchbase-cluster-backup
          restartPolicy: Never

A backup cron job runs daily at a time of low cluster utilization. The first run takes a full backup; subsequent runs take incremental backups to save disk space.

You may merge the previous two steps into a single job by using init containers to chain the commands together; the repository creation step should be skipped if the repository already exists. The same technique can also be used to run cbbackupmgr compact to further reduce disk utilization. A sketch combining these ideas is shown after these notes.

By creating a custom container image it is possible to trigger the backup job via a script. The script can reference Secret resources from the Kubernetes API, which may contain the Couchbase login credentials.

We use the cluster SRV record as the --cluster argument here because cluster member names and IP addresses will change during the cluster lifecycle. By using service discovery, the connection string remains stable and will automatically select a Couchbase Server instance that is alive.
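
The definition below is a minimal sketch of such a combined cron job. It assumes a Secret named couchbase-cluster-auth containing username and password keys; the Secret, its keys, and the tolerant repository-creation command are illustrative rather than a definitive implementation.

# Illustrative combined backup cron job.
# Assumes a Secret named couchbase-cluster-auth with username and password keys.
kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: couchbase-cluster-backup-combined
spec:
  schedule: "0 3 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          initContainers:
            # Create the repository if it does not already exist. The "|| true"
            # tolerates the error returned when the repository is already configured.
            - name: couchbase-cluster-backup-config
              image: couchbase/server:enterprise-5.5.2
              command: ["sh", "-c", "cbbackupmgr config --archive /backups --repo couchbase || true"]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
          containers:
            - name: couchbase-cluster-backup-full
              image: couchbase/server:enterprise-5.5.2
              env:
                # Credentials are read from the Secret rather than hard-coded.
                - name: CB_USERNAME
                  valueFrom:
                    secretKeyRef:
                      name: couchbase-cluster-auth
                      key: username
                - name: CB_PASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: couchbase-cluster-auth
                      key: password
              command: ["sh", "-c", "cbbackupmgr backup --archive /backups --repo couchbase --cluster couchbase://couchbase.default.svc --username \"$CB_USERNAME\" --password \"$CB_PASSWORD\""]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
          volumes:
            - name: couchbase-cluster-backup-volume
              persistentVolumeClaim:
                claimName: couchbase-cluster-backup
          restartPolicy: Never
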
kind: Job
apiVersion: batch/v1
metadata:
  name: couchbase-cluster-backup-merge
spec:
  template:
    spec:
      containers:
        - name: couchbase-cluster-backup-prune
          image: couchbase/server:enterprise-5.5.2
          command: ["cbbackupmgr", "merge", "--archive", "/backups", "--repo", "couchbase", "--start", "2018-07-25T13_02_45.92773833Z", "--end", "2018-07-25T14_57_57.83339572Z"]
          volumeMounts:
            - name: "couchbase-cluster-backup-volume"
              mountPath: "/backups"
      volumes:
        - name: couchbase-cluster-backup-volume
          persistentVolumeClaim:
            claimName: couchbase-cluster-backup
      restartPolicy: Never

A merge job should be run periodically to consolidate the full backup and subsequent incremental backups into a single full backup. This step reduces disk utilization.

Although this example illustrates the use of a Job for merging backups, the process can be automated: deploy it as a CronJob that executes a script which runs cbbackupmgr list to discover the existing backups, then passes the chosen backup names as the --start and --end parameters to the merge command. A sketch of such a CronJob is shown after these notes.

Merging backups should form part of a regular backup strategy. Stale documents that have been deleted in incremental backups are removed from the backup repository during a merge, which helps with adherence to data protection legislation.
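
The definition below is a minimal sketch of such an automated merge. It assumes a custom image, shown here as example/couchbase-backup-tools, containing a hypothetical merge-backups.sh script that wraps cbbackupmgr list and cbbackupmgr merge; both the image name and the script are illustrative.

# Illustrative weekly merge cron job.
# The image and the merge-backups.sh script are hypothetical: the script is
# assumed to run cbbackupmgr list and pass the chosen backup names to
# cbbackupmgr merge as the --start and --end parameters.
kind: CronJob
apiVersion: batch/v1beta1
metadata:
  name: couchbase-cluster-backup-merge-cron
spec:
  schedule: "0 6 * * 0"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: couchbase-cluster-backup-merge
              image: example/couchbase-backup-tools:latest
              command: ["merge-backups.sh", "--archive", "/backups", "--repo", "couchbase"]
              volumeMounts:
                - name: "couchbase-cluster-backup-volume"
                  mountPath: "/backups"
          volumes:
            - name: couchbase-cluster-backup-volume
              persistentVolumeClaim:
                claimName: couchbase-cluster-backup
          restartPolicy: Never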

Restoring a Couchbase Deployment

Much like a backup, data can be restored to a new Couchbase cluster with a Kubernetes Job.

kind: Job
apiVersion: batch/v1
metadata:
  name: couchbase-cluster-restore
spec:
  template:
    spec:
      containers:
        - name: couchbase-cluster-restore
          image: couchbase/server:enterprise-5.5.2
          command: ["cbbackupmgr", "restore", "--archive", "/backups", "--repo", "couchbase", "--cluster", "couchbase://couchbase.default.svc", "--username", "Administrator", "--password", "password"]
          volumeMounts:
            - name: "couchbase-cluster-backup-volume"
              mountPath: "/backups"
      volumes:
        - name: couchbase-cluster-backup-volume
          persistentVolumeClaim:
            claimName: couchbase-cluster-backup
      restartPolicy: Never
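
The job above restores everything in the backup repository. To restore only a specific range of backups, cbbackupmgr restore also accepts --start and --end parameters naming the first and last backups to restore. The backup names below are illustrative and would be discovered with cbbackupmgr list.

# Illustrative restore of a specific backup range. The backup names are
# examples only and would be discovered with cbbackupmgr list.
kind: Job
apiVersion: batch/v1
metadata:
  name: couchbase-cluster-restore-range
spec:
  template:
    spec:
      containers:
        - name: couchbase-cluster-restore-range
          image: couchbase/server:enterprise-5.5.2
          command: ["cbbackupmgr", "restore", "--archive", "/backups", "--repo", "couchbase", "--cluster", "couchbase://couchbase.default.svc", "--username", "Administrator", "--password", "password", "--start", "2018-07-25T13_02_45.92773833Z", "--end", "2018-07-25T14_57_57.83339572Z"]
          volumeMounts:
            - name: "couchbase-cluster-backup-volume"
              mountPath: "/backups"
      volumes:
        - name: couchbase-cluster-backup-volume
          persistentVolumeClaim:
            claimName: couchbase-cluster-backup
      restartPolicy: Never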