About Using Couchbase Server Groups With the Operator

Couchbase server groups, which enable Server Group Awareness, are a way of logically partitioning a cluster to be fault tolerant across failure domains. The Operator is capable of automatically scheduling pod creation across failure domains and ensuring that they are added to the correct server groups.

By default, buckets are logically partitioned into vBuckets, which are distributed across all Couchbase Server nodes, providing automated load distribution. Bucket replicas are similarly distributed so that the failure of a node (or of multiple nodes, if more than one replica is requested) does not result in data loss, and a replica can assume control of the affected vBuckets.

Server groups allow you to logically group Couchbase Server nodes so that they represent physical racks, data centers, or, in the case of a cloud deployment, availability zones. This allows vBucket replicas to be scheduled in a completely separate server group from the active data, so that the Couchbase Server cluster can tolerate larger infrastructure failures, such as a switch failure or an availability zone outage.

Operator Server Group Scheduling

Before discussing how to configure server groups, it’s best to understand how the Operator schedules pods and manages server groups.

Figure: Server Group Scheduling Overview

The diagram depicts a cluster with two server classes, which map directly to the server configurations defined in the spec.servers property of the cluster specification. For example, Server Class 1 may be a multi-dimensional scaling group of nodes running the data service, while Server Class 2 runs the query service. Critically, server classes are scheduled independently. This simplifies the scheduling algorithm and ensures that the pods in Server Class 2 do not all end up in the same server group, which protects against a potential service outage.

Scheduling Couchbase Node Addition

When given a set of server groups to schedule, the Operator will poll the current system state and map server groups to pods. From there, the Operator will schedule a new pod to the server group that contains the fewest existing pods.
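The selection rule for a new pod can be sketched in a few lines of Python. This is only an illustration of the "fewest existing pods" rule described above, not the Operator's actual code: the function name and the tie-break (the first group in configured order wins) are assumptions made for the sketch.

```python
from collections import Counter


def pick_group_for_new_pod(server_groups, pod_groups):
    """Return the server group that currently holds the fewest pods.

    server_groups: ordered list of group names the server class may use.
    pod_groups: the server group of each existing pod in that server class.
    """
    counts = Counter(pod_groups)  # groups with no pods count as 0
    # min() returns the first minimal element, so ties go to the group
    # listed first. That tie-break is an assumption of this sketch.
    return min(server_groups, key=lambda group: counts[group])


groups = ["ServerGroup1", "ServerGroup2", "ServerGroup3"]
existing = ["ServerGroup1", "ServerGroup1", "ServerGroup2"]
print(pick_group_for_new_pod(groups, existing))  # ServerGroup3
```

Because each server class is scheduled independently, a function like this would be called with only the pods that belong to the server class being scaled.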

Scheduling Couchbase Node Deletion

As with addition, the Operator builds a map of server groups to pods. It then deletes an existing pod from the server group that contains the greatest number of existing pods.
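Removal is the mirror image of that rule. The following Python sketch illustrates it under the same caveats: the function name is hypothetical, and the tie-break (the first group in configured order) is an assumption, since the text only says that the most populated group is chosen.

```python
from collections import Counter


def pick_group_for_removal(server_groups, pod_groups):
    """Return the server group that currently holds the most pods."""
    counts = Counter(pod_groups)
    # max() returns the first maximal element, so ties go to the group
    # listed first. That tie-break is an assumption of this sketch.
    return max(server_groups, key=lambda group: counts[group])


groups = ["ServerGroup1", "ServerGroup2", "ServerGroup3"]
existing = ["ServerGroup1", "ServerGroup2", "ServerGroup2", "ServerGroup3"]
print(pick_group_for_removal(groups, existing))  # ServerGroup2
```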

The Operator pod scheduler attempts to keep the pods of each server class evenly distributed across all server groups. As such, the number of pods allocated to any two server groups should differ by at most one. It’s good practice to keep each server group equally populated. In the example above, it is recommended that Server Class 1 contains 9 pods, with 3 scheduled per server group. This keeps data and index replicas equally balanced across the server groups, and avoids the case where some replicas have to be scheduled in the same server group in order to maintain a balanced load across each server instance.
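The "at most one apart" property follows from simple integer division. As a rough sketch (the helper name is hypothetical, and which groups receive the extra pods is an assumption), the expected pod counts for a server class of a given size can be computed like this:

```python
def balanced_counts(size, num_groups):
    """Split `size` pods across `num_groups` groups as evenly as possible."""
    base, extra = divmod(size, num_groups)
    # The first `extra` groups get one more pod than the rest. Which
    # groups those are in practice is an assumption of this sketch.
    return [base + 1 if i < extra else base for i in range(num_groups)]


print(balanced_counts(9, 3))  # [3, 3, 3]
print(balanced_counts(7, 3))  # [3, 2, 2]
```

A size that is a multiple of the number of server groups, such as 9 pods over 3 groups, is the only case in which every group ends up equally populated.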

Configuring Kubernetes Node Topology

The Operator has no prior knowledge of where each Kubernetes node physically resides. While some cloud providers supply labels that describe physical location, not all environments do, and bare-metal installations certainly do not. The configuration is therefore left to the end user, which keeps the feature flexible and available in any environment.

Kubernetes nodes must be labelled by the end user in order to describe physical location. This can be done with the following example command:

kubectl label nodes ip-172-16-0-10 failure-domain.beta.kubernetes.io/zone=us-east-1a

The label value defined here will map directly to the Couchbase Server cluster configuration that is defined in the next section.

Some Kubernetes environments, such as kops, set these labels up for you. Consult the documentation for your environment, or examine your cluster with kubectl get nodes, to determine whether the labels are populated automatically.
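Since the label values must match the server group names used in the cluster configuration, a quick consistency check can be useful before deploying. The Python sketch below is one possible way to do that, assuming the node labels have already been collected into a dictionary (for example, from the output of kubectl get nodes). The helper name and the second node name are hypothetical.

```python
ZONE_LABEL = "failure-domain.beta.kubernetes.io/zone"


def groups_without_nodes(server_groups, node_labels):
    """Return the server groups that no Kubernetes node is labelled with.

    node_labels: dict mapping node name to that node's label dict.
    """
    zones = {labels.get(ZONE_LABEL) for labels in node_labels.values()}
    return [group for group in server_groups if group not in zones]


nodes = {
    "ip-172-16-0-10": {ZONE_LABEL: "us-east-1a"},
    "ip-172-16-0-11": {},  # hypothetical node with no zone label yet
}
print(groups_without_nodes(["us-east-1a", "us-east-1b"], nodes))  # ['us-east-1b']
```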

Configuring the Couchbase Cluster

When configuring the cluster, you must specify the set of server groups that the Operator is allowed to schedule pods across.

Because Kubernetes node configuration is decoupled from cluster configuration, you can label Kubernetes nodes as running in availability zones 1, 2, 3, and 4, then run one Couchbase Server cluster in server groups 1 and 2, and another in server groups 3 and 4. This provides a simple way of physically separating two distinct clusters.

At the top level, spec.serverGroups defines the global default. All server classes schedule their respective pods across this set.

This default can be overridden at the server class level with the spec.servers.serverGroups property. If specified, the Operator scheduler uses it in preference to the global default.

Pod scheduling is enabled if any of the server classes has serverGroups set, or if the global default is set. When scheduling is enabled, a server class that has neither an explicit set of server groups nor an inherited global default is considered an invalid configuration.

Server groups are immutable: they cannot be changed once defined, and they cannot be added to an existing cluster.
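The precedence and validity rules above can be summarised in a short Python sketch. It is an illustration only, not the Operator's validation code: the function name is hypothetical, and the specification is assumed to have been loaded into plain dictionaries and lists.

```python
def effective_server_groups(spec):
    """Map each server class name to the server groups it schedules across.

    Returns an empty dict when server group scheduling is not enabled.
    Raises ValueError when scheduling is enabled but a server class has
    neither its own serverGroups nor a global default to inherit.
    """
    default = spec.get("serverGroups") or []
    classes = spec.get("servers", [])
    # Scheduling is enabled if the default or any class-level list is set.
    enabled = bool(default) or any(c.get("serverGroups") for c in classes)
    if not enabled:
        return {}
    resolved = {}
    for c in classes:
        # A class-level list takes precedence over the global default.
        groups = c.get("serverGroups") or default
        if not groups:
            raise ValueError(f"server class {c['name']!r} has no server groups")
        resolved[c["name"]] = list(groups)
    return resolved


spec = {
    "serverGroups": ["ServerGroup1", "ServerGroup2", "ServerGroup3"],
    "servers": [
        {"name": "ServerClass1", "services": ["data"], "size": 7},
        {
            "name": "ServerClass2",
            "services": ["query"],
            "serverGroups": ["ServerGroup1", "ServerGroup2"],
            "size": 2,
        },
    ],
}
print(effective_server_groups(spec))
```

Here ServerClass1 inherits the three global server groups, while ServerClass2 is restricted to the two groups it lists itself.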

Referring back to the example illustrated earlier, you could define the specification as follows:

---
spec:
  serverGroups:
    - ServerGroup1
    - ServerGroup2
    - ServerGroup3
  servers:
    - name: ServerClass1
      services:
        - data
      size: 7
    - name: ServerClass2
      services:
        - query
      serverGroups:
        - ServerGroup1
        - ServerGroup2
      size: 2