About Using Couchbase Server Groups With the Operator
Couchbase server groups, which enable Server Group Awareness, are a way of logically partitioning a cluster to be fault tolerant across failure domains. The Operator is capable of automatically scheduling pod creation across failure domains and ensuring that they are added to the correct server groups.
By default, buckets are logically partitioned into vBuckets, which are distributed across all Couchbase Server nodes, providing automated load distribution. Bucket replicas are similarly distributed so that the failure of a node - or multiple nodes if more than one replica is requested - does not result in data loss, and a replica can assume control of the affected vBuckets.
Server groups allow you to logically group Couchbase Server nodes so that they can represent physical racks, data centers, or in the case of a cloud deployment, availability zones. This allows vBucket replicas to be scheduled in a completely separate server group, guaranteeing that the Couchbase Server cluster is tolerant of larger infrastructure failures, such as switch failure and availability zone outage.
Before discussing how to configure server groups, it’s best to understand how the Operator schedules pods and manages server groups.
The diagram depicts a cluster with two server classes that map directly to the server configurations defined in the spec.servers property of the specification. For example, Server Class 1 may be a multi-dimensional scaling group of nodes running the data service, while Server Class 2 runs the query service. Critically, each server class is scheduled independently; this simplifies the scheduling algorithm and ensures that the pods of Server Class 2 do not all end up in the same server group, which would otherwise expose the query service to an outage if that group failed.
When scheduling a new pod, the Operator polls the current system state and builds a map of server groups to pods, then schedules the pod into the server group that contains the fewest existing pods. Pod removal works the same way in reverse: the Operator builds the same map and deletes an existing pod from the server group that contains the greatest number of existing pods.
|The Operator pod scheduler attempts to keep the number of pods from each server class evenly distributed across all server groups, so the pod counts of any two server groups should differ by at most one. It is good practice to keep each server group equally populated by sizing each server class as a multiple of the number of server groups. In the example above, it is recommended that Server Class 1 contain 9 pods, with 3 scheduled per server group. This keeps data and index replicas equally balanced across the server groups, preventing replicas from being scheduled in the same server group and maintaining a balanced load across each server instance.|
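The balancing behaviour described above can be sketched as follows. This is an illustrative model only, not the Operator's actual implementation; the group names and function names are hypothetical.

```python
from collections import Counter

def schedule_pod(groups, placement):
    """Pick the server group with the fewest pods for a new pod (illustrative)."""
    counts = Counter({g: 0 for g in groups})
    counts.update(placement)
    return min(groups, key=lambda g: counts[g])

def remove_pod(groups, placement):
    """Pick the server group with the most pods when scaling down (illustrative)."""
    counts = Counter(placement)
    return max(groups, key=lambda g: counts[g])

groups = ["ServerGroup1", "ServerGroup2", "ServerGroup3"]

# Scheduling 7 data pods across 3 groups yields a 3/2/2 split;
# 9 pods would give an even 3/3/3 distribution.
placement = []
for _ in range(7):
    placement.append(schedule_pod(groups, placement))

print(Counter(placement))
```

With 7 pods the group counts differ by at most one, matching the scheduler's invariant; scaling down would remove a pod from the most populated group first.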
The Operator does not have any prior knowledge of where each Kubernetes node physically resides. While some cloud providers may supply labels describing physical location, not all environments will, and certainly not bare-metal installations. As such, this configuration is left to the end user, which keeps the feature generally available and flexible.
Kubernetes nodes must be labelled by the end user in order to describe physical location. This can be done with the following example command:
kubectl label nodes ip-172-16-0-10 failure-domain.beta.kubernetes.io/zone=us-east-1a
The label value defined here will map directly to the Couchbase Server cluster configuration that is defined in the next section.
Some Kubernetes environments, such as kops, will set these labels up for you. Consult the documentation for your environment, or examine your cluster with kubectl get nodes --show-labels.
When configuring the cluster, you must specify the set of server groups that the Operator is allowed to schedule pods across.
|By decoupling Kubernetes node configuration from cluster configuration, you are able to label Kubernetes nodes as running in availability zones 1, 2, 3, 4, then have a Couchbase Server cluster running in server groups 1 and 2, and another in 3 and 4. This facilitates a simple way of physically separating two distinct clusters.|
At the top level, the spec.serverGroups property defines a global default: every server class will schedule its pods across this set of server groups.
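For example, a global default covering three availability-zone-backed server groups might look like the following sketch. The group names are illustrative and must match the node label values applied earlier:

```yaml
spec:
  serverGroups:
  - us-east-1a
  - us-east-1b
  - us-east-1c
```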
This default can be overridden per server class via the spec.servers.serverGroups property; if specified, it takes precedence over the global default in the Operator scheduler.
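As a sketch of the override, a server class can pin its pods to a subset of the globally defined groups. The group and class names here are illustrative:

```yaml
spec:
  serverGroups:
  - us-east-1a
  - us-east-1b
  - us-east-1c
  servers:
  - name: query
    size: 2
    services:
    - query
    serverGroups:  # takes precedence over spec.serverGroups for this class only
    - us-east-1a
    - us-east-1b
```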
Pod scheduling across server groups is enabled if any of the server classes have server groups defined, either explicitly or via the global default.
|Server groups are immutable and cannot be changed once defined or added to an existing cluster.|
Referring back to the example illustrated earlier, you could define the specification as follows:
---
spec:
  serverGroups:
  - ServerGroup1
  - ServerGroup2
  - ServerGroup3
  servers:
  - name: ServerClass1
    services:
    - data
    size: 7
  - name: ServerClass2
    services:
    - query
    serverGroups:
    - ServerGroup1
    - ServerGroup2
    size: 2