Couchbase Autonomous Operator 2.1 is an incremental release that adds new features to the Operator 2.0 series. The changes in this release enable several advancements in how you can manage, monitor, and deploy Couchbase clusters.
Take a look at the release notes for a more complete list of changes in this release, including known and fixed issues.
The main new feature of this release is cluster autoscaling.
We decided early in the design that having the Operator itself control cluster scaling would be both restrictive and inflexible. Instead, the Operator creates custom resources, on demand, that implement the scale API.
The scale API is relatively simple. It defines where an external controller can look within a resource to determine the current size of a resource (in our case the cluster as a whole or a multi-dimensional scaling group), and where to modify the requested size. By using the scale API, Couchbase clusters are directly compatible with the Kubernetes horizontal pod autoscaler (HPA).
The Operator can now delegate scaling responsibilities to the HPA. The HPA allows high-level functionality such as allowing scaling to be driven by Prometheus metrics, and scaling decisions to be constrained, for example introducing delays to even out load spikes.
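As a sketch of how this fits together, an HPA can target the scale subresource of the autoscaler custom resource that the Operator creates. The target kind, resource name, and metric name below are illustrative assumptions for this sketch, not exact values from the release:

```yaml
# Illustrative HPA targeting an Operator-created autoscaler resource.
# The scaleTargetRef kind/name and the metric name are assumptions.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: query-autoscaler
spec:
  scaleTargetRef:
    apiVersion: couchbase.com/v2
    kind: CouchbaseAutoscaler   # custom resource implementing the scale API
    name: query.my-cluster      # assumed naming: <server-class>.<cluster>
  minReplicas: 2
  maxReplicas: 6
  metrics:
  - type: Pods
    pods:
      metric:
        name: cbquery_requests  # assumed Prometheus-exported metric name
      target:
        type: AverageValue
        averageValue: "1000"
```

Because the decision logic lives in the HPA rather than the Operator, any metrics source and stabilization behavior the HPA supports can drive Couchbase scaling.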
While this feature is generic and will work with any server class, it is restricted by default to server classes that contain only the stateless Query Service, and only when all buckets in the cluster are ephemeral. This restriction can be lifted with the spec.enablePreviewScaling configuration switch, allowing, for example, the Data Service to be scaled when using standard Couchbase buckets.
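For example, enabling preview scaling in the cluster specification might look like the following (the field placement is as named in the text; the surrounding fields are illustrative):

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: my-cluster
spec:
  # Lifts the default query-service/ephemeral-bucket-only restriction
  # on autoscaling; treat this as a preview feature.
  enablePreviewScaling: true
```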
Refer to Couchbase Cluster Autoscaling for more information.
See also: Autoscaling Couchbase Stateless Services (tutorial)
The Operator’s persistence and caching layer has been upgraded in version 2.1, enabling fault-tolerant operations such as password and TLS certificate rotation.
Cluster Administrator passwords can now be rotated by modifying the administrative authentication secret.
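Rotation is driven purely by updating the secret. A sketch of the administrative secret follows, assuming the conventional username/password keys (the secret name is an assumption, and values are base64 encoded):

```yaml
apiVersion: v1
kind: Secret
metadata:
  # Assumed name: whatever secret the cluster's admin credentials reference
  name: my-cluster-auth
type: Opaque
data:
  username: QWRtaW5pc3RyYXRvcg==   # "Administrator"
  password: bmV3LXBhc3N3b3Jk       # "new-password"; updating this triggers rotation
```

Applying the modified secret is sufficient; the Operator observes the change and rotates the credentials on the cluster.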
There are situations where a Couchbase cluster does not need to be running, for example in a development environment outside working hours. By allowing a Couchbase cluster to be hibernated, the Operator releases the compute resources consumed by cluster pods, which can lead to cost savings when used in conjunction with the Kubernetes cluster autoscaler.
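As a sketch, hibernation is toggled through the cluster specification. The field names below match the feature as described, but treat them as assumptions and consult the documentation for exact spellings:

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: dev-cluster
spec:
  hibernate: true                  # assumed field: pause the cluster, freeing pod compute
  hibernationStrategy: Immediate   # assumed field: tear down pods without rebalancing first
```

Setting the flag back to false would wake the cluster, with persistent volumes preserving data across the hibernation.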
When performing upgrades, the Operator takes the safe approach — upgrading one pod at a time. The major drawback to this approach is that it takes a long time, scaling linearly with the cluster size.
Operator version 2.1 introduces upgrade policies that control how upgrades proceed. The default policy is unchanged, but you may instead configure all pods to be upgraded at once. This substantially reduces the time required for an upgrade, at the expense of a more resource-intensive operation.
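A minimal sketch of selecting the all-at-once policy; the field and value names here are assumptions for illustration, so verify them against the upgrade documentation:

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: my-cluster
spec:
  # Assumed field/value names: the release adds an upgrade policy with a
  # one-pod-at-a-time default and a faster, more resource-hungry
  # all-at-once option.
  upgradeStrategy: ImmediateUpgrade
```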
The Couchbase cluster and Couchbase backup resources have been extended to allow use of new cloud backup features released in Couchbase Server 6.6.
The global backup configuration for a cluster now accepts a secret containing Amazon AWS credentials that permit access to S3 object storage. Each backup resource may also be configured with an S3 bucket. By using an S3 bucket, backups are stored to the cloud instead of a Kubernetes persistent volume. This provides true off-site backups, decoupled from the underlying Kubernetes cluster.
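A sketch of wiring this together, assuming an AWS credentials secret and S3 fields as described (exact field names may differ from this illustration):

```yaml
# Cluster-level backup configuration referencing AWS credentials (sketch).
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: my-cluster
spec:
  backup:
    managed: true
    s3Secret: aws-credentials   # assumed: secret holding AWS access/secret keys
---
# Per-backup resource directing archives to S3 instead of a persistent volume.
apiVersion: couchbase.com/v2
kind: CouchbaseBackup
metadata:
  name: nightly-backup
spec:
  strategy: full_incremental
  s3bucket: s3://my-backup-bucket   # assumed field naming the target bucket
  full:
    schedule: "0 3 * * 0"           # weekly full backup
  incremental:
    schedule: "0 3 * * 1-6"         # daily incrementals
```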
Couchbase Server 6.5 introduced the ability to use Kubernetes ingresses to expose the administrative UI outside of the Kubernetes cluster.
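As an illustrative sketch (the backend service name is an assumption about what the Operator creates), an ingress exposing the administrative UI might look like:

```yaml
apiVersion: networking.k8s.io/v1beta1   # v1beta1 is available from Kubernetes 1.14
kind: Ingress
metadata:
  name: couchbase-ui
spec:
  rules:
  - host: couchbase.example.com
    http:
      paths:
      - path: /
        backend:
          serviceName: my-cluster-ui   # assumed: Operator-created console service
          servicePort: 8091            # standard Couchbase admin port
```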
Full support for the Istio service mesh is now provided, and may be enabled in the Couchbase cluster networking configuration. Both basic and mTLS modes of operation are supported.
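A sketch of enabling this in the networking configuration; the value is assumed here, so confirm it against the networking reference:

```yaml
apiVersion: couchbase.com/v2
kind: CouchbaseCluster
metadata:
  name: my-cluster
spec:
  networking:
    networkPlatform: Istio   # assumed value: makes the Operator Istio-aware
```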
Please note that service meshes (including Istio) must already be enabled in the namespace before you install the Autonomous Operator and provision any Couchbase clusters. Therefore, to use Istio with an existing Autonomous Operator deployment, you will need to recreate the deployment in a namespace that already has Istio enabled, and then migrate any cluster data to the new deployment using a method such as XDCR.
Upgrading a previous release to Operator 2.1 follows the usual process; however, there are some things you should be aware of.
The minimum Kubernetes version supported by the Operator is 1.14. Ensure your Kubernetes cluster is running this version or newer before upgrading.
The TLS requirements have been modified. To ease the migration from the legacy client bootstrap protocol (CCCP) to the newest version (GCCCP), the Operator requires Couchbase cluster certificate subject alternative names (SANs) to be updated. Consult the TLS documentation for the full list of required SANs, and the TLS rotation how-to, in order to prepare for the upgrade. Failure to perform this step will result in errors from the dynamic admission controller (DAC) once upgraded.
When upgrading Couchbase clusters from an earlier version of the Operator (prior to version 2.1), the cluster will undergo a mandatory upgrade cycle.
Pod readiness was previously driven by an exec-based readiness probe. This was a security concern, as it granted the Operator pods/exec privileges, which may be unacceptable in highly regulated environments. As of Operator version 2.1, readiness is determined using readiness gates, which use the Kubernetes API exclusively.
You may use the Bulk Upgrades feature to speed up this upgrade cycle. To enable it during the Operator upgrade, stop the old Operator, replace the CRDs, edit the Couchbase cluster resource to enable bulk upgrades, and then restart the new Operator.