Nodes
A Couchbase-Server cluster consists of one or more nodes, each of which is a system running an instance of Couchbase Server.
Nodes and their Creation
A Couchbase Server node is a physical or virtual machine that hosts a single instance of Couchbase Server. The establishment of the server on the node entails four stages:
1. Installed. Couchbase Server has been fully installed on the node, but not yet started.
2. Started. Couchbase Server has been started. Set-up can now be performed, using Couchbase Web Console, the CLI, or the REST API.
3. Initialized. Optionally, up to four custom paths have been specified on the current node, respectively corresponding to the locations at which data for the Data, Index, Analytics, and Eventing Services are to be saved. Note that if this stage is skipped, and initialization therefore not explicitly performed, path-setting may occur during subsequent provisioning.
4. Provisioned. The username and password for the Full Administrator must have been specified. Additionally, services, service memory-quotas, and buckets may have been specified.
When any variant of stage 4 has been achieved, the node is considered to be provisioned, and thereby, to be a cluster of one server; and re-initialization is not permitted.
Note that running multiple instances of Couchbase Server on a single node is not supported.
Paths for Data, Indexes, Analytics, and Eventing
The full, default path for each supported platform is shown in the following table:
Platform | Default directory |
---|---|
Linux | /opt/couchbase/var/lib/couchbase/data |
Windows | C:\Program Files\couchbase\server\var\lib\couchbase\data |
Mac OS X | ~/Library/Application Support/Couchbase/var/lib/couchbase/data |
Note that once it has been specified, the data-file path location should not be used to store any data other than that allocated by Couchbase Server, since all such additional data will be deleted.
The data-file path cannot be spontaneously changed on a node that is active within a cluster: therefore, you should apply the correct, permanent data-file path at initialization, prior to provisioning. If a new path is required for a node that has already been provisioned, the node must be reinitialized; which means that all results of prior provisioning are erased.
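As a minimal sketch of applying custom paths at initialization, before provisioning, the following Python builds (but does not send) a REST request. The endpoint `/nodes/self/controller/settings`, the form fields `path`, `index_path`, `cbas_path`, and `eventing_path`, and the port 8091 are assumptions here, to be checked against the REST API reference for the version in use. The host name and directories are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed REST form-field names for the four per-service paths.
PATH_FIELDS = {
    "data": "path",
    "index": "index_path",
    "analytics": "cbas_path",
    "eventing": "eventing_path",
}


def build_path_request(host, paths, port=8091):
    """Build (without sending) a POST that sets custom paths on one node.

    `paths` maps service names ("data", "index", "analytics", "eventing")
    to directories; services left out are simply not mentioned in the request.
    """
    unknown = set(paths) - set(PATH_FIELDS)
    if unknown:
        raise ValueError(f"unknown service(s): {sorted(unknown)}")
    form = {PATH_FIELDS[service]: directory for service, directory in paths.items()}
    return Request(
        url=f"http://{host}:{port}/nodes/self/controller/settings",
        data=urlencode(form).encode("ascii"),
        method="POST",
    )


# Hypothetical host and directories, for illustration only.
request = build_path_request(
    "node1.example.com",
    {"data": "/mnt/cb/data", "index": "/mnt/cb/index"},
)
```

The request object could then be passed to `urllib.request.urlopen`. Building it separately from sending keeps the chosen paths easy to inspect before they are applied.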
Clusters
A Couchbase cluster consists of one or more systems, each running Couchbase Server. An existing cluster can be incremented with additional nodes.
Note that Couchbase has modified the license restrictions to its Community Edition package for Couchbase Server Version 7.0 and higher. In consequence, the size of an individual cluster running Community Edition is restricted to five nodes. See Couchbase Modifies License of Free Community Edition Package, for further information on the new restrictions; and see Node Management and Community Edition, for information on how this affects the experience of Community-Edition administrators.
Incrementing a cluster with additional nodes can be accomplished in either of the following ways:
-
The routine for adding a new node to the existing cluster is executed on the existing cluster.
An instance of Couchbase Server must have been installed on the available node, and must be at stage 2, 3, or 4: that is, must itself be started and uninitialized; or started and initialized; or started, initialized, and provisioned (which means, itself a cluster of one node). Adding the available node means that:
-
Any custom paths already established on the available node are kept unchanged on the available node. This allows each individual node within a cluster to maintain disk-space for data, index, analytics, and eventing in its own, node-specific location. If the node is being added by means of Couchbase Web Console, the paths can be modified further, or reverted to the defaults, as part of the addition process.
-
If the node is at stage 4, all the results of its prior provisioning are deleted. This includes services, memory-quotas, buckets, bucket-data, and Full Administrator username and password.
-
The services and memory-quotas that are currently the default for the cluster can be optionally assigned to the node that is being added. However, an error occurs if the node does not have sufficient memory. Services and memory-quotas for the node can be configured to be other than the default. Alternatively, the default itself can be changed, provided that it does not require more of a given resource than is available on every node currently in the cluster.
-
The routine for joining an existing cluster is executed on the new node.
The available node must be at stage 2 or 3: that is, it must have been started, and may have had its data, index, analytics, and eventing paths configured. However, it cannot have been provisioned in any way: if the routine for joining is executed on a provisioned node, an error is flagged, and the operation fails.
Note that services can nevertheless be assigned to the new node during the join operation itself. The memory quota for each service defaults to the setting for the existing cluster. An error occurs if the new node does not have sufficient memory.
If Couchbase Web Console is used to perform the join, the data, index, analytics, and eventing paths can be modified as part of the join operations.
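The stage requirements for the two routines can be summarized as a small model. This is only an illustrative sketch of the rules stated above: the names `Stage`, `can_be_added`, `can_join`, and `provisioning_erased_on_add` are hypothetical and are not part of any Couchbase API.

```python
from enum import IntEnum


class Stage(IntEnum):
    """The four stages of establishing Couchbase Server on a node."""
    INSTALLED = 1
    STARTED = 2
    INITIALIZED = 3
    PROVISIONED = 4


def can_be_added(stage):
    """Addition runs on the existing cluster; the node must be at stage 2, 3, or 4."""
    return stage >= Stage.STARTED


def can_join(stage):
    """Joining runs on the new node; the node must be at stage 2 or 3."""
    return stage in (Stage.STARTED, Stage.INITIALIZED)


def provisioning_erased_on_add(stage):
    """Adding a stage-4 node deletes all results of its prior provisioning."""
    return stage == Stage.PROVISIONED
```

For example, `can_join(Stage.PROVISIONED)` is false, which corresponds to the error flagged when the join routine is executed on a provisioned node.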
Once a cluster has been created, any of the IP addresses of the cluster-nodes can be used to access data and services. Therefore, provided that one node in the cluster is running the Data Service, the IP address of another node - one that is not running the Data Service - can be specified, in order to access the Data Service: the Cluster Manager ensures that all requests are appropriately routed across the cluster.
Restricting the Addition and Joining of Nodes
To ensure cluster-security, in Couchbase Server Version 7.1.1+, restrictions can be placed on addition and joining, based on the establishment of node-naming conventions. Only nodes whose names correspond to at least one of the stipulated conventions can be added or joined. For information, see Restrict Node-Addition.
Rebalance, Removal, Failover, and Recovery
Rebalance is a process of re-distributing data, indexes, event processing, and query processing among available nodes. This process should be run whenever a node is added to or removed from an existing cluster as part of a scheduled or pre-planned maintenance activity. It can also be run after a node has been taken out of the cluster by means of failover, described below. Rebalance takes place while the cluster is running and servicing requests: clients continue to read and write to existing structures, while the data is being moved between Data Service nodes. Once data-movement has completed, the updated distribution is communicated to all applications and other relevant consumers. See Rebalance, for more information.
Node removal allows a node to be taken out of a cluster in a highly controlled fashion, using rebalance to redistribute data, indexes, event processing, and query processing among remaining nodes. It is to be used only when all nodes in the cluster are responsive. It can be used on any node. See Removal, for more information.
Failover is the process by which a cluster-node can be removed, either proactively, to support required maintenance, or reactively, in the event of an outage. Two types of failover are supported: graceful (for Data Service nodes only) and hard (for nodes of any kind). Both types can be applied manually when needed. Hard failover can also be applied automatically, by means of prior configuration, in which case it is known as automatic failover. See Failover, for more information.
Recovery allows a previously failed-over node to be added back into its original cluster, by means of the rebalance operation. Full recovery involves removing all pre-existing data from, and assigning new data to, the node that is being recovered. Delta recovery maintains and resynchronizes a node’s pre-existing data. See Recovery, for more information.
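As a rough sketch of how a removal-plus-rebalance might be expressed through the REST API, the following Python assembles the form body for a rebalance request. The field names `knownNodes` and `ejectedNodes`, the `/controller/rebalance` endpoint they would be posted to, and the `ns_1@<address>` node-identifier format are assumptions to be verified against the REST API reference. The addresses are hypothetical.

```python
from urllib.parse import urlencode


def rebalance_form(known_nodes, ejected_nodes=()):
    """Return the form body for a rebalance request.

    `known_nodes` lists every node currently known to the cluster, including
    any that are to be removed; `ejected_nodes` lists the nodes to remove.
    """
    known = list(known_nodes)
    ejected = list(ejected_nodes)
    missing = [node for node in ejected if node not in known]
    if missing:
        raise ValueError(f"cannot eject unknown node(s): {missing}")
    return urlencode({
        "knownNodes": ",".join(known),
        "ejectedNodes": ",".join(ejected),
    })


# Hypothetical three-node cluster from which the third node is removed.
body = rebalance_form(
    ["ns_1@10.0.0.1", "ns_1@10.0.0.2", "ns_1@10.0.0.3"],
    ejected_nodes=["ns_1@10.0.0.3"],
)
```

Leaving `ejected_nodes` empty would describe a plain rebalance, such as one run after a node has been added.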
Node Quantity
For production purposes, clusters of fewer than three nodes are not recommended. For information, see About Deploying Clusters with Less than Three Nodes.
Naming Clusters and Nodes
Clusters and the individual nodes they contain must be named. Names can always be specified when a cluster is first created, and when nodes are added to it. In some cases, names can be modified subsequently. All associated conventions and constraints are described below.
Naming when Creating a Single-Node Cluster
When a cluster is first created, it is necessarily a single-node cluster. The new cluster requires two names:
-
A cluster name. Once defined, this provides a convenient, verbal reference, which will never be used in programmatic or networked access. The name can be of any length, can make use of any symbols (for example: %, $, !, #), and can include spaces. The name can be changed at any time during the life of the cluster, irrespective of the cluster’s configuration.
-
A node name. This will be used in programmatic and networked access: indeed, all the other nodes in the cluster will access this node by means of this name; which must be one of the following:
-
The IP address of the underlying host. This can be of either the IPv4 or IPv6 family.
-
A fully qualified hostname that corresponds, in the appropriate network maps, to the IP address of the underlying host.
-
The loopback address, 127.0.0.1. This is the default.
Whichever kind of node name is specified for the single-node cluster, if calls are made to the cluster by means of the Couchbase CLI or the REST API, those made from the underlying host can use the loopback address, the IP address of the underlying host, or the hostname of the underlying host, if one has been assigned. Calls made from other hosts on the network must use either the IP address or the hostname. In all cases, the appropriate port number must also be specified, following the name, separated by a colon.
-
Note that in Couchbase Enterprise Server 7.2 and later, when certificates are used for cluster authentication, each node certificate must be configured with the node-name correctly specified as a Subject Alternative Name (SAN). For information, see Node Certificate Validation.
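The rule that CLI and REST calls take the node name followed by a colon and a port number can be sketched as below. The helper name `endpoint` is hypothetical, and 8091 is assumed as the default administration port. Wrapping IPv6 addresses in square brackets follows the usual URL convention, and is an addition that is not stated above.

```python
import ipaddress


def endpoint(node_name, port=8091):
    """Return "name:port", bracketing the name if it is an IPv6 address."""
    try:
        is_ipv6 = ipaddress.ip_address(node_name).version == 6
    except ValueError:
        is_ipv6 = False  # not an IP literal, so treat it as a hostname
    host = f"[{node_name}]" if is_ipv6 else node_name
    return f"{host}:{port}"
```

Calls made from the underlying host could use `endpoint("127.0.0.1")`, whereas calls made from other hosts would pass the IP address or hostname of the node instead.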
Specifying the Cluster Name
The cluster name can be specified by means of:
-
Couchbase Web Console: either during the configuration of the single-node cluster, by means of the New Cluster dialog, as described in Create a Cluster; or subsequent to cluster-creation, by means of the General Settings screen.
-
The Couchbase CLI: either during configuration, by means of the command cluster-init; or subsequently, by means of the command setting-cluster.
-
The Couchbase REST API: either during configuration or subsequently. See Creating a New Cluster.
Specifying the Node Name
The node name can be specified for a single-node cluster by means of:
-
Couchbase Web Console: during configuration, by means of the Configure screen, as described in Create a Cluster. No subsequent, direct change to the node-name can be made by means of Couchbase Web Console: although the default loopback address can be indirectly changed, through node-addition; as described below.
-
The Couchbase CLI: during configuration or subsequently (provided that the cluster is still a single-node cluster), by means of the --node-init-hostname parameter to the command node-init. See Node-Renaming, below.
-
The Couchbase REST API: either during configuration or subsequently (provided that the cluster is still a single-node cluster). See both Creating a New Cluster and Node-Renaming, immediately below.
Node Renaming
Node-renaming is permitted only for single-node clusters. A node-name cannot be changed after the node has become a member of a multi-node cluster. Therefore, if it becomes necessary to change the name of such a node, the node must be removed from the cluster; and then re-added to the cluster, following its name-change.
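A minimal sketch of a rename through the REST API, guarded by the single-node rule, might look like the following. The endpoint `/node/controller/rename` and its `hostname` field are assumptions to be confirmed against the REST API reference. The function name, its `cluster_size` argument, and the host name are hypothetical, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_rename_request(current_address, new_name, cluster_size, port=8091):
    """Build (without sending) a rename request for a single-node cluster."""
    if cluster_size != 1:
        raise ValueError(
            "node-renaming is permitted only for single-node clusters; "
            "remove the node, rename it, then re-add it"
        )
    return Request(
        url=f"http://{current_address}:{port}/node/controller/rename",
        data=urlencode({"hostname": new_name}).encode("ascii"),
        method="POST",
    )


# Hypothetical rename of a node that still has the default loopback name.
request = build_rename_request("127.0.0.1", "node1.example.com", cluster_size=1)
```

The check on `cluster_size` is performed locally only to make the constraint explicit; the server is the authority on whether a rename is accepted.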
Node-Naming when Creating a Multi-Node Cluster
When an already provisioned node is to be added to an existing, single-node cluster, the new node must be referenced by means of either the IP address or the hostname of the underlying host. Once added, the new node is named in accordance with that reference. For information on node-addition by means of the UI, the CLI, and the REST API, see Add a Node and Rebalance.
When a new node, prior to its provisioning, is to be joined to the existing, single-node cluster, it must reference the single-node cluster by means of either the IP address or the hostname of the single-node cluster’s underlying host. The new node gets automatically named with the IP address of its own underlying host. For information on joining a cluster, see Join a Cluster and Rebalance.
When a new node is either added or joined to an existing, single-node cluster, and the original node was named with the default, loopback address, the original node is automatically renamed with the IP address of its underlying host. (Specifically, the original node opens a connection to the new node, determines the interface it is using for the source port, and adopts the name that corresponds to that interface.) This name-change persists even in the event that the addition of the second node, when initiated by means of Couchbase Web Console, is subsequently cancelled prior to the required, concluding rebalance.
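The idea of discovering a name from the interface used to reach a peer can be illustrated in a few lines of Python. This is not Couchbase Server's implementation: it substitutes a connected UDP socket, which transmits nothing, for the real connection, and the function name is hypothetical.

```python
import socket


def local_address_towards(peer_host, peer_port):
    """Return the local IP address the OS would use to reach the peer.

    Connecting a UDP socket sends no packets; it only asks the operating
    system to choose a route, and therefore a source interface.
    """
    family, _, _, _, sockaddr = socket.getaddrinfo(
        peer_host, peer_port, type=socket.SOCK_DGRAM
    )[0]
    with socket.socket(family, socket.SOCK_DGRAM) as sock:
        sock.connect(sockaddr)
        return sock.getsockname()[0]
```

Called with the address of a second node, a function of this kind returns the address of whichever local interface routes to that node, which is the kind of value the original node adopts in place of the loopback address.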
Node-Naming with Hostnames
In consequence of the procedures and constraints described above, should it be necessary to ensure that each node in a cluster is named with the hostname (rather than the IP address) of its underlying host:
-
The original node should be named with the hostname of its underlying host while still a single-node cluster: this being the only time that the hostname can be specified as its name.
-
Nodes should never be joined to the cluster: they should only be added; with the hostname of their underlying host being used as their reference.
Restarting Nodes
If a node is restarted, Couchbase Server continues to use the specified hostname. Note, however, that if the node is failed over or removed, Couchbase Server will no longer use the specified hostname: therefore, in such circumstances, the node must be reconfigured, and the hostname re-specified.
Node Certificates
As described in Certificates, Couchbase Server can be protected by means of X.509 certificates, which ensure that only approved users, applications, machines, and endpoints have access to system resources, and that clients can verify the identity of Couchbase Server.
Certificate deployment for a cluster requires that the chain certificate chain.pem and the private node key pkey.key be placed in an administrator-created inbox folder, for each cluster-node. It subsequently requires that the root certificate for the cluster be uploaded, and then activated by means of reloading, for each node. If an attempt is made to incorporate a new node into the certificate-protected cluster without the new node itself already having been certificate-protected in this way, the attempt fails.
Therefore, a new node should be appropriately certificate-protected, before any attempt is made to incorporate it into a certificate-protected cluster.
Note also that in Couchbase Enterprise Server Version 7.2+, the node-name must be correctly identified in the node certificate as a Subject Alternative Name. If such identification is not correctly configured, failure may occur when uploading the certificate, or when attempting to add or join the node to a cluster. For information, see Node Certificate Validation.
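A simple pre-flight check of this requirement could compare the node-name with the SAN entries of the node certificate. The sketch below assumes the entries have already been extracted from the certificate by other tooling, and it performs exact matching only. It does not reproduce the validation rules that Couchbase Server itself applies, and the function names are hypothetical.

```python
import ipaddress


def _normalize(name):
    """Return a comparable form: canonical IP text, or lower-cased hostname."""
    try:
        return str(ipaddress.ip_address(name))
    except ValueError:
        return name.lower().rstrip(".")


def node_name_in_sans(node_name, san_entries):
    """Report whether `node_name` exactly matches one SAN entry.

    `san_entries` is a list of plain strings (DNS names or IP addresses)
    taken from the certificate's Subject Alternative Name extension.
    """
    wanted = _normalize(node_name)
    return any(_normalize(entry) == wanted for entry in san_entries)
```

Running such a check before uploading the certificate, or before adding or joining the node, can surface a mismatch earlier than the failure described above.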
See Certificates for an overview of certificates in the context of Couchbase Server. For information on configuring server certificates, see Configure Server Certificates; and in particular, the section Adding New Nodes.
Node-to-Node Encryption
Couchbase Server supports node-to-node encryption, whereby network traffic between the individual nodes of a cluster is encrypted, in order to optimize cluster-internal security. For an overview, see Node-to-Node Encryption. For practical steps towards set-up, see Manage Node-to-Node Encryption.