Managing Connections Using the Python SDK with Couchbase Server

This section describes how to connect the Python SDK to a Couchbase cluster and bucket. It contains best practices as well as information on the connection string, SSL and other advanced connection options.

You can specify additional options when connecting to the cluster by using the connection string, which tells the client where cluster nodes may be found and how to connect to them. The same connection string format is used by the other Couchbase SDKs and by the command-line client. It uses a URI-like format similar to that used by other database systems.

Connecting to a Bucket

All Couchbase Server resources are protected by Role-Based Access Control (RBAC). Therefore, to connect to a bucket, you must pass appropriate credentials: a username and password that correspond to a user currently defined on Couchbase Server. This user definition is associated with one or more roles, each of which in turn corresponds to a set of privileges. When authentication occurs, Couchbase Server checks that the authenticated user's privileges permit access to the requested resource; if they do not, access is denied.

The following code demonstrates how authentication can be managed:

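from couchbase.cluster import Cluster, PasswordAuthenticator

# Point the client at a node in the cluster
cluster = Cluster('couchbase://localhost')
# The credentials must match a user defined on Couchbase Server
authenticator = PasswordAuthenticator('username', 'password')
cluster.authenticate(authenticator)
# Open the bucket; the user's roles must permit access to it
cb = cluster.open_bucket('bucket-name')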

For more information on authentication and RBAC, see Authentication.

Note that a bucket connection closes once it falls out of scope and no other objects reference it. Create only a single Bucket object per application (or thread) for each Couchbase bucket your application connects to.

Disconnecting from a Bucket

A Bucket object will disconnect from the cluster when it falls out of scope; that is, when it is no longer being referenced by anything in your Python application. You can still close a bucket manually via the synchronous API using the Bucket._close() method. This method should only be used as a last resort, as it may cause unexpected behavior if there are pending operations on the object.

Scalability and concurrency

Creating a new Bucket object is relatively expensive, and keeping many idle Bucket objects will negatively impact server performance (if done at a large scale).

If using an asynchronous framework (such as Gevent or Twisted), your application will require only one Bucket instance per Couchbase bucket. Likewise, if your Python application contains only a single thread, you need to establish only a single Bucket object per Couchbase bucket.

If using multiple Python threads, it may be possible to share a single Bucket object across multiple threads (using locking). The exact number of Bucket objects to create will depend on the activity pattern of the application. It is recommended to start development with a single object for all threads. If you find that performance suffers because threads are waiting for the Bucket object's lock, you may implement a form of pooling or sharing so that n Bucket objects are available.
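The pooling idea described above could be sketched roughly as follows. This is only an illustration and not part of the SDK: BucketPool and the factory argument are hypothetical names, and factory stands for whatever code opens a Bucket in your application.

```python
import queue
from contextlib import contextmanager


class BucketPool:
    """Hypothetical fixed-size pool; not part of the Couchbase SDK."""

    def __init__(self, factory, size):
        # factory is any zero-argument callable that returns a connected
        # object, for example: lambda: cluster.open_bucket('bucket-name')
        self._items = queue.Queue(maxsize=size)
        for _ in range(size):
            self._items.put(factory())

    @contextmanager
    def acquire(self, timeout=None):
        # Block until an object is free, and always hand it back afterwards
        item = self._items.get(timeout=timeout)
        try:
            yield item
        finally:
            self._items.put(item)
```

A thread would then wrap its operations in "with pool.acquire() as bucket:". Starting with a size of 1 gives the same behavior as a single shared, locked object, so the size can be raised only if lock contention is actually observed.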

If using multiple processes (such as with the multiprocessing module), or using a Python module which creates multiple processes, ensure that the Bucket object is not created in the parent process! Your Python interpreter may crash if the same Bucket object exists in more than a single process.

Connecting with SSL

Couchbase Server allows clients to communicate securely via SSL.

To use SSL, you need Couchbase Server Enterprise 3.0 or later (not available in the Community Edition).

  1. Obtain the SSL certificate used by the cluster.

  2. Make the certificate available to the file system of the client host.

  3. Employ the couchbases:// scheme for the connection string.

  4. Specify the local path to the certificate as the value for the certpath field.

To connect to a bucket on an SSL-enabled Cluster at the node 10.3.4.33, with the certificate saved as /var/cbcert.pem:

couchbases://10.3.4.33?certpath=/var/cbcert.pem
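The steps above can be sketched as a small helper. The function name ssl_connection_string is hypothetical; it checks only that the certificate file exists on the client host and then assembles the string from the couchbases:// scheme and the certpath field.

```python
import os


def ssl_connection_string(host, certpath):
    """Build a couchbases:// connection string (hypothetical helper)."""
    # Step 2: the certificate must be present on the client host
    if not os.path.isfile(certpath):
        raise FileNotFoundError("certificate not found: " + certpath)
    # Steps 3 and 4: SSL scheme plus the certpath field
    return "couchbases://{}?certpath={}".format(host, certpath)
```

The resulting string would be passed to the client in place of the plain couchbase:// string used in the earlier connection example.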

Specifying Multiple Hosts

You can specify multiple hosts in the connection string so that the client can still connect even if the cluster topology has changed. To specify multiple hosts, separate them with commas:

couchbase://host1.com,host2.com,host3.com

See Failure Considerations for the C (libcouchbase) SDK in Couchbase for more information about handling cluster topology changes.

You are not required to enumerate or pass all Couchbase cluster nodes to the client. The client only needs to know about a single node which is a member of the cluster. Once the client has connected to the node, it will query that node about the cluster topology, which in turn contains information about all Couchbase nodes and the services they contain.
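If the bootstrap hosts come from configuration rather than being hard-coded, the connection string can be assembled from a list. The helper name below is hypothetical and the sketch does nothing more than join the hosts with commas.

```python
def bootstrap_connection_string(hosts, scheme="couchbase"):
    """Join bootstrap hosts with commas (hypothetical helper)."""
    if not hosts:
        # The client needs at least one member node to bootstrap from
        raise ValueError("at least one bootstrap host is required")
    return "{}://{}".format(scheme, ",".join(hosts))


conn_str = bootstrap_connection_string(["host1.com", "host2.com", "host3.com"])
```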

Using DNS SRV records

As an alternative to specifying multiple hosts in your program, you can get the actual bootstrap node list from a DNS SRV record. The following steps are necessary to make it work:

  1. Set up your DNS server to respond properly to a DNS SRV request.

  2. Enable it on the SDK and point it towards the DNS SRV entry.

Your DNS server should be set up like this (one row for each bootstrap node):

_couchbase._tcp.example.com.  3600  IN  SRV  0  0  0  node1.example.com.
_couchbase._tcp.example.com.  3600  IN  SRV  0  0  0  node2.example.com.
_couchbase._tcp.example.com.  3600  IN  SRV  0  0  0  node3.example.com.

The ordering, priorities, ports and weighting are completely ignored and should not be set on the records to avoid ambiguities.

If you plan to use secure connections, use _couchbases instead:

_couchbases._tcp.example.com.  3600  IN  SRV  0  0  0  node1.example.com.
_couchbases._tcp.example.com.  3600  IN  SRV  0  0  0  node2.example.com.
_couchbases._tcp.example.com.  3600  IN  SRV  0  0  0  node3.example.com.

In the above example, you would specify couchbase://example.com as the bootstrap host, and the library would check for the record. If no such record exists, it will treat example.com as an ordinary bootstrap node and try to bootstrap from it. Note that if you pass more than one bootstrap host, DNS SRV lookup will not be attempted, and the hosts will be interpreted as normal Couchbase nodes.
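The rule described above (a lookup only when exactly one host is given, with the record name derived from the scheme) can be illustrated with a short sketch. The function names are hypothetical, the code only models the naming rule and performs no DNS query, and it assumes a connection string with a couchbase or couchbases scheme and no explicit ports.

```python
def parse_hosts(connection_string):
    # Split "scheme://host1,host2?options" into its scheme and host list
    scheme, _, rest = connection_string.partition("://")
    hostpart = rest.split("?", 1)[0].split("/", 1)[0]
    return scheme, [h for h in hostpart.split(",") if h]


def srv_record_name(connection_string):
    """Return the SRV name that would be checked, or None if lookup is skipped."""
    scheme, hosts = parse_hosts(connection_string)
    if len(hosts) != 1:
        # More than one host: they are treated as ordinary bootstrap nodes
        return None
    return "_{}._tcp.{}".format(scheme, hosts[0])
```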

Configuration Cache

In environments where many short-lived connections are made to Couchbase (for example, a small command-line utility or a fork-and-execute CGI application), the overhead of bootstrapping the client may be significant. This is because the client must retrieve the configuration from the cluster, which involves several additional TCP requests and, in many cases, an additional TCP connection.

You can bypass the initial network bootstrap phase by using the config_cache directive in the connection string. The config_cache option accepts a path to a local file (the file should not exist when it is used for the first time). During bootstrap, the client first checks whether the given file contains an existing cluster configuration and, if it does, uses the file as the bootstrap source. If the file does not contain a configuration, the client retrieves the configuration from the network and writes it to the file, so that future attempts will use the configuration file.

The config_cache feature is intended only for short-lived connections. During a cluster-side topology change the client will need to retrieve the configuration from the network as the file-based configuration will become invalid.
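The cache-first behavior described above can be modelled in a few lines. This is only an illustration of the logic and not the client's implementation: load_config and fetch_from_network are hypothetical names, and the JSON file format is an assumption made for the sketch.

```python
import json
import os


def load_config(cache_path, fetch_from_network):
    """Cache-first bootstrap, modelled loosely on the description above.

    fetch_from_network is a hypothetical zero-argument callable that returns
    a JSON-serializable configuration.
    """
    if os.path.isfile(cache_path) and os.path.getsize(cache_path) > 0:
        with open(cache_path) as f:
            return json.load(f)      # cache hit: no network round trips
    config = fetch_from_network()    # cache miss: bootstrap over the network
    with open(cache_path, "w") as f:
        json.dump(config, f)         # store it for later short-lived runs
    return config
```

The sketch does not model invalidation: after a topology change, a stale file would have to be replaced by a fresh network fetch, which is why the feature suits short-lived connections only.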

Additional Options

You can pass additional options in the connection string using the URL query format: couchbase://location-info?option1=value1&option2=value2&optionN=valueN. A list of options may be found in Client Settings.
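As a rough illustration of the query format, the following sketch appends options to a base connection string. The helper name with_options is hypothetical, values are inserted as given with no escaping, and the two options shown are the ones mentioned elsewhere on this page.

```python
def with_options(base, options):
    """Append options in URL query format (hypothetical helper)."""
    if not options:
        return base
    # Join name=value pairs with "&", in the order they were supplied
    query = "&".join("{}={}".format(k, v) for k, v in options.items())
    return base + "?" + query


conn_str = with_options(
    "couchbase://localhost",
    {"config_cache": "/tmp/cb.cache", "network": "default"},
)
```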

From Couchbase Python Client 2.5.1, AlternateAddress is implemented for connecting to nodes in a NATed environment, such as Docker containers using port mapping. It is on by default if the server provides a topology that includes a multi-network configuration. Whichever network is selected at bootstrap will be logged.

If using Docker Swarm, or otherwise running the SDK inside the NAT, you will want to disable it with ?network=default in the connection string, or by making the equivalent environmental setting.

Note that any SSL/TLS certificates must be set up at the point where the connections are being made. The Couchbase SDKs will honor any valid SSL/TLS certificates.