CRUD Document Operations Using the Python SDK with Couchbase Server

You can access documents in Couchbase using methods of the couchbase.bucket.Bucket object. The methods for retrieving documents are get(), lookup_in() and retrieve_in(), and the methods for mutating documents are upsert(), insert(), replace() and mutate_in().

Examples are shown using the synchronous API. Semantics of the synchronous API are easily translatable to the Twisted and Gevent APIs.

Document input and output types

The Python SDK requires that document IDs be convertible to unicode objects. By default, document values must be of a type that json.dumps can serialize. See Formats and Non-JSON Documents below for more details.
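A quick way to check whether a value will be accepted under the default JSON format is to try json.dumps on it first. The following is a minimal sketch that uses only the standard library. The helper name is_json_serializable is hypothetical and is not part of the SDK.

```python
import json


def is_json_serializable(value):
    """Return True if json.dumps accepts the value (hypothetical helper)."""
    try:
        json.dumps(value)
        return True
    except (TypeError, ValueError):
        return False


# Dictionaries, lists, strings and numbers are accepted.
print(is_json_serializable({'some': 'value', 'n': [1, 2, 3]}))
# Sets have no JSON representation and are rejected.
print(is_json_serializable({1, 2, 3}))
```

A value that fails this check would need one of the non-JSON formats discussed in Formats and Non-JSON Documents.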

Return value for CRUD operations

All Python SDK operations return a Result object (or a subclass thereof). The result object contains general operation information and item metadata retrieved from the server. Typically a subclass of Result will be returned which also contains operation-specific result information, such as the value field for a get() operation. The most common fields in a Result object are:

  • value: For retrieval-type operations, this field contains the value of the requested key.

  • rc: Contains the raw error code received from libcouchbase. If this number is zero then the operation was successful; otherwise it will be an error. The CouchbaseError.rc_to_exctype() class method can be used to return the exception class which would have been thrown.

  • success: This is a convenience property which is equivalent to rc == 0.

  • cas: An opaque object representing the resulting CAS value of the key that was operated on. This value is not meant to be user facing, but should be passed directly to other operations for locking purposes.

Note that for most of the asynchronous APIs, these objects are not returned per se, but rather passed back to callbacks. For example, with Twisted, a Deferred object is returned, and the appropriate Result or Failure object is passed into the callback or errback, respectively.

Additional options

Additional options may be specified using keyword arguments.

Update operations also accept a TTL (expiry) value (ttl) which will instruct the server to delete the document after a given amount of time. This option is useful for transient data (such as sessions). By default documents do not expire. See Expiration Overview for more information on expiration.

Update operations can also accept a CAS (cas) value to protect against concurrent updates to the same document. See CAS for a description of how to use CAS values in your application.

Creating and updating full documents

Documents may be created and updated using the Bucket.upsert(), Bucket.insert(), and Bucket.replace() family of methods. Read more about the difference between these methods at Primitive Key-Value Operations in the Couchbase developer guide.

These methods accept two mandatory arguments:

  • key: The ID of the document to modify. This should be a Python string or unicode object.

  • value: The desired new value of the document. This may be anything that can be serialized as JSON (other input types can also be specified, see Document input and output types).

Additional options can be specified to the operation:

  • cas: The CAS value for the document. If the CAS on the server does not match the CAS supplied to the method, the operation will fail with a couchbase.exceptions.KeyExistsError error. See Concurrent Document Mutations for more information on the usage of CAS values.

  • ttl: Specify the expiry time for the document. If specified, the document will expire and no longer exist after the given number of seconds. See Expiration Overview for more information.

  • format: Specify the format of the new value. This indicates how the value should be serialized before being sent to the cluster. By default only JSON-serializable objects may be supplied as values. See Document input and output types.

  • persist_to, replicate_to: Specify durability requirements for the operations. A value of -1 indicates that the specific requirement will be set to the maximum possible.

Upon success, the returned Result object will contain the new CAS value of the document. If the document was not mutated successfully, an exception is raised. See Handling Exceptions and Other Errors with the Python SDK in Couchbase for more information on exception types and how to handle them.

rv = bucket.insert('document_name', {'some': 'value'})
print rv

Output:

OperationResult<RC=0x0, Key=u'document_name', CAS=0x707339a4125aaa13>

If the document being inserted already exists, the client will raise a couchbase.exceptions.KeyExistsError. If your application simply wants to set the value ignoring whether it exists or not, use the upsert() method.
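The cas option lends itself to an optimistic read-modify-write loop: read the document, compute the new value, and call replace() with the CAS from the read, retrying if a KeyExistsError signals that another writer got there first. The sketch below is a minimal illustration of that loop. So that it runs without a cluster, it uses an in-memory stand-in: FakeBucket, FakeResult, and the local KeyExistsError class are hypothetical and only mimic the get()/replace() behavior described above. The update_with_cas helper is likewise illustrative and not part of the SDK.

```python
import itertools


class KeyExistsError(Exception):
    """Local stand-in for couchbase.exceptions.KeyExistsError."""


class FakeResult(object):
    """Stand-in result exposing only the value and cas fields."""
    def __init__(self, value, cas):
        self.value = value
        self.cas = cas


class FakeBucket(object):
    """In-memory stand-in that mimics upsert(), get() and replace() with CAS."""
    def __init__(self):
        self._docs = {}
        self._cas_counter = itertools.count(1)

    def upsert(self, key, value):
        cas = next(self._cas_counter)  # every mutation yields a new CAS
        self._docs[key] = (value, cas)
        return FakeResult(None, cas)

    def get(self, key):
        value, cas = self._docs[key]
        return FakeResult(value, cas)

    def replace(self, key, value, cas=0):
        _, current_cas = self._docs[key]
        if cas and cas != current_cas:
            raise KeyExistsError(key)  # CAS mismatch: document changed
        return self.upsert(key, value)


def update_with_cas(bucket, key, mutate, max_attempts=5):
    """Read, modify and write back a document, retrying on CAS mismatch."""
    for _ in range(max_attempts):
        rv = bucket.get(key)
        new_value = mutate(rv.value)
        try:
            return bucket.replace(key, new_value, cas=rv.cas)
        except KeyExistsError:
            continue  # another writer won; re-read and try again
    raise RuntimeError('gave up after %d attempts' % max_attempts)


bucket = FakeBucket()
bucket.upsert('counter_doc', {'visits': 1})
update_with_cas(bucket, 'counter_doc',
                lambda doc: dict(doc, visits=doc['visits'] + 1))
print(bucket.get('counter_doc').value)
```

With a real Bucket, the same helper would presumably work unchanged once KeyExistsError is imported from couchbase.exceptions instead of being defined locally.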

Retrieving full documents

Documents may be retrieved using the Bucket.get() method. The get() method has a single mandatory argument:

  • key: The document ID to retrieve

Other options include:

  • ttl: Set the expiration for the document. This operation is known as a get-and-touch operation. See Expiration Overview for more information.

  • replica: This may be passed as a boolean to issue a replica read. This may be used if access to the master/primary node is temporarily unavailable.

  • quiet: Suppress throwing exceptions if the document does not exist. Rather than throwing an exception, status can be obtained from the Result.success property.

    rv = bkt.get('maybe', quiet=True)
    if rv.success:
        handle_value(rv)
    else:
        print "Item not found"

Upon success, a ValueResult object is returned. The actual document may be accessed by using the ValueResult.value property. Additional properties may also be accessed from the returned object. See Return value for CRUD operations.

The ValueResult.value will contain a native Python object, deserialized from JSON (or another format, per Document input and output types).

If the document does not exist (and quiet=True was not specified), a couchbase.exceptions.NotFoundError will be raised.

rv = bucket.get('document_name')
print "Result object is:", rv
print "Actual value is:", rv.value

Sample output:

Result object is: ValueResult<RC=0x0, Key=u'document_name', Value={u'some': u'value'},
   CAS=0x20504a5e6a5aaa13, Flags=0x2000000>
Actual value is: {u'some': u'value'}

If the item does not exist, the client will raise a couchbase.exceptions.NotFoundError, which you can catch:

from couchbase.exceptions import NotFoundError
try:
    rv = bkt.get('NOTEXISTENT')
except NotFoundError as e:
    print "Item not found", e

Removing full documents

Documents may be removed using the Bucket.remove() method. This method takes a single mandatory argument:

  • key: The ID of the document to remove

Some additional options:

  • quiet: Do not raise an exception when attempting to remove a document which does not exist.

  • cas: Only remove the document if the CAS has not changed.

Modifying expiration

Document expiration can be modified using the Bucket.touch() method.

cb.touch('document_id', ttl=5)

You can also set the ttl parameter for methods which support it:

import time

cb.upsert('expires', "i'm getting old...", ttl=5)
print cb.get('expires').value
time.sleep(6)
print cb.get('expires').value

Output:

i'm getting old...
Traceback (most recent call last):
  File "exp.py", line 10, in <module>
    print cb.get('expires').value
  File "/usr/local/lib/python2.7/site-packages/couchbase/bucket.py", line 489, in get
    replica=replica, no_format=no_format)
couchbase.exceptions._NotFoundError_0xD (generated, catch NotFoundError): <Key=u'expires', RC=0xD[The key does not exist on the server], Operational Error, Results=1, C Source=(src/multiresult.c,309)>

Atomic document modifications

Additional atomic document modifications can be performed using the Python SDK. You can modify a counter document using the Bucket.counter() method. You can also use the Bucket.append() and Bucket.prepend() methods to perform raw byte concatenation.

Batching Operations

Many operations can be batched in the Python SDK using their *_multi equivalent. For example, to batch multiple Bucket.get() calls, you would use Bucket.get_multi().

The various *_multi operations all return a MultiResult object which acts like a dictionary: it maps each individual key to the result of the operation performed on it.

cb.upsert_multi({
    'foo': 'fooval',
    'bar': 'barval',
    'baz': 'bazval'})

for key, result in cb.get_multi(('foo', 'bar', 'baz')).items():
    print '{0}: {1.value}'.format(key, result)

Output:

baz: bazval
foo: fooval
bar: barval

You can use the Item API to pass additional per-operation options to multi methods.

Operating with sub-documents

The Sub-Document API is available starting with Couchbase Server version 4.5. See Sub-Document Operations for an overview.

Sub-document operations save network bandwidth by allowing you to specify paths of a document to be retrieved or updated. The document is parsed on the server and only the relevant sections (indicated by paths) are transferred between client and server. You can execute sub-document operations in the Python SDK using the lookup_in, mutate_in, and retrieve_in methods.
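To make the idea of a path concrete, the sketch below resolves a dotted path against a nested dictionary on the client side. It is only an illustration of what a path such as address.city refers to. In the real feature the path is evaluated on the server, and the hypothetical lookup_path helper omits array indexing and any other path syntax.

```python
def lookup_path(document, path):
    """Resolve a dotted path such as 'a.b.c' against nested dictionaries."""
    current = document
    for component in path.split('.'):
        current = current[component]  # raises KeyError if the path is missing
    return current


doc = {
    'name': 'alice',
    'address': {'city': 'Paris', 'geo': {'lat': 48.85, 'lon': 2.35}},
}

# Only the addressed fragment is returned, not the whole document.
print(lookup_path(doc, 'address.city'))
print(lookup_path(doc, 'address.geo'))
```

The bandwidth saving comes from the server returning just that fragment instead of the full document.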

Each of these methods accepts a key as its mandatory first argument, followed by one or more command specifications, each specifying an operation and a document field operand. You may find all the operations in the couchbase.subdocument module.

import couchbase.subdocument as SD
res = cb.lookup_in('docid', SD.get('path.to.get'), SD.exists('check.path.exists'))
res = cb.mutate_in('docid', SD.upsert('path.to.upsert', value, create_parents=True), SD.remove('path.to.del'))

For simply retrieving a list of paths, you may use the retrieve_in convenience method:

res = cb.retrieve_in('docid', 'path1', 'path2', 'path3')

All sub-document operations return a special SubdocResult object which is a subclass of Result. In contrast with a normal Result object, a SubdocResult object contains multiple results with multiple statuses, one result/status pair for every input operation. You can access an individual result/status pair by addressing the SubdocResult object as a mapping, and then using either the index position or the path of the operation as the key:

res = cb.lookup_in('docid', SD.get('foo'), SD.exists('bar'), SD.exists('baz'))
# First result
res['foo']
# or
res[0]

Using the [] (getitem) functionality will raise an exception if the individual operation did not complete successfully. You can also use SubdocResult.get() to return a tuple of (errcode, value).

Formats and Non-JSON Documents

See Non-JSON Documents for a general overview of using non-JSON documents with Couchbase.

All Python objects which can be represented as JSON may be passed unmodified to a storage function, and be received via the get method without any additional modifications. You can modify the default JSON encoders used by the Python SDK using the couchbase.set_json_converters function. This function accepts a pair of encode and decode functions which are expected to behave similarly to json.dumps and json.loads respectively.
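As a minimal sketch of such a pair, the functions below wrap json.dumps and json.loads so that dates are stored as ISO 8601 strings. The names encode, decode and _default, and the choice of ISO 8601, are assumptions made for illustration. The registration call is left commented out so the snippet runs without the SDK installed.

```python
import datetime
import json


def _default(obj):
    # Assumption for illustration: store dates and datetimes as ISO 8601 text.
    if isinstance(obj, (datetime.datetime, datetime.date)):
        return obj.isoformat()
    raise TypeError('%r is not JSON serializable' % (obj,))


def encode(value):
    """Behaves like json.dumps, but also accepts date and datetime objects."""
    return json.dumps(value, default=_default, sort_keys=True)


def decode(text):
    """Behaves like json.loads; dates come back as plain strings."""
    return json.loads(text)


# With the SDK installed, the pair would be registered like this:
# import couchbase
# couchbase.set_json_converters(encode, decode)

encoded = encode({'created': datetime.date(2016, 5, 1), 'n': 1})
print(encoded)
print(decode(encoded))
```

Note that this decoder does not turn the strings back into date objects, so the round trip is lossy unless the application parses them itself.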

Storage operations accept a format keyword argument, which may be one of:

  • couchbase.FMT_JSON: serialize the object as JSON.

  • couchbase.FMT_UTF8: serialize the object as a UTF-8 encoded string.

  • couchbase.FMT_BYTES: serialize the object as a raw set of bytes; note the Python object in question must be of type bytes.

  • couchbase.FMT_PICKLE: serialize the object using Python’s native pickle module.

You may also define new formats and utilize them via a custom transcoder.
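The following standard-library sketch shows roughly what each built-in format implies for the bytes that end up being stored. It is not the SDK's transcoder: the serialize function and the plain string format names ('json', 'utf8', 'bytes', 'pickle') are hypothetical stand-ins for the FMT_* constants, and item flags are ignored entirely.

```python
import json
import pickle


def serialize(value, fmt):
    """Approximate the stored bytes for each format (illustration only)."""
    if fmt == 'json':
        return json.dumps(value).encode('utf-8')
    if fmt == 'utf8':
        return value.encode('utf-8')
    if fmt == 'bytes':
        if not isinstance(value, bytes):
            raise TypeError('bytes format requires a bytes object')
        return value
    if fmt == 'pickle':
        return pickle.dumps(value)
    raise ValueError('unknown format: %r' % (fmt,))


print(serialize({'a': 1}, 'json'))
print(serialize(u'caf\u00e9', 'utf8'))
print(serialize(b'\x00\x01', 'bytes'))
# Pickle can carry Python-specific types such as tuples.
print(pickle.loads(serialize((1, 2), 'pickle')))
```

Pickled values are only readable by Python clients, which is worth weighing before choosing that format for data shared with other SDKs.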

You can implement a custom transcoder if none of the pre-configured options are suitable for your application. A custom transcoder converts inputs to their serialized forms, and deserializes encoded data based on the item flags. The transcoder interface is described in the API documentation (http://pythonhosted.org/couchbase/api/transcoder.html), and an example (http://pythonhosted.org/couchbase/api/transcoder.html) is also provided in the source repository. When implementing a transcoder