Document

Couchbase supports CRUD operations, various data structures, and binary documents.

Although query and path-based (Sub-Document) services are available, the simple document-based key-value (KV) interface is the fastest way to perform operations on single documents.

Document

A document refers to an entry in the database (other databases may refer to the same concept as a row). A document has an ID (primary key in other databases), which is unique to the document and by which it can be located. The document also has a value which contains the actual application data.

Document IDs are assigned by the application. A valid document ID must:

  • Conform to UTF-8 encoding

  • Be no longer than 250 bytes

    There is a difference between bytes and characters: most non-Latin characters occupy more than a single byte.

You are free to choose any ID for your document, so long as it conforms to the above restrictions. Unlike some other databases, Couchbase does not automatically generate IDs for you, though you may use a separate counter to increment a serial number.
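The ID rules above, and the idea of deriving IDs from a counter, can be sketched in plain Python. This is a minimal illustration rather than SDK code: `is_valid_doc_id` and `make_id_generator` are hypothetical helper names, the `prefix::n` ID format is an arbitrary choice, and the local `itertools.count` only stands in for a counter that a real application would keep in the database.

```python
import itertools

MAX_DOC_ID_BYTES = 250  # limit stated above, measured in bytes, not characters


def is_valid_doc_id(doc_id: str) -> bool:
    """Return True if doc_id encodes to UTF-8 and fits within 250 bytes."""
    try:
        encoded = doc_id.encode("utf-8")
    except UnicodeEncodeError:
        # For example a lone surrogate, which has no UTF-8 representation.
        return False
    return len(encoded) <= MAX_DOC_ID_BYTES


def make_id_generator(prefix: str, start: int = 1):
    """Hypothetical helper: yields IDs such as 'product::1', 'product::2'."""
    counter = itertools.count(start)  # local stand-in for a shared counter
    return (f"{prefix}::{n}" for n in counter)


ids = make_id_generator("product")
first_id = next(ids)
second_id = next(ids)
```

Note that the length check is applied to the encoded bytes: a string of 126 two-byte characters is only 126 characters long but occupies 252 bytes, so it fails the check.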

The document value contains the actual application data; for example, a product document may contain information about the price and description. Documents are usually (but not always) stored as JSON on the server. Because JSON is a structured format, it can be subsequently searched and queried.

{
    "type": "product",
    "sku": "CBSRV45DP",
    "msrp": [5.49, "USD"],
    "ctime": "092011",
    "mfg": "couchbase",
    "tags": ["server", "database", "couchbase", "nosql", "fast", "json", "awesome"]
}

Primitive Key-Value Operations

upsert(docid, document)
insert(docid, document)
replace(docid, document)
get(docid)
remove(docid)

In Couchbase, documents are stored using one of three operations: upsert, insert, and replace. Each of these operations writes a JSON document with a given document ID (key) to the database. They differ in how they treat the existing state of the document:

  • insert will only create the document if the given ID is not found within the database.

  • replace will only replace the document if the given ID already exists within the database.

  • upsert will always write the document, creating it if the ID does not exist and replacing it if it does.

Documents can be retrieved using the get operation, and removed using the remove operation.

Since Couchbase’s KV store may be thought of as a distributed hashmap or dictionary, the following pseudo-code illustrates the behavior of Couchbase’s primitive operations:

map<string,object> KV_STORE;

void insert(string doc_id, object value) {
    if (!KV_STORE.contains(doc_id)) {
        KV_STORE.put(doc_id, value);
    } else {
        throw DocumentAlreadyExists();
    }
}

void replace(string doc_id, object value) {
    if (KV_STORE.contains(doc_id)) {
        KV_STORE.put(doc_id, value);
    } else {
        throw DocumentNotFound();
    }
}

void upsert(string doc_id, object value) {
    KV_STORE.put(doc_id, value);
}

object get(string doc_id) {
    if (KV_STORE.contains(doc_id)) {
        return KV_STORE.get(doc_id);
    } else {
        throw DocumentNotFound();
    }
}
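For readers who prefer something runnable, the same semantics can be modeled with a Python dictionary. This is only an in-memory sketch: `InMemoryKV` and the two exception classes are hypothetical names, not SDK types, and the choice to make `remove` fail on a missing ID is an assumption, since the pseudo-code above does not cover it.

```python
class DocumentExistsError(Exception):
    """Raised when insert() targets an ID that is already present."""


class DocumentNotFoundError(Exception):
    """Raised when the requested ID is absent."""


class InMemoryKV:
    """Dictionary-backed model of the primitive operations (not a real client)."""

    def __init__(self):
        self._store = {}

    def insert(self, doc_id, value):
        # Create only: fail if the ID already exists.
        if doc_id in self._store:
            raise DocumentExistsError(doc_id)
        self._store[doc_id] = value

    def replace(self, doc_id, value):
        # Overwrite only: fail if the ID does not exist yet.
        if doc_id not in self._store:
            raise DocumentNotFoundError(doc_id)
        self._store[doc_id] = value

    def upsert(self, doc_id, value):
        # Always write, whether or not the ID exists.
        self._store[doc_id] = value

    def get(self, doc_id):
        if doc_id not in self._store:
            raise DocumentNotFoundError(doc_id)
        return self._store[doc_id]

    def remove(self, doc_id):
        # Assumption: removing a missing ID is reported as an error.
        if doc_id not in self._store:
            raise DocumentNotFoundError(doc_id)
        del self._store[doc_id]


kv = InMemoryKV()
kv.insert("product::1", {"sku": "CBSRV45DP"})
kv.replace("product::1", {"sku": "CBSRV45DP", "mfg": "couchbase"})
kv.upsert("product::2", {"sku": "EXAMPLE"})
```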

You can also use N1QL queries and Full Text Search to access documents by means other than their IDs. However, Couchbase eventually translates these query operations into primitive key-value operations, and the query and search services exist as separate services outside the data store.

If you wish to only modify certain parts of a document, you can use sub-document operations which operate on specific subsets of documents:

collection.mutate_in("customer123", [SD.upsert("fax", "311-555-0151")])

or N1QL UPDATE to update documents based on specific query criteria:

UPDATE `default` SET sale_price = msrp * 0.75 WHERE msrp < 19.95;
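To make the two styles of partial update concrete, the sketch below mimics them on ordinary Python dictionaries. Nothing here calls Couchbase: `subdoc_upsert` and `apply_sale_price` are hypothetical helpers, the dotted-path syntax is simplified to plain object keys, and the sample documents are invented for illustration.

```python
def subdoc_upsert(document: dict, path: str, value) -> None:
    """Set the field at a dotted path, leaving the rest of the document alone."""
    *parents, leaf = path.split(".")
    node = document
    for key in parents:
        node = node[key]  # KeyError if an intermediate object is missing
    node[leaf] = value


def apply_sale_price(documents: dict) -> int:
    """Mimic: UPDATE ... SET sale_price = msrp * 0.75 WHERE msrp < 19.95."""
    updated = 0
    for doc in documents.values():
        msrp = doc.get("msrp")
        if isinstance(msrp, (int, float)) and msrp < 19.95:
            doc["sale_price"] = msrp * 0.75
            updated += 1
    return updated


# Path-based update: only the "fax" field of one document changes.
customer = {"email": "customer123@example.com"}
subdoc_upsert(customer, "fax", "311-555-0151")

# Criteria-based update: every document matching the predicate changes.
products = {"p1": {"msrp": 10.0}, "p2": {"msrp": 20.0}}
changed = apply_sale_price(products)
```

The first style addresses one known document by ID and path; the second selects documents by their content, which is why it needs the query service rather than the key-value API.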

Retrieving Documents

FASTPATH: This section discusses retrieving documents using their IDs, or primary keys. Documents can also be accessed using secondary lookups via N1QL queries and view queries. Primary key lookups are performed using the key-value API, which simplifies use and increases performance (as applications may interact with the KV store directly, rather than through a secondary index or query processor).

In Couchbase, documents are stored with their IDs. Retrieving a document via its ID is the simplest and quickest operation in Couchbase.

>>> result = cb.get('docid')
>>> print(result.value)
{'json': 'value'}

$ cbc cat docid
docid                CAS=0x8234c3c0f213, Flags=0x0. Size=16
{"json":"value"}

Once a document is retrieved, it is accessible in the native format in which it was stored: if you stored the document as a list, it is available as a list again. The SDK will automatically deserialize the document from its stored format (usually JSON) to a native language type. It is possible to store and retrieve non-JSON documents as well, using a transcoder.
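The round trip described above can be imitated with the standard `json` module. This is a sketch of the idea only: `encode_json` and `decode_json` are hypothetical stand-ins for what a JSON transcoder does, not the SDK's transcoder interface.

```python
import json


def encode_json(value) -> bytes:
    """Serialize a native Python value to the bytes that would be stored."""
    return json.dumps(value).encode("utf-8")


def decode_json(raw: bytes):
    """Deserialize stored bytes back into a native Python value."""
    return json.loads(raw.decode("utf-8"))


tags = ["server", "database", "nosql"]
stored = encode_json(tags)      # what goes over the wire
restored = decode_json(stored)  # what the application gets back
```

A list goes in and a list comes out; the JSON bytes in between are an implementation detail the application does not normally see.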

You can also modify a document’s expiration time while retrieving it; this is known as get-and-touch and allows you to keep temporary data alive while retrieving it in one atomic and efficient operation.
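The effect of get-and-touch can be modeled in memory as follows. This is an illustrative sketch, not the server's implementation: `ExpiringStore` is a hypothetical class, expiry is expressed as a relative number of seconds, and an injected clock function replaces real time so the behavior is easy to follow.

```python
import threading


class ExpiringStore:
    """In-memory model of documents with an expiration time."""

    def __init__(self, clock):
        self._clock = clock            # callable returning the current time in seconds
        self._items = {}               # doc_id -> (value, expires_at)
        self._lock = threading.Lock()  # makes read-plus-extend a single step

    def upsert(self, doc_id, value, ttl):
        with self._lock:
            self._items[doc_id] = (value, self._clock() + ttl)

    def get_and_touch(self, doc_id, ttl):
        """Return the value and reset its expiry in one locked operation."""
        with self._lock:
            value, expires_at = self._items[doc_id]  # KeyError if absent
            now = self._clock()
            if now >= expires_at:
                del self._items[doc_id]
                raise KeyError(doc_id)
            self._items[doc_id] = (value, now + ttl)
            return value


now = [0.0]  # mutable fake clock for the example
store = ExpiringStore(clock=lambda: now[0])
store.upsert("session::1", {"user": "kingarthur"}, ttl=10)

now[0] = 8.0
session = store.get_and_touch("session::1", ttl=10)  # expiry moves to t=18
```

Without the touch, the document in this model would have expired at t=10; reading it at t=8 with a new 10-second lifetime keeps it alive until t=18.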

Documents can also be retrieved with N1QL. While N1QL is generally used for secondary queries, it can also be used to retrieve documents by their primary keys (IDs), though the key-value API is recommended if the ID is known. Lookups may be done either by comparing META(from-term).id or by using the USE KEYS [...] clause:

SELECT * FROM default USE KEYS ["docid"];

or

SELECT * FROM default WHERE META(default).id = "docid";

You can also retrieve parts of documents using sub-document operations, by specifying one or more sections of the document to be retrieved:

name, email = cb.retrieve_in('user:kingarthur', 'contact.name', 'contact.email')
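What such a path-based lookup returns can be shown on a plain dictionary. The helper below is a hypothetical in-memory analogue, not the SDK call above: `lookup_paths` is an invented name, paths are limited to dotted object keys, and the sample user document is made up for illustration.

```python
def lookup_paths(document: dict, *paths: str) -> list:
    """Return the values found at each dotted path, in the order requested."""
    results = []
    for path in paths:
        node = document
        for key in path.split("."):
            node = node[key]  # KeyError if the path does not exist
        results.append(node)
    return results


user = {
    "contact": {"name": "Arthur", "email": "kingarthur@example.com"},
    "orders": [1001, 1002, 1003],  # never touched by the lookup below
}
name, email = lookup_paths(user, "contact.name", "contact.email")
```

Only the two requested fields are returned; in the real operation this also means the rest of the document does not have to be transferred to the client.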