Data Model
Couchbase’s use of JSON as a storage format allows powerful search and query over documents. Several data structures are supported by the SDK, including map, list, queue, and set.
The power to search, query, and easily work with data in Couchbase comes from the choice of JSON as a storage format. Non-JSON storage is also supported (see the Binary Storage Documentation), including UTF-8 strings, raw sequences of bytes, and language-specific serializations; however, only JSON is supported by Query. In Couchbase, JSON’s key-value structure allows the storage of collection data structures such as lists, maps, sets, and queues (see below). JSON’s tree-like structure allows operations against specific paths in the Document, and efficient support for these data structures.
Data Structures
The Rust SDK’s collection_ds provides four high-level data structure abstractions:
- List is a sequential data structure. Values can be placed at the beginning or end of a list, and can be accessed using numeric indexes.
- Map is a key-value structure, where a value is accessed by using a key string.
- Queue offers First In First Out (FIFO) semantics, allowing it to be used as a lightweight job queue.
- Set is an unordered set of unique values.
These data structures are stored as JSON documents in Couchbase, and can therefore be accessed both using the Query Service and normal key-value operations (including sub-document operations).
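The semantics of the four structures described above can be sketched with Rust's standard library collections. This is only a local analogy, assuming `VecDeque`, `HashMap`, and `HashSet` as stand-ins; it does not call the SDK, and the function names are hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// List: values go on at the beginning or the end, and are read by numeric index.
fn list_demo() -> Vec<&'static str> {
    let mut list: VecDeque<&'static str> = VecDeque::new();
    list.push_back("b"); // append
    list.push_back("c");
    list.push_front("a"); // prepend
    assert_eq!(list[1], "b"); // access by numeric index
    list.into_iter().collect()
}

/// Map: a value is looked up by its key string.
fn map_demo() -> Option<u32> {
    let mut map: HashMap<String, u32> = HashMap::new();
    map.insert("apples".to_string(), 3);
    map.insert("pears".to_string(), 5);
    map.get("pears").copied()
}

/// Queue: First In First Out, so jobs come off in the order they were pushed.
fn queue_demo() -> Vec<&'static str> {
    let mut queue: VecDeque<&'static str> = VecDeque::new();
    queue.push_back("job-1");
    queue.push_back("job-2");
    queue.push_back("job-3");
    let mut done = Vec::new();
    while let Some(job) = queue.pop_front() {
        done.push(job);
    }
    done
}

/// Set: unordered and unique, so adding a duplicate changes nothing.
fn set_demo() -> usize {
    let mut set: HashSet<&'static str> = HashSet::new();
    set.insert("red");
    set.insert("green");
    set.insert("red"); // duplicate, ignored
    set.len()
}

fn main() {
    println!("list:  {:?}", list_demo());
    println!("map:   {:?}", map_demo());
    println!("queue: {:?}", queue_demo());
    println!("set size: {}", set_demo());
}
```

The difference in Couchbase is that each structure lives in a JSON document on the server rather than in process memory, so several clients can share it.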
Using the data structures API may help your application in two ways:
- Simplicity: Data structures provide high-level operations by which you can deal with documents as if they were container data structures.
- Efficiency: Data structure operations don’t transfer the entire document across the network. Only the relevant data is exchanged between client and server, allowing for less network overhead and shorter latency.
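The efficiency point rests on addressing one path inside a document instead of replacing the whole document. The following is a minimal in-memory sketch of that idea, assuming a tiny hand-rolled `Value` type as a stand-in for JSON; `set_path`, `get_path`, and the sample field names are hypothetical and are not SDK calls.

```rust
use std::collections::BTreeMap;

/// A deliberately small stand-in for a JSON value.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    Num(i64),
    Str(String),
    Obj(BTreeMap<String, Value>),
}

/// Replace (or create) the value at the end of `path`.
/// Returns false if the path is empty or crosses a non-object value.
fn set_path(doc: &mut Value, path: &[&str], new_value: Value) -> bool {
    let map = match doc {
        Value::Obj(map) => map,
        _ => return false,
    };
    match path {
        [] => false,
        [last] => {
            map.insert((*last).to_string(), new_value);
            true
        }
        [head, rest @ ..] => match map.get_mut(*head) {
            Some(child) => set_path(child, rest, new_value),
            None => false,
        },
    }
}

/// Read the value at `path`, if every segment exists.
fn get_path<'a>(doc: &'a Value, path: &[&str]) -> Option<&'a Value> {
    match path {
        [] => Some(doc),
        [head, rest @ ..] => match doc {
            Value::Obj(map) => get_path(map.get(*head)?, rest),
            _ => None,
        },
    }
}

/// Hypothetical sample document with one nested object.
fn sample_doc() -> Value {
    let mut address = BTreeMap::new();
    address.insert("city".to_string(), Value::Str("Leeds".to_string()));
    address.insert("postcode".to_string(), Value::Str("LS1".to_string()));
    let mut root = BTreeMap::new();
    root.insert("name".to_string(), Value::Str("Ada".to_string()));
    root.insert("visits".to_string(), Value::Num(41));
    root.insert("address".to_string(), Value::Obj(address));
    Value::Obj(root)
}

fn main() {
    let mut doc = sample_doc();
    // The "request" is just a path and one value, not the whole document.
    let ok = set_path(&mut doc, &["address", "city"], Value::Str("York".to_string()));
    println!("updated: {}", ok);
    println!("city is now {:?}", get_path(&doc, &["address", "city"]));
    println!("visits unchanged: {:?}", get_path(&doc, &["visits"]));
}
```

In this model the caller only ever supplies a path and a single value; the rest of the document stays where it is, which is the property that keeps network transfer small in the real service.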
Data and Good Schema Design
Most operations are performed at the collection or scope level (although legacy bucket-level ops are often available), and keeping documents in the same collection can make for speedier indexing and queries — whether SQL++ or Search.
The Server enforces no schema, enabling evolutionary changes to your data model that reflect changes in the real world. The schema-on-read approach allows the client software that you write with the SDK to work with changes to an implicit schema, and allows heterogeneous data.
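Schema-on-read means the reading code decides how to interpret whatever shape a document has. Here is a small sketch of that, assuming documents are modeled as flat string maps and that an older document shape used a single `name` field while a newer one uses `first_name` and `last_name`; all of these field names are hypothetical.

```rust
use std::collections::HashMap;

/// Stand-in for a flat JSON document (assumption: string fields only).
type Doc = HashMap<String, String>;

/// Build a document from key/value pairs.
fn doc(fields: &[(&str, &str)]) -> Doc {
    fields
        .iter()
        .map(|(k, v)| (k.to_string(), v.to_string()))
        .collect()
}

/// Read a display name from either document shape.
/// The newer two-field shape is preferred, then the older single field.
fn display_name(doc: &Doc) -> String {
    if let (Some(first), Some(last)) = (doc.get("first_name"), doc.get("last_name")) {
        return format!("{} {}", first, last);
    }
    doc.get("name")
        .cloned()
        .unwrap_or_else(|| "(unknown)".to_string())
}

fn main() {
    // Heterogeneous documents sitting side by side in one collection.
    let docs = vec![
        doc(&[("name", "Ada Lovelace")]),
        doc(&[("first_name", "Alan"), ("last_name", "Turing")]),
        doc(&[("email", "someone@example.com")]),
    ];
    for d in &docs {
        println!("{}", display_name(d));
    }
}
```

Because the interpretation lives in the reader, old and new documents can coexist while the data model evolves, with no migration step forced by the server.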
Objects, Relations, Tables
In the Relational Database (RDBMS) world, a translation layer is often used between the objects of your data model in your application, and the tables that you store the data in. JSON storage allows you to store complex types, like nested records and arrays, without decomposing them to a second table (known in the SQL world as database normalization).
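The contrast between a nested document and its normalized, two-table equivalent can be sketched with plain Rust types. The order and line-item model below is a hypothetical illustration, not something taken from the SDK or from a sample dataset.

```rust
/// Document style: the order carries its line items as a nested array.
#[derive(Debug, Clone, PartialEq)]
struct OrderDoc {
    id: u32,
    customer: String,
    lines: Vec<Line>,
}

#[derive(Debug, Clone, PartialEq)]
struct Line {
    sku: String,
    quantity: u32,
    unit_price_cents: u64,
}

/// Relational style: line items live in a second table and point back
/// to their order through a foreign key.
#[derive(Debug, Clone, PartialEq)]
struct OrderRow {
    id: u32,
    customer: String,
}

#[derive(Debug, Clone, PartialEq)]
struct LineRow {
    order_id: u32,
    sku: String,
    quantity: u32,
    unit_price_cents: u64,
}

/// Total for a nested document: everything needed is already in one value.
fn total_nested(order: &OrderDoc) -> u64 {
    order
        .lines
        .iter()
        .map(|l| u64::from(l.quantity) * l.unit_price_cents)
        .sum()
}

/// Total for normalized rows: the lines must first be matched to the order.
fn total_normalized(order: &OrderRow, lines: &[LineRow]) -> u64 {
    lines
        .iter()
        .filter(|l| l.order_id == order.id)
        .map(|l| u64::from(l.quantity) * l.unit_price_cents)
        .sum()
}

/// Decompose one document into the rows the two-table layout would hold.
fn normalize(order: &OrderDoc) -> (OrderRow, Vec<LineRow>) {
    let row = OrderRow { id: order.id, customer: order.customer.clone() };
    let lines = order
        .lines
        .iter()
        .map(|l| LineRow {
            order_id: order.id,
            sku: l.sku.clone(),
            quantity: l.quantity,
            unit_price_cents: l.unit_price_cents,
        })
        .collect();
    (row, lines)
}

/// Hypothetical sample order with two line items.
fn sample_order() -> OrderDoc {
    OrderDoc {
        id: 7,
        customer: "Ada".to_string(),
        lines: vec![
            Line { sku: "pen".to_string(), quantity: 2, unit_price_cents: 250 },
            Line { sku: "book".to_string(), quantity: 1, unit_price_cents: 1200 },
        ],
    }
}

fn main() {
    let order = sample_order();
    let (row, lines) = normalize(&order);
    println!("nested total:     {}", total_nested(&order));
    println!("normalized total: {}", total_normalized(&row, &lines));
    println!("{} has {} line rows, first sku {}", row.customer, lines.len(), lines[0].sku);
}
```

Both layouts hold the same information; the nested form simply keeps it in one value, so no join step is needed to reassemble the order.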
When the relational model was proposed, more than 50 years ago, limitations in available computer resources meant that removing data duplication in one-to-many and many-to-many relationships this way made a great deal of sense. There is still a case to be made for it for reducing inconsistencies — the difference with a document database is that you get to choose when to do this.
Collections and Scopes
Couchbase’s atomic units of data are documents, stored as key-value pairs. The value can be anything, but storing in JSON format enables indexing, searching, and many useful ways of working with the data from the SDK.
Collections are arbitrary groupings of documents that you choose to suit your object model: for example, one collection of students enrolled at the college and one collection of courses available for them to take. Notionally, you may view them as equivalent to an RDBMS table, but it’s up to you.
Within a bucket, you can organize your collections into scopes — some methods are available at the bucket level, but Search and Query Services favour Scope-level indexing and querying for greater efficiency.
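The bucket, scope, and collection hierarchy can be pictured as a three-part name. The sketch below assumes a hypothetical `Keyspace` type and hypothetical bucket and scope names (`college`, `academic`); it renders the backtick-quoted, dot-separated path form that SQL++ uses to refer to a collection, and does not call the SDK.

```rust
/// One collection's position in the hierarchy (hypothetical helper type).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Keyspace {
    bucket: String,
    scope: String,
    collection: String,
}

impl Keyspace {
    fn new(bucket: &str, scope: &str, collection: &str) -> Self {
        Keyspace {
            bucket: bucket.to_string(),
            scope: scope.to_string(),
            collection: collection.to_string(),
        }
    }

    /// Backtick-quoted path, as written in a SQL++ FROM clause.
    fn query_path(&self) -> String {
        format!("`{}`.`{}`.`{}`", self.bucket, self.scope, self.collection)
    }

    /// True when both collections sit in the same scope of the same bucket.
    fn same_scope(&self, other: &Keyspace) -> bool {
        self.bucket == other.bucket && self.scope == other.scope
    }
}

fn main() {
    let students = Keyspace::new("college", "academic", "students");
    let courses = Keyspace::new("college", "academic", "courses");
    println!("{}", students.query_path());
    println!("{}", courses.query_path());
    println!("same scope: {}", students.same_scope(&courses));
}
```

Keeping related collections, such as the students and courses above, in one scope lets scope-level queries refer to them together.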