View distribution

Couchbase Server uses hash-partitioning to distribute data evenly across the cluster. For example, in a cluster with four nodes, on an average each node will contain about 25% of active data.

When using views, the view indexer runs locally on each node and indexes the subset of data that resides on that node. The partial view generated contains about 25% of the view results. Views use a process called scatter-gather. View data is scattered and distributed evenly across the cluster nodes.

When you query a view, the request is sent to a randomly selected cluster node with the data service, and is forwarded to all data nodes in the cluster. That’s the scatter piece. Depending on the options specified when querying the view, each node will either send the most current data in its partial view index, update the partial view index and send the data, or send the data and update the partial view index. The view results corresponding to each node are collected together and collated before the aggregate result is returned to the client.

views distribution
Figure 1. View distribution

View distribution works well for a small dataset and a small number of nodes. However, for large amounts of data, the fan-out for the scatter-and-gather will be huge and will involve transferring large amounts of data.