A newer version of this documentation is available.

View Latest

Multidimensional Scaling

    Couchbase Server performs symmetric scaling to distribute the workload equally between nodes and using multidimensional scaling allows each of the workloads to scale independently.

    Couchbase Server, like other databases, optimizes your interaction with the data. However, different interactions with data can have competing needs on the computational resources. There are a number of database workloads:

    • Core data operations - These are the high velocity updates to your data.

    • Indexing operations - Indexing provides faster access to data. Indexes keep up with high velocity updates to the data while responding to query requests.

    • N1QL query execution - Queries reshape and summarize data in various ways. Depending on the query, query engine uses indexes or directly scans data to get the final result.

    • Full-text searching – Text searching requires a full text index and direct queries that can sort data based on relevance with full language awareness.

    Relational databases tightly couple these workloads and depend on scaling up as the workloads increase. NoSQL and big data platforms typically partition these workloads across their pool of machines using symmetric scale-out, giving each node an equal share of the work. Couchbase Server does this symmetric scaling as well.

    symmetric scaling with cbserver
    Figure 1. Symmetric scaling with Couchbase Server

    Symmetric scale-out is simple. However, there are some drawbacks to this architecture:

    • The core data operations, indexing, search, and query execution workloads receive concurrent requests and compete for the same computational resources when they reside on the same node. For example, as the velocity of updates to the data increases, indexing has to work harder to maintain the indexes, which in turn takes away resources from core data operations.

    • Scale-out works against query latencies as it results in a wider fan-out in workloads. As query workload increases, computing the new aggregates requires scatter-gather index lookups, which takes away resources from all the nodes at once making scale-out unfit as a scalability model.

    With Multidimensional Scaling (MDS), Couchbase Server allows each of the workloads to independently scale, thus addressing these issues. With MDS, some or all services can choose to independently scale up and scale out within the same Couchbase Server cluster. For example,

    • Core data operations can scale out and add more capacity to process data mutations. Core data operations is a perfectly partitionable workload and scale-out brings great benefits.

    • Query, search, and index services can scale-up with fewer nodes to eliminate scatter-gather and favor more local processing for queries. As an Administrator, you can fine tune the hardware resources for each service separately. You can add more cores to enable parallel query computation in query service and also add more memory to cache more indexes in the index service.

    independent scaling
    Figure 2. Independent scaling with Couchbase Server

    Multidimensional Scaling (MDS) separates the distinct workloads:

    • Data Service, which performs core data operations and Views in Couchbase Server. All clusters are required to have the core data service enabled.

    • Index Service, which performs indexing with Global Secondary Indexes (GSI). Indexing Service is optional.

    • Query Service, which performs N1QL query execution. Query Service is optional.

    • Search Service, which performs full text index maintenance and full text search queries. Search service optional.

    When you create a new cluster or add new nodes, as an Administrator you can choose to enable or disable the services on each node and create a topology.

    Independent service deployment with multidimensional scaling is only available in the Enterprise Edition of Couchbase Server. The Enterprise Edition also allows unlimited concurrency in query execution. Community Edition can use homogeneous scaling and limits concurrent execution of queries to ensure queries don’t starve the data operations to a maximum of 4 cores per node.