Analytics using the Java SDK

Parallel data management for complex queries over many records, using a familiar N1QL-like syntax.

For complex and long-running queries, involving large ad hoc join, set, aggregation, and grouping operations, Couchbase Data Platform introduces the Couchbase Analytics Service (CBAS). After familiarising yourself with our introductory primer, in particular creating a dataset and linking it to a bucket to shadow the operational data, try Couchbase Analytics using the Java SDK.

Availability

The analytics service is available in Couchbase Data Platform 6.0 and later (developer preview in 5.5). While earlier Java SDK versions provide some support, we strongly recommend to use at least version 2.7.0, which provides a committed and stable interface for it.

Usage: Performing a Request

Intentionally, the API for analytics is very similar to the query service one. Right now it is only available on the Bucket (and AsyncBucket):

/**
 * Query with the default timeout.
 */
AnalyticsQueryResult query(AnalyticsQuery query);

/**
 * Query with a custom timeout.
 */
AnalyticsQueryResult query(AnalyticsQuery query, long timeout, TimeUnit timeUnit);

The timeout is always propagated to the server, so when a timeout happens on the client side the server can also stop processing the request and save resources.

To perform a query, you need to create an AnalyticsQuery — which can either be simple or be parameterized. If parameters are used, they can either be positional or named. Here is one example of each:

AnalyticsQuery simple = AnalyticsQuery.simple(
    "select airportname, country from airports where country = 'France'"
);

AnalyticsQuery positional = AnalyticsQuery.parameterized(
    "select airportname, country from airports where country = ?",
    JsonArray.from("France")
);

AnalyticsQuery named = AnalyticsQuery.parameterized(
    "select airportname, country from airports where country = $country",
    JsonObject.create().put("country", "France")
);

Additional options are available at query time which can be passed in through the AnalyticsParams object:

Table 1. Analytics Params Reference
Name Builder Default Description

Client Context ID

withContextId(String)

Random UUID

Sets a context ID that is returned back as part of the result.

Server Side Timeout

serverSideTimeout(long, TimeUnit)

Analytics timeout set on the client (75s)

Customizes the timeout sent to the server. Usually does not have to be set because the client sets it based on the timeout on the operation.

Pretty

pretty(boolean)

false

If the returned result should be prettified JSON.

Priority

priority(boolean)

false

If this request should have priority over others.

Raw Param

rawParam(String, Object)

none

Allows to send arbitrary params to the analytics service which are not part of the builder API.

These params must be sent as part of the query:

AnalyticsQuery q = AnalyticsQuery.simple(
    "select airportname, country from airports where country = 'France'",
    AnalyticsParams.build().priority(true)
);

Usage: Handling the Response

Once the request has been executed, results are sent back to the client and it will return an AnalyticsQueryResult (or its asynchronous counterpart on the AsyncBucket):

AnalyticsQueryResult result = bucket.query(query);

The result contains all kinds of actual data and metadata which might or might not be set, depending on the query response. Before attempting to iterate the rows, it is usually a good idea to check with finalSuccess() if the result was successful. Here is an example:

AnalyticsQueryResult result = bucket.query(AnalyticsQuery.simple(
    "SELECT airportname, country FROM airports WHERE country = 'France' LIMIT 5"
));
if (result.finalSuccess()) {
    for (AnalyticsQueryRow row : result) {
        System.out.println(row.value());
    }
}

Instead of iterating the result directly, you can consume either rows() through an Iterator, or allRows() which returns them as a List.

Should finalSuccess() return false, the specific error can be found in the errors() result. Since there might be more than one error it returns a List<JsonObject>. Additional metrics can be accessed through the info() method, such as elapsedTime or the resultCount.