Analytics using the Python SDK

Parallel data management for complex queries over many records, using a familiar N1QL-like syntax.

For complex and long-running queries, involving large ad hoc join, set, aggregation, and grouping operations, Couchbase Data Platform introduces the Couchbase Analytics Service (CBAS). After familiarising yourself with our introductory primer, in particular creating a dataset and linking it to a bucket to shadow the operational data, try Couchbase Analytics using the Python SDK.

Availability

The Analytics service is available in Couchbase Data Platform 6.0 and later (developer preview in 5.5). Earlier Python SDK versions provide some support, but we strongly recommend using version 2.5.0 or later, which provides a committed and stable interface for the service.

Usage: Performing a Request

The analytics API is intentionally very similar to the query service API, but it takes an additional host parameter.

def analytics_query(self, query, host, *args, **kwargs):
    """
    Execute an Analytics query.
    ...

The timeout is always propagated to the server, so when a timeout happens on the client side the server can also stop processing the request and save resources.

To perform a query, create an AnalyticsQuery, which can be either simple or parameterized. Parameters can be either positional or named. Here is one example of each:

from couchbase.analytics import AnalyticsQuery
simple = AnalyticsQuery(
    "select airportname, country from airports where country = 'France'")

positional = AnalyticsQuery(
    "select airportname, country from airports where country = ?",
    "France")


named = AnalyticsQuery(
    "select airportname, country from airports where country = $country",
    country="France")
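To make the difference between the two parameter styles concrete, the sketch below shows one way a statement and its parameters could be laid out in a JSON request body. It does not use the SDK: build_analytics_body is a hypothetical helper, and the field names (statement, args, and a $-prefixed key per named parameter) are assumptions modeled on the Analytics REST interface, not a description of what AnalyticsQuery does internally.

```python
import json


def build_analytics_body(statement, *args, **kwargs):
    """Hypothetical helper: lay out a statement and its parameters as a dict.

    The field names used here are assumptions, not the SDK's internals.
    """
    body = {"statement": statement}
    if args:
        # Positional parameters fill the `?` placeholders in order.
        body["args"] = list(args)
    for name, value in kwargs.items():
        # Named parameters are keyed by their `$name` placeholder.
        body["$" + name] = value
    return body


positional_body = build_analytics_body(
    "select airportname, country from airports where country = ?",
    "France")

named_body = build_analytics_body(
    "select airportname, country from airports where country = $country",
    country="France")

print(json.dumps(positional_body))
print(json.dumps(named_body))
```

The positional form depends on argument order, while the named form ties each value to a placeholder by name, which tends to be easier to maintain as a statement gains more parameters.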

Additional options are available at query time. Set them on the AnalyticsQuery object:

Table 1. Analytics Params Reference

Name    Option           Default  Description
Pretty  pretty(boolean)  false    If the returned result should be prettified JSON.

These params must be sent as part of the query:

q = AnalyticsQuery(
    "select airportname, country from airports where country = 'France'"
)
q.set_option("pretty", True)
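For a sense of what prettified JSON means in practice, the following standalone snippet serializes one made-up row both compactly and with indentation, using Python's json module. It only mimics the effect; the exact whitespace the server emits when pretty is enabled may differ.

```python
import json

# A made-up row, shaped like the results of the example query.
row = {"airportname": "Example Airport", "country": "France"}

# Compact form: no extra whitespace, smallest payload.
compact = json.dumps(row, separators=(",", ":"))

# Prettified form: one key per line, indented for human readers.
pretty = json.dumps(row, indent=2)

print(compact)
print(pretty)
```

Both strings decode to the same data, so the option affects readability and payload size, not the content of the result.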

Usage: Handling the Response

Once the request has been executed, the results are sent back to the client, and analytics_query returns an AnalyticsRequest object:

analytics_host_port = "localhost:8091"
result = bucket.analytics_query(query, analytics_host_port)

The result contains all kinds of actual data and metadata which might or might not be set, depending on the query response. Here is an example:

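from couchbase.analytics import AnalyticsQuery
from couchbase.exceptions import CouchbaseError

result = self.cb.analytics_query(AnalyticsQuery(
    "SELECT airportname, country FROM airports WHERE country = 'France' LIMIT 5"
), "localhost:8091")

try:
    for row in result:
        print(row)
except CouchbaseError as e:
    # handle exceptions
    ...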
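The iteration pattern above can be wrapped in a small helper that keeps the rows received before a failure. This is a minimal sketch, not part of the SDK: collect_rows and fake_result are hypothetical names, the rows are made up, and a plain generator stands in for the AnalyticsRequest so the snippet runs without a cluster. It assumes that errors may surface while the result is being iterated.

```python
def collect_rows(result, on_error=None):
    """Hypothetical helper: drain an iterable of rows into a list.

    If iteration fails and on_error is given, the exception is passed to it
    and the rows gathered so far are returned; otherwise it is re-raised.
    """
    rows = []
    try:
        for row in result:
            rows.append(row)
    except Exception as exc:  # in real code, catch the SDK's exception types
        if on_error is None:
            raise
        on_error(exc)
    return rows


def fake_result():
    """Stand-in for an analytics result: two made-up rows, then a failure."""
    yield {"airportname": "Example Airport A", "country": "France"}
    yield {"airportname": "Example Airport B", "country": "France"}
    raise RuntimeError("simulated failure mid-stream")


errors = []
rows = collect_rows(fake_result(), on_error=errors.append)
print(len(rows), "rows received before:", errors[0])
```

Passing an on_error callback keeps the decision about logging, retrying, or aborting with the caller, while the default behavior still propagates the exception.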