MapReduce Views Using the Java SDK with Couchbase Server

You can use MapReduce views to create queryable indexes in Couchbase Server.

The normal CRUD methods allow you to look up a document by its ID. A MapReduce view query allows you to look up one or more documents based on other criteria. A MapReduce view consists of a map function, which is executed once per document, and an optional reduce function, which aggregates the results of the map function. The index is updated incrementally, so the map function is not rerun over every document each time you query the view. The map and reduce functions are written in JavaScript and stored on the server.

MapReduce queries can be further customized at query time so that only a subset (or range) of the data is returned.

See the Incremental MapReduce Views and Querying Data with Views sections of the general documentation to learn more about views and their architecture.

The following example is the definition of a by_name view in a "beer" design document. This view checks whether a document is a beer and has a name. If it does, it emits the beer’s name into the index. This allows beers to be queried by name. For example, it’s now possible to ask the question "What beers start with A?"

function (doc, meta) {
    if (doc.type && doc.type == "beer" && doc.name) {
        emit(doc.name, null);
    }
}
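To make the map step concrete, here is a minimal plain-Java sketch of what the by_name map function conceptually does. It does not use any Couchbase classes: the class name ByNameMapSketch, the Map-based documents, and the in-memory TreeMap index are all illustrative assumptions. The real index is built and maintained incrementally by the server.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: mimics the by_name map function in plain Java.
final class ByNameMapSketch {

    // Mirrors emit(doc.name, null) for documents that are beers and have a name.
    // The returned index maps each emitted key to the IDs of the documents that emitted it.
    static TreeMap<String, List<String>> buildIndex(Map<String, Map<String, Object>> docsById) {
        TreeMap<String, List<String>> index = new TreeMap<>();
        for (Map.Entry<String, Map<String, Object>> entry : docsById.entrySet()) {
            Map<String, Object> doc = entry.getValue();
            Object name = doc.get("name");
            boolean isBeer = "beer".equals(doc.get("type"));
            boolean hasName = name instanceof String && !((String) name).isEmpty();
            if (isBeer && hasName) {
                index.computeIfAbsent((String) name, k -> new ArrayList<>()).add(entry.getKey());
            }
        }
        return index;
    }

    // Hypothetical helper that builds a tiny document as a Map.
    static Map<String, Object> doc(String type, String name) {
        Map<String, Object> d = new HashMap<>();
        d.put("type", type);
        if (name != null) {
            d.put("name", name);
        }
        return d;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Object>> docs = new HashMap<>();
        docs.put("beer-1", doc("beer", "Abana Amber Ale"));
        docs.put("brewery-1", doc("brewery", "Some Brewery"));
        docs.put("beer-2", doc("beer", null)); // no name, so nothing is emitted
        System.out.println(buildIndex(docs));
    }
}
```

Only the first document passes both checks, so only its name ends up as a key in the sketch index.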

A spatial view can instead be queried with a range or bounding box. For example, imagine we have stored landmarks with the coordinates of their home city (e.g. Paris, Vienna, Berlin and New York) under geo, and each city’s coordinates are represented as two attributes, lon and lat. The following spatial view map function could be used to find landmarks within Europe, as a "by_location" view in a "spatial" design document:

function (doc, meta) {
    if (doc.type && doc.type == "landmark" && doc.geo) {
        emit([doc.geo.lon, doc.geo.lat], null);
    }
}

Querying Views through the Java SDK

To query a view from the Java client, call the query(ViewQuery q) method on the Bucket class. This method returns a ViewResult whose iterator yields the results of the query as ViewRow objects. The ViewResult also exposes the list of rows (allRows()), the success() status, and a potential error(). Each ViewRow contains the key and value properties (the first and second arguments to the view’s emit() function, respectively) as well as the id property, which may be passed to the get() method to retrieve the actual document. Alternatively, call the document() method directly on the view row.

Bucket bkt = CouchbaseCluster.create("192.168.33.101").openBucket("beer-sample");
ViewResult result = bkt.query(ViewQuery.from("beer", "by_name"));
for (ViewRow row : result) {
    System.out.println(row); //prints the row
    System.out.println(row.document().content()); //retrieves the doc and prints content
}

You can also set various properties on the query:

Bucket bkt = CouchbaseCluster.create("192.168.33.101").openBucket("beer-sample");
ViewQuery q = ViewQuery.from("beer", "by_name")
    .limit(5) // Limit to 5 results
    .startKey("A")
    .endKey("A\u0fff");

ViewResult result = bkt.query(q);
for (ViewRow row : result) {
    System.out.println(row);
}

Here’s some sample output for the previous query:

DefaultViewRow{id=harvey_son_lewes-a_lecoq_imperial_extra_double_stout_1999, key=A. LeCoq Imperial Extra Double Stout 1999, value=null}
DefaultViewRow{id=harvey_son_lewes-a_lecoq_imperial_extra_double_stout_2000, key=A. LeCoq Imperial Extra Double Stout 2000, value=null}
DefaultViewRow{id=mickey_finn_s_brewery-abana_amber_ale, key=Abana Amber Ale, value=null}
DefaultViewRow{id=brasserie_lefebvre-abbaye_de_floreffe_double, key=Abbaye de Floreffe Double, value=null}
DefaultViewRow{id=brasserie_de_brunehaut-abbaye_de_saint_martin_blonde, key=Abbaye de Saint-Martin Blonde, value=null}
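The effect of startKey, endKey and limit can be pictured as a range scan over a sorted index. The following is a minimal plain-Java sketch of that idea, not SDK code: KeyRangeSketch and its query method are hypothetical names, and a TreeMap stands in for the server-side index. Note the assumption that TreeMap orders strings by Java's natural String ordering, whereas the server applies its own collation, so ordering can differ for mixed-case or non-ASCII keys.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: a sorted in-memory map standing in for a view index.
final class KeyRangeSketch {

    // Returns at most `limit` keys in [startKey, endKey], in sorted order.
    static List<String> query(TreeMap<String, String> index, String startKey, String endKey, int limit) {
        List<String> keys = new ArrayList<>();
        for (Map.Entry<String, String> row : index.subMap(startKey, true, endKey, true).entrySet()) {
            if (keys.size() == limit) {
                break; // stop once the limit is reached
            }
            keys.add(row.getKey());
        }
        return keys;
    }

    public static void main(String[] args) {
        TreeMap<String, String> index = new TreeMap<>(); // emitted key -> document ID
        index.put("Abana Amber Ale", "mickey_finn_s_brewery-abana_amber_ale");
        index.put("Abbaye de Floreffe Double", "brasserie_lefebvre-abbaye_de_floreffe_double");
        index.put("Bock", "hypothetical-bock"); // made-up row outside the "A" range
        for (String key : query(index, "A", "A\u0fff", 5)) {
            System.out.println(key);
        }
    }
}
```

Using "A\u0fff" as the end key is a common trick for prefix scans: it sorts after any ordinary key that starts with "A", so the range covers all of them.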

includeDocs: This parameter allows for eager retrieval of the document associated with each row.

It is only beneficial in the synchronous API; in the async API you can call get(row.id()) on the async bucket to the same effect. It affects the row’s document(...) method by preloading that method's return value.

However, since the simple signature of row.document() assumes a JsonDocument, if you want a different document type you have to call both includeDocs() and document() with the desired target class: query.includeDocs(SomeDocumentClass.class) and row.document(SomeDocumentClass.class).

Note that the ViewQuery has a getter for the target class: includeDocsTarget().

Querying Geospatial Views

To query a geospatial view, construct a SpatialViewQuery object (com.couchbase.client.java.view.SpatialViewQuery). Spatial queries accept a startRange and an endRange parameter, which allow you to limit the enclosing bounding boxes of the result. Each parameter takes a JsonArray in which each element corresponds to a component of the emitted key (the first two components implicitly being the longitude and latitude of the result itself).

On output, spatial queries yield SpatialViewRow instances. A SpatialViewRow is similar to a ViewRow, with an added geometry property.

Querying a spatial view
SpatialViewQuery q = SpatialViewQuery.from("spatial", "by_location")
    .startRange(JsonArray.from(0, -90, null))
    .endRange(JsonArray.from(180, 90, null));
SpatialViewResult result = bkt.query(q);

for (SpatialViewRow row : result) {
    System.out.println("Key:" + row.key());
    System.out.println("Value:" + row.value());
    System.out.println("Geometry:" + row.geometry());
}

SpatialViewQuery also has an includeDocs() parameter, which preloads the document returned by the SpatialViewRow's document() method.

View results details

For all types of views, a ViewResult is always returned, which contains zero to many ViewRows. In addition to iterative row access, more methods are available on the result:

| Method | Description |
| --- | --- |
| List<ViewRow> allRows() | Accumulates all returned rows in a List and returns it. |
| Iterator<ViewRow> rows() | Provides iterative access to rows as they arrive. |
| int totalRows() | The total number of rows in the index, which can be greater than the number of rows returned. |
| boolean success() | True if the query was successful, false otherwise. Check error() if it is false. |
| JsonObject error() | Contains the error if the query was not successful, or null otherwise. |
| JsonObject debug() | Contains debug information if debug() was enabled on the query, or null otherwise. |

The only difference between regular and spatial view results is that spatial results do not expose totalRows().

ViewQuery API details

All options shown here are available on the ViewQuery through its fluent API. All of them are optional, so they alter the behavior of the query only when they are explicitly provided.

As a general note, all arguments that accept JSON are provided with a higher number of method overloads to accommodate all combinations in a type-safe manner.

| Method | Accepted Types | Description |
| --- | --- | --- |
| development | boolean | When true, queries the development view; false by default. |
| reduce | boolean | Explicitly enables or disables the reduce function on the query. If not provided and the view has a reduce function, it will be used. |
| limit | int | Limits the number of returned documents to the specified number. |
| skip | int | Skips the given number of records before starting to return the results. |
| group | boolean | Groups the results using the reduce function to a group or single row. |
| groupLevel | int | Specifies the group level to be used. |
| inclusiveEnd | boolean | Whether the specified end key should be included in the result. |
| stale | Stale.TRUE, Stale.FALSE, Stale.UPDATE_AFTER (default) | Defines how stale the view results are allowed to be in the query. |
| debug | boolean | Enables debugging on view queries. |
| onError | OnError.STOP (default), OnError.CONTINUE | Sets the response in the event of an error. |
| descending | boolean | Returns the documents in descending order by key if true; the default is false. |
| key | JSON | The exact key to return from the query. |
| keys | JsonArray | Only the given matching keys will be returned. |
| startKeyDocId | String | Where to start searching for the key range. Can be used for efficient pagination. |
| endKeyDocId | String | Where to stop searching for the key range. |
| startKey | JSON | The key where the row return range should start. |
| endKey | JSON | The key where the row return range should end. |
| includeDocs | boolean, optional Class<? extends Document> | Whether or not to automatically fetch the document corresponding to each row. The second parameter is the target class for the document, JsonDocument if omitted. This method is needed only when using the blocking API, since on the async API there is no benefit over just calling .document() in the stream. See the note on includeDocs above. |

Important when using grouping: group(boolean) and groupLevel(int) should not be used together in the same view query. It is sufficient to set the group level only; use group(boolean) in cases where you always want the highest group level implicitly.
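To illustrate what a group level does, here is a minimal plain-Java sketch that counts rows per key prefix, similar in spirit to a counting reduce function queried at different group levels. It is not SDK code. The class GroupLevelSketch, the method countByLevel, and the [year, month, day] array keys are hypothetical, chosen only because grouping by level is easiest to see with compound keys.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: emulates grouping reduced rows by a key prefix.
final class GroupLevelSketch {

    // Truncates every key to its first `groupLevel` elements and counts rows per truncated key.
    // A level equal to the full key length behaves like grouping on the exact key.
    static Map<List<Integer>, Integer> countByLevel(List<List<Integer>> sortedKeys, int groupLevel) {
        Map<List<Integer>, Integer> counts = new LinkedHashMap<>();
        for (List<Integer> key : sortedKeys) {
            int end = Math.min(groupLevel, key.size());
            List<Integer> prefix = new ArrayList<>(key.subList(0, end));
            counts.merge(prefix, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Hypothetical emitted keys of the form [year, month, day], already sorted.
        List<List<Integer>> keys = Arrays.asList(
            Arrays.asList(2014, 1, 5),
            Arrays.asList(2014, 1, 9),
            Arrays.asList(2014, 2, 1),
            Arrays.asList(2015, 1, 1));
        System.out.println("level 1: " + countByLevel(keys, 1));
        System.out.println("level 2: " + countByLevel(keys, 2));
    }
}
```

In this sketch, level 1 groups by year and level 2 by year and month, which is why setting the group level alone is enough to control the granularity.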

SpatialViewQuery API details

All options shown here are available on the SpatialViewQuery through its fluent API. All of them are optional, so they alter the behavior of the query only when they are explicitly provided.

| Method | Accepted Types | Description |
| --- | --- | --- |
| development | boolean | When true, queries the development view; false by default. |
| limit | int | Limits the number of returned documents to the specified number. |
| skip | int | Skips the given number of records before starting to return the results. |
| stale | Stale.TRUE, Stale.FALSE, Stale.UPDATE_AFTER (default) | Defines how stale the view results are allowed to be in the query. |
| debug | boolean | Enables debugging on view queries. |
| onError | OnError.STOP (default), OnError.CONTINUE | Sets the response in the event of an error. |
| startRange | JsonArray | Where the spatial range should start. Can be multidimensional. |
| endRange | JsonArray | Where the spatial range should end. Can be multidimensional. |
| range | JsonArray, JsonArray | Convenience method to set the start and end range in one call. |
| includeDocs | boolean, optional Class<? extends Document> | Whether or not to automatically fetch the document corresponding to each row. The second parameter is the target class for the document, JsonDocument if omitted. This method is needed only when using the blocking API, since on the async API there is no benefit over just calling .document() in the stream. See the note on includeDocs above. |

Here is how to use the range parameter to find documents with a location within a bounding box. We have stored the cities Paris, Vienna, Berlin and New York. Each city’s coordinates are represented as two attributes, lon and lat. The spatial view’s map function is:

function (doc) { if (doc.type == "city") { emit([doc.lon, doc.lat], null); } }

To query the view and find cities within Europe, we use Europe’s bounding box. The startRange is the most south-western point of the bounding box, and the endRange is its most north-eastern point:

JsonArray EUROPE_SOUTH_WEST = JsonArray.from(-10.8, 36.59);
JsonArray EUROPE_NORTH_EAST = JsonArray.from(31.6, 70.67);

SpatialViewResult result = bucket.query(SpatialViewQuery.from("cities", "by_location")
            .stale(Stale.FALSE)
            .range(EUROPE_SOUTH_WEST, EUROPE_NORTH_EAST));
List<SpatialViewRow> allRows = result.allRows();

for (SpatialViewRow row : allRows) {
    System.out.println(row.id());
}

//prints:
//city::Vienna
//city::Berlin
//city::Paris
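The bounding-box test behind this query can be sketched in plain Java as a simple containment check on [lon, lat] pairs. This is only an illustration of the idea, not how the server evaluates spatial indexes: BoundingBoxSketch and within are hypothetical names, and the city coordinates are approximate values supplied for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: filters points by a [lon, lat] bounding box.
final class BoundingBoxSketch {

    // Returns the IDs whose [lon, lat] lies inside the box spanned by
    // southWest and northEast (both corners inclusive).
    static List<String> within(Map<String, double[]> lonLatById, double[] southWest, double[] northEast) {
        List<String> ids = new ArrayList<>();
        for (Map.Entry<String, double[]> entry : lonLatById.entrySet()) {
            double lon = entry.getValue()[0];
            double lat = entry.getValue()[1];
            boolean lonInside = lon >= southWest[0] && lon <= northEast[0];
            boolean latInside = lat >= southWest[1] && lat <= northEast[1];
            if (lonInside && latInside) {
                ids.add(entry.getKey());
            }
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, double[]> cities = new LinkedHashMap<>();
        cities.put("city::Paris", new double[] {2.35, 48.86});     // approximate coordinates
        cities.put("city::Vienna", new double[] {16.37, 48.21});
        cities.put("city::Berlin", new double[] {13.40, 52.52});
        cities.put("city::NewYork", new double[] {-74.01, 40.71}); // hypothetical ID
        double[] europeSouthWest = {-10.8, 36.59};
        double[] europeNorthEast = {31.6, 70.67};
        System.out.println(within(cities, europeSouthWest, europeNorthEast));
    }
}
```

New York falls outside the box because its longitude lies west of the south-western corner, so only the three European cities are returned.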

Retry Conditions

The SDK automatically retries view requests under certain known conditions, which are listed in the following table:

| HTTP status code | Behavior |
| --- | --- |
| 200 | Do not retry the request. |
| 300, 301, 302, 303, 307, 401, 408, 409, 412, 416, 417, 501, 502, 503, 504 | Retry the request. |
| 404 | If the library detects a node that is not yet provisioned, it retries. Otherwise, it reports a ViewDoesNotExistException. |
| 500 | If the error payload reports a missing view document or a badly formed query, it does not retry. Otherwise, it retries the request. |

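Codes not listed in the table are not retried by default, but client code can still use the retry framework or write a custom handler. The example below retries up to 10 times if the view does not exist. Because retryWhen() operates on an Observable, bucket is assumed here to be an AsyncBucket (for example, the result of calling async() on a Bucket):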

bucket.query(SpatialViewQuery.from("spatial", "test"))
      .retryWhen(
           RetryBuilder.anyOf(ViewDoesNotExistException.class)
                       .delay(Delay.exponential(TimeUnit.SECONDS, 1))
                       .max(10)
                       .build())
      .subscribe(new Action1<AsyncSpatialViewResult>() {
          @Override
          public void call(AsyncSpatialViewResult result) {
              // handle result
          }
      });
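The retry table above can also be read as a small decision function. The following plain-Java sketch encodes it for illustration only; it is not the SDK's internal implementation. ViewRetrySketch, shouldRetry, and the two boolean flags (which stand in for the node-provisioning check and the inspection of the error payload) are hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Illustrative only: encodes the retry table as a decision function.
final class ViewRetrySketch {

    private static final Set<Integer> ALWAYS_RETRY = new HashSet<>(Arrays.asList(
        300, 301, 302, 303, 307, 401, 408, 409, 412, 416, 417, 501, 502, 503, 504));

    // nodeNotProvisioned is only consulted for 404;
    // missingViewOrBadQuery is only consulted for 500.
    static boolean shouldRetry(int status, boolean nodeNotProvisioned, boolean missingViewOrBadQuery) {
        if (ALWAYS_RETRY.contains(status)) {
            return true;
        }
        if (status == 404) {
            return nodeNotProvisioned; // otherwise the view is reported as missing
        }
        if (status == 500) {
            return !missingViewOrBadQuery;
        }
        return false; // 200 and every code not listed in the table
    }

    public static void main(String[] args) {
        System.out.println("503 -> " + shouldRetry(503, false, false));
        System.out.println("404 on provisioned node -> " + shouldRetry(404, false, false));
        System.out.println("500 with bad query -> " + shouldRetry(500, false, true));
    }
}
```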