N1QL Queries from the SDK

    +
    You can query for documents in Couchbase using the N1QL query language, a language based on SQL, but designed for structured and flexible JSON documents. Querying can solve typical programming tasks such as finding a user profile by email address, facebook login, or user ID.

    Our query service uses N1QL, which will be fairly familiar to anyone who’s used any dialect of SQL. Further resources for learning about N1QL are listed at the bottom of the page. Before you get started you may wish to checkout the N1QL intro page, or just dive in with a query against our "travel-sample" data set. In this case, note that before you can query a bucket, you must define at least one index. You can define a primary index on a bucket. When a primary index is defined you can issue non-covered queries on the bucket as well.

    Use cbq, our interactive Query shell. Open it, and enter the following:

    CREATE PRIMARY INDEX ON `travel-sample`

    or replace travel-sample with a different Bucket name to build an index on a different dataset.

    The default installation places cbq in /opt/couchbase/bin/ on Linux, /Applications/Couchbase Server.app/Contents/Resources/couchbase-core/bin/cbq on OS X, and C:\Program Files\Couchbase\Server\bin\cbq.exe on Microsoft Windows.

    Queries & Placeholders

    Placeholders allow you to specify variable constraints for an otherwise constant query. There are two variants of placeholders: postional and named parameters. Positional parameters use an ordinal placeholder for substitution and named parameters use variables. A named or positional parameter is a placeholder for a value in the WHERE, LIMIT or OFFSET clause of a query. Note that both parameters and options are optional.

    Positional parameter example:
    result = cluster.query(
        "SELECT x.* FROM `default` WHERE x.Type=$1",
        'User')
    Named parameter example:
    result = cluster.query(
        "SELECT x.* FROM `default` WHERE x.Type=$type",
        type='User')

    The complete code for this page’s example can be found at n1ql_ops.py.

    The QueryMetaData provides insight into some basic profiling/timing information as well as information like the ClientContextID.

    Table 1. QueryMetaData
    Name Description

    RequestID string

    Returns the request identifer of this request.

    ClientContextID string

    Returns the context ID either generated by the SDK or supplied by the user.

    Status QueryStatus

    An enum simply representing the state of the result.

    Metrics QueryMetrics

    Returns metrics provided by the query for the request if enabled.

    Signature interface{}

    If a signature is present, it will be available to consume in a generic fashion.

    Warnings []QueryWarning

    Non-fatal errors are available to consume as warnings on this method.

    Profile interface{}

    If enabled returns additional profiling information of the query.

    Handling Results

    In most cases your query will return more than one result, and you may be looking to iterate over those results:

    
    result = cluster.query(
        "SELECT x.* FROM `default` WHERE x.Type=$1",
        'User')
    
    # iterate over rows
    for row in result:
        # each row is an instance of the query call
        name = row['username']
        age = row['age']

    Query Options

    The query service provides an array of options to customize your query. The following table lists them all:

    Table 2. Available Query Options
    Name Description

    client_context_id (str)

    Specifies a context ID string which is mirrored back from the query engine on response.

    consistent_with (MutationState)

    Specifies custom scan consistency through “at_plus” with mutation state token vectors.

    max_parallelism (int)

    The maximum number of logical cores to use in parallel for this query.

    positional_parameters (Iterable[JSON])

    Specifies the parameters used in the query, when positional notation ($1, $2, etc…) is used.

    named_parameters (dict[str,JSON])

    Specifies the parameters used in the query, when named parameter notation ($foo, $bar, etc…) is used.

    pipeline_batch (int)

    Specifies pipeline batching characteristics.

    pipeline_cap (int)

    Specifies pipeline cap characteristics.

    profile (QueryProfile)

    Specifies the profiling level to use.

    raw (dict[str,JSON])

    This is a way to to specify the query payload to support unknown commands and be future-compatible.

    scan_wait (timedelta)

    Specifies maximum amount of time to wait for a scan.

    scan_cap (int)

    Specifies the scan cap characteristics.

    metrics (bool)

    Specifies whether or not to include metrics with the QueryResult.

    timeout (timedelta)

    Uses this timeout value, rather than the default for the cluster.

    read_only (bool)

    Hint to the server that this is a read-only query.

    scan_consistency (QueryScanConsistency)

    Specify the level of consistency for the query. Overrides any setting in consistent_with. Can be either NOT_BOUNDED(), which means return what is in the index now, or REQUEST_PLUS(), which means you can read your own writes. Slower, but when you need it you have it.

    adhoc (bool)

    Specifies if the prepared statement logic should be executed internally.

    Scan Consistency

    Setting a staleness parameter for queries, with scan_consistency, enables a tradeoff between latency and (eventual) consistency.

    • A N1QL query using the default Not Bounded scan consistency will not wait for any indexes to finish updating before running the query and returning results, meaning that results are returned quickly, but the query will not return any documents that are yet to be indexed.

    • With Scan Consistency set to RequestPlus, all document changes and index updates are processed before the query is run. Select this when consistency is always more important than performance.

    • For a middle ground, AtPlus is a "read your own write" (RYOW) option, which means it just waits for the new documents that you specify to be indexed, rather than an entire index of multiple documents.

    ScanConsistency (RYOW)
    
    # create / update document (mutation)
    upsert_result = collection.upsert("id",  dict( name = "Mike", type = "User" ))
    
    # create mutation state from mutation results
    state = MutationState()
    state.add_results(upsert_result)
    
    # use mutation state with query option
    result = cluster.query("SELECT x.* FROM `default` WHERE x.Type=$1",
        'User', QueryOptions(consistent_with=state))

    Streaming Large Result Sets

    By default, the Python SDK will stream the result set from the server, where the client will start a persistent connection with the server and only read the header until the Rows are enumerated; then, each row or JSON object will be de-serialized one at a time.

    This decreases pressure on Garbage Collection and helps to prevent OutOfMemory errors.

    Additional Resources

    N1QL is not the only query option in Couchbase. Be sure to check that your use case fits your selection of query service.