A newer version of this documentation is available.

View Latest

UPDATE STATISTICS

  • reference
    +
    The UPDATE STATISTICS statement collects statistics on expressions over a named keyspace. These statistics are used by the cost-based optimizer when choosing the optimal plan to execute a query.

    Purpose

    The cost-based optimizer uses statistics to generate a query plan with the lowest processing cost, for SELECT, UPDATE, DELETE, MERGE, and INSERT INTO with SELECT queries.

    After creating the indexes, and before running a query, you must use the UPDATE STATISTICS statement to gather distribution statistics for the expressions used in the query. The cost-based optimizer uses these distribution statistics, together with keyspace and index statistics, to select the optimal indexes and generate the query plan.

    Syntax

    update-statistics ::= update-statistics-expressions | update-statistics-index |
                          updated-statistics-indexes | update-statistics-delete
    update-statistics-expressions | update-statistics-index | updated-statistics-indexes | update-statistics-delete

    The UPDATE STATISTICS statement provides several different syntaxes to enable you to manage statistics. For further details, refer to the following links.

    Usage

    When you use an index with a query, you typically create the index on the fields which the query uses to filter. To use the cost-based optimizer with that query, you must collect statistics on the same fields that you used to create the index.

    A query may have predicates on non-indexed fields, and you can collect statistics on those fields also to help the optimizer.

    For a query which filters on an array or array of objects, you must collect the statistics using exactly the same expression that you used to create the index.

    Result

    The statement returns an empty array.

    The UPDATE STATISTICS stores statistics in a system-like collection called N1QL_CBO_STATS. This is stored in a scope called N1QL_SYSTEM_SCOPE within a bucket called N1QL_SYSTEM_BUCKET. This bucket, scope, and collection are created automatically, if they do not already exist.

    Security for Optimizer Statistics

    Access to this bucket, scope, and collection is restricted to system administrators. Regular users do not have access to the bucket, scope, or collection. Users with system administration privileges can access or drop them. If the bucket, scope, or collection is dropped, all optimizer statistics stored in the collection will be lost. A new bucket, scope, and collection will be created again the next time an UPDATE STATISTICS statement is run.

    Memory Quota for Optimizer Statistics

    When the N1QL_SYSTEM_BUCKET is created, it has the minimum memory quota of 100 MB. It is recommended that system administrators monitor the memory usage of this bucket. For large clusters, when this bucket gets big, increase the memory quota for this bucket such that it does not fall under 15% memory-resident. (This is recommended for all buckets to avoid potential issues with rebalancing, etc.)

    Optimizer Statistics for Multiple Query Nodes

    When a cluster has multiple Query nodes, statistics gathered from one Query node via the UPDATE STATISTICS statement are automatically propagated to the other Query nodes. It is not necessary to run UPDATE STATISTICS with the same index expression for multiple query nodes.