A newer version of this documentation is available.

View Latest

Appendix 2: Performance Tuning

      +

      The SET statement can be used to override some cluster-wide configuration parameters for a specific request:

      SetStmnt
      SetStmnt

      As parameter identifiers are qualified names (containing a '.') they have to be escaped using backticks (``). Note that changing query parameters will not affect query correctness but only impact performance characteristics, such as response time and throughput.

      Parallelism Parameter

      The system can execute each request using multiple cores on multiple machines (a.k.a., partitioned parallelism) in a cluster. A user can manually specify the maximum execution parallelism for a request to scale it up and down using the following parameter:

      • compiler.parallelism: the maximum number of CPU cores can be used to process a query. There are three cases of the value p for compiler.parallelism:

        • p < 0 or p > the total number of cores in a cluster: the system will use all available cores in the cluster;

        • p = 0 (the default): the system will use the storage parallelism (the number of partitions of stored datasets) as the maximum parallelism for query processing;

        • all other cases: the system will use the user-specified number as the maximum number of CPU cores to use for executing the query.

      Example
      SET `compiler.parallelism` "16";
      
      SELECT c.name AS cname, o.orderno AS orderno
      FROM customers c JOIN orders o ON c.custid = o.custid;

      Memory Parameters

      In the system, each blocking runtime operator such as join, group-by and order-by works within a fixed memory budget, and can gracefully spill to disks if the memory budget is smaller than the amount of data they have to hold. A user can manually configure the memory budget of those operators within a query. The supported configurable memory parameters are:

      • compiler.groupmemory: the memory budget that each parallel group-by operator instance can use; 32MB is the default budget.

      • compiler.sortmemory: the memory budget that each parallel sort operator instance can use; 32MB is the default budget.

      • compiler.joinmemory: the memory budget that each parallel hash join operator instance can use; 32MB is the default budget.

      • compiler.windowmemory: the memory budget that each parallel window aggregate operator instance can use; 32MB is the default budget.

      For each memory budget value, you can use a 64-bit integer value with a 1024-based binary unit suffix (for example, B, KB, MB, GB). If there is no user-provided suffix, "B" is the default suffix. See the following examples.

      Example
      SET `compiler.groupmemory` "64MB";
      
      SELECT c.custid, COUNT(*)
      FROM customers c
      GROUP BY c.custid;
      Example
      SET `compiler.sortmemory` "67108864";
      
      SELECT VALUE o
      FROM orders AS o
      ORDER BY ARRAY_LENGTH(o.items) DESC;
      Example
      SET `compiler.joinmemory` "132000KB";
      
      SELECT c.name AS cname, o.ordeno AS orderno
      FROM customers c JOIN orders o ON c.custid = o.custid;

      Parallel Sort Parameter

      The following parameter enables you to activate or deactivate full parallel sort for order-by operations.

      When full parallel sort is inactive (false), each existing data partition is sorted (in parallel), and then all data partitions are merged into a single node.

      When full parallel sort is active (true), the data is first sampled, and then repartitioned so that each partition contains data that is greater than the previous partition. The data in each partition is then sorted (in parallel), but the sorted partitions are not merged into a single node.

      • compiler.sort.parallel: A boolean specifying whether full parallel sort is active (true) or inactive (false). The default value is false.

      Example
      SET `compiler.sort.parallel` "true";
      
      SELECT VALUE o
      FROM orders AS o
      ORDER BY ARRAY_LENGTH(o.items) DESC;

      Array Index Parameter

      The following parameter enables you to choose whether the query optimizer should replace applicable data scans with array index lookups.

      When this setting is true, the query optimizer attempts to utilize array indexes if they are available. You can set this property to false to make query optimizer skip array indexes.

      • compiler.arrayindex: A boolean specifying whether array indexes will be considered as an access method for applicable queries. The default value is true.

      Example
      SET `compiler.arrayindex` "false";
      
      SELECT o.orderno
      FROM orders o
      WHERE SOME i IN o.items
      SATISFIES i.price = 19.91;