Indexes can speedup both selection queries and join queries if they are applied properly. The following two sections describe scenarios that indexes are explored inside the system for query processing purpose.
The query optimizer chooses to use a secondary index for query execution if both of the following conditions are met:
The query contains a conjunctive equality or range predicate over one or more fields, or a join predicate (see the next section). The conjunctive predicate has a form:
QualifiedName Operator Literal ( AND field Operator Literal )+
There is an index with a key, such that the corresponding fields in the predicate form a prefix of that key. For example, suppose that there is an index on dataset
foowith key fields
CREATE INDEX idx_s_d ON foo(c_s:STRING, c_d:DOUBLE);
The following query uses the index because it has an equality predicate (=) on a field (c_s) that is a prefix of the indexed key (c_s, c_d):
SELECT f.c_x as res FROM foo f WHERE f.c_s = 'world';
If you would like an available index to not be used for a particular query predicate (e.g., because there will be many, many matching objects), a skip-index hint can be used:
SELECT f.c_x as res FROM foo f WHERE f.c_s /*+ skip-index */ = 'world';
If multiple indexes are eligible access paths, there can be two cases:
Two or more indexes are sharing the same prefix and the predicate is on that prefix: For example:
indexAis on (c1, c2),
indexBis on (c1, c3) and the predicate is on c1 (c1 = 100). In this case, the query optimizer picks the first index in the order of their names, e.g.,
No indexes share the same prefix and the predicate refers to fields from each individual index. For example:
indexAin on (c1) and
indexBis on (c2), the predicate is on c1 and c2 (c1 = 100 and c2 = 200). In this case, both indexes will be used to retrieve matched primary keys and then the key sets will be intersected to further filter retrieved primary keys.
N1QL for Analytics supports joins from standard SQL in the following forms:
SELECT * FROM ds_outer, ds_inner WHERE <predicate>; SELECT * FROM ds_outer JOIN ds_inner ON <predicate>;
Left outer join:
SELECT * FROM ds_outer LEFT OUTER JOIN ds_inner ON <predicate>;
ds_outer is the
outer branch and ds_inner is the
(in the order in which they appear in the FROM clause).
The query optimizer picks an index for join evaluation if the following conditions are met:
The join predicate is an equality or range predicate that refers to fields from both branches of the join, in the form of:
fn(expr_outer) /*+ indexnl */ op field_inner
field_inneris a field from the inner dataset on which the index is defined;
The index join hint, /*+ indexnl */, is provided for the join predicate;
fn() is a function that returns the same data type as the type specified in the index for the
field_inner. Usually, fn() is one of the following: to_string(), to_double(), or to_bigint().
Note that if the type of
expr_outer is not known at compile time therefore the index join cannot be
selected if the join predicate is just
expr_outer /*+ indexnl */ op field_inner.
For example, suppose there are two shadow datasets: foo1 and foo2, with an index on foo2:
CREATE index idx_f2 ON foo2(c_s2:string);
Then the following query would use this index for join evaluation:
SELECT f1.c_x1 as c1, f2.c_x2 as c2 FROM foo1 AS f1, foo2 AS f2 WHERE to_string(f1.c_s1) /*+ indexnl */ = f2.c_s2