Troubleshooting Queries
Description — Couchbase Lite on C# — Using query.explain()
Abstract — This content describes how to use the Couchbase Lite on C#.Net Query API’s explain() method to examine a query
Related Content — Using Logs | Predictive Queries | Live Queries | Indexing
Query Explain
Usage
Query’s Explain() method can provide useful insight when you’re trying to diagnose query performance issues and-or optimize queries.
To examine how your query is working, either embed the call inside your app (see: Example 1), or use it interactively within a cblite shell (see: Example 2).
using var query =
QueryBuilder
.Select(SelectResult.All())
.From(DataSource.Collection(collection))
.Where(Expression.Property("type").EqualTo(Expression.String("hotel")))
.GroupBy(Expression.Property("country"))
.OrderBy(Ordering.Property("title").Ascending()); (1)
Console.WriteLine(query.Explain()); (2)
| 1 | Construct your query as normal |
| 2 | Call the query’s explain method; All output is sent to the application’s log file. |
cblite is a command-line tool for inspecting and querying Couchbase Lite databases.
You can download and build it from the couchbaselabs GitHub repository.
The cblite tool is not supported by the Couchbase Support Policy.
|
cblite <your-database-name>.cblite2 (1)
(cblite) select --explain domains group by country order by country, name (2)
(cblite) query --explain {"GROUP_BY":[[".country"]],"ORDER_BY":[[".country"],[".name"]],"WHAT":[[".domains"]]} (3)
| 1 | Within a terminal session, open your database with cblite and enter your query |
| 2 | This example enters the query as a SQL++ query using select |
| 3 | This example enters the query as a JSON-string using query |
Output
The output from Explain() remains the same whether invoked by an app, or cblite — see Example 3 for an example of how it looks.
SELECT fl_result(fl_value(_doc.body, 'domains')) FROM kv_default AS _doc WHERE (_doc.flags & 1 = 0) GROUP BY fl_value(_doc.body, 'country') ORDER BY fl_value(_doc.body, 'country'), fl_value(_doc.body, 'name') (1)
7|0|0| SCAN TABLE kv_default AS _doc (2)
12|0|0| USE TEMP B-TREE FOR GROUP BY
52|0|0| USE TEMP B-TREE FOR ORDER BY
{"GROUP_BY":[[".country"]],"ORDER_BY":[[".country"],[".name"]],"WHAT":[[".domains"]]} (3)
This output (Example 3) comprises 3 main elements:
| 1 | The translated SQL++ query, which targets Couchbase support and engineering teams rather than general users. |
| 2 | The SQLite query plan, which gives a high-level view of how SQLite implements the query. You can use this to identify potential issues and so optimize problematic queries. |
| 3 | The query in JSON-string format, which you can copy-and-paste directly into the cblite tool. |
The Query Plan
Format
The query plan section of the output displays a tabular form of the translated query’s execution plan. It primarily shows how SQLite retrieves the data and, where appropriate, how it sorts results for navigation and presentation purposes. For more on SQLite’s Explain Query Plan — see: https://www.sqlite.org/eqp.html
7|0|0| SCAN TABLE kv_default AS _doc (1)
12|0|0| USE TEMP B-TREE FOR GROUP BY (2)
52|0|0| USE TEMP B-TREE FOR ORDER BY (3)
| 1 | Retrieval method — This line shows the retrieval method the query uses; in this case, a sequential read of the database. You may want to optimize this behavior — see Retrieval Method. |
| 2 | Grouping method --- This line shows that the Group By clause in the query requires sorted data and uses a b-tree for temporary storage — see Order and Group. |
| 3 | Ordering method — This line shows that the Order By clause in the query requires sorted data and uses a b-tree for temporary storage — see Order and Group. |
Retrieval Method
The query optimizer attempts to retrieve the requested data items as efficiently as possible, which generally involves using 1 or more of the available indexes. The retrieval method shows the approach chosen by the optimizer — see Table 1.
| Retrieval Method | Description |
|---|---|
Search |
Here the query can view the required data directly using keys into the index. Queries using the Search mode are the fastest. |
Scan Index |
Here the query is able to retrieve the data by scanning all or part-of the index (for example when seeking to match values within a range). This type of query is slower than search, but at least benefits from the compact and ordered form of the index. |
Scan Table |
Here the query must scan the database table(s) to retrieve the required data. It’s the slowest of these methods and benefits most from optimization. |
When looking to optimize a query’s retrieval method, consider whether:
-
Providing an additional index makes sense
-
You could use an existing index — perhaps by restructuring the query to minimize wildcard use, or the reliance on functions that modify the query’s interpretation of index keys (for example, 'lower')
-
You could reduce the dataset the query requests to minimize its footprint on the database
Order and Group
The Use temp b-tree for lines in the example indicate that the query requires sorting to cater for grouping and then sorting again to present the output results.
Minimizing, if not eliminating, this ordering and re-ordering obviously reduces the time taken to process the query.
Ask "is the grouping and-or ordering absolutely necessary?": if it’s not, drop it or modify it to minimize its impact.
Queries and Indexes
Querying documents using a pre-existing database index is much faster because an index reduces the set of documents to examine — see: the Query Troubleshooting topic.
When planning the indexes you need for your database, remember that while indexes make queries faster, they may also:
-
Make writes slightly slower, because the system updates each index whenever you update a document
-
Make your Couchbase Lite database slightly larger.
Excessive indexes may hurt performance. Optimal performance depends on designing and creating the right indexes to go along with your queries.
|
Constraints
Couchbase Lite for .Net does not currently support partial value indexes; indexes with non-property expressions.
You should only index with properties that you plan to use in the query.
|
The Query optimizer converts your query into a parse tree that groups zero or more and-connected clauses together (as dictated by your where conditionals) for effective query engine processing.
Ideally, a query should be able to satisfy its requirements entirely by either directly accessing the index or searching sequential index rows. Less good is if the query must scan the whole index; although the compact nature of most indexes means this is still much faster than the alternative of scanning the entire database with no help from the indexes at all.
Searches that begin with or rely upon an inequality with the primary key are inherently less effective than those using a primary key equality.
Working with the Query Optimizer
You may have noticed that sometimes a query runs faster on a second run, or after re-opening the database, or after deleting and recreating an index. This typically happens when SQL Query Optimizer has gathered sufficient stats to recognize a means of optimizing a sub-optimal query.
If only those stats were available from the start. In fact they’re gathered after certain events, such as:
-
Following index creation
-
On a database close
-
When running a database compact.
If your analysis of the Query Explain output indicates a sub-optimal query and your rewrites fail to sufficiently optimize it, consider compacting the database. Then re-generate the Query Explain and note any improvements in optimization. They may not, in themselves, resolve the issue entirely; but they can provide useful guide to further optimizing changes you could make.
Wildcard and Like-based Queries
Like-based searches can use the index(es) only if:
-
The search-string does not start with a wildcard
-
The primary search expression uses a property that’s indexed key
-
The search-string is a constant known at run time) (that is, not a value derived during processing of the query)
To show this, you can use a modified query from the Mobile Travel Sample application, replacing an equality test with a 'LIKE'
In Like with Wildcard Prefix we use a wildcard prefix and suffix.
You can see that the query plan decides on a retrieval method of Scan Table.
| For more on indexes — see: Indexing |
using var query =
QueryBuilder
.Select(SelectResult.All())
.From(DataSource.Collection(collection))
.Where(Expression.Property("type").Like(Expression.String("%hotel%"))
.And(Function.Lower(Expression.Property("name")).Like(Expression.String("%royal%")))); (1)
Console.WriteLine(query.Explain());
| 1 | The indexed property, TYPE, cannot use its index because of the wildcard prefix. |
2|0|0| SCAN TABLE kv_default AS _doc
By contrast, removing the wildcard prefix % (in Example 5) changes the query plan’s retrieval method to an index search.
Where practical, changes like this can make significant differences in query performance.
using var query =
QueryBuilder
.Select(SelectResult.All())
.From(DataSource.Collection(collection))
.Where(Expression.Property("type").Like(Expression.String("hotel%"))
.And(Function.Lower(Expression.Property("name")).Like(Expression.String("%royal%")))); (1)
Console.WriteLine(query.Explain());
| 1 | Simply removing the wildcard prefix enables the query optimizer to access the typeIndex, which results in a more efficient search. |
3|0|0| SEARCH TABLE kv_default AS _doc USING INDEX typeIndex (<expr>>? AND <expr><?)
Use Functions Wisely
Functions are a useful tool in building queries, but be aware that they can affect whether the query-optimizer is able to use your index(es).
For example, you can observe a similar situation to that shown in Wildcard and Like-based Queries when using the Lower() function on an indexed property.
using var query =
QueryBuilder
.Select(SelectResult.All())
.From(DataSource.Collection(collection))
.Where(Function.Lower(Expression.Property("type")).EqualTo(Expression.String("hotel"))); (1)
Console.WriteLine(query.Explain());
| 1 | Here we use the Lower() function in the Where expression |
2|0|0| SCAN TABLE kv_default AS _doc
Removing the Lower() function changes things:
using var query =
QueryBuilder
.Select(SelectResult.All())
.From(DataSource.Collection(collection))
.Where(Expression.Property("type").EqualTo(Expression.String("hotel"))); (1)
Console.WriteLine(query.Explain());
| 1 | Here we have removed Lower() from the Where expression |
3|0|0| SEARCH TABLE kv_default AS _doc USING INDEX typeIndex (<expr>=?)
Knowing this, you can consider how you create the index; for example, using Lower() when you create the index and then always using lowercase comparisons.
Optimization Considerations
Try to minimize the amount of data retrieved. Limit it to the properties you do need to achieve the required result.
Consider fetching details lazily. You could break complex queries into components. For example, return just the doc-ids first, then process the array of doc-ids using either the Document API or a query that uses the doc-ids to return detailed information.
Consider using paging to minimize the data returned when the number of results returned is expected to be high.
Getting the whole lot at once is slow and resource intensive.
Instead, retrieve batches of information at a time, perhaps using the Where method’s limit(offset) feature to set a starting point for each subsequent batch.
| Using query offsets becomes less effective as the overhead of skipping a growing number of rows each time increases. You can work around this by using ranges of search-key values instead. If the last search-key value of batch 1 was 'x', that becomes the starting point for your next batch, and so on. |
Optimize document size in design. Smaller docs load faster. Break your data into logical linked units.
Consider Using Full Text Search instead of complex like or regular expression patterns — see Full Text Search.