A newer version of this documentation is available.

View Latest

GROUP BY clause

    +
    The GROUP BY clause arranges aggregate values into groups, based on one more fields.

    Purpose

    Use the GROUP BY clause to arrange aggregate values into groups of one or more fields. This GROUP BY clause follows the WHERE clause and precedes the optional LETTING, HAVING, and ORDER BY clauses.

    Syntax

    GROUP BY expr [, expr2 ]*
      [ LETTING alias = expr [, alias2 = expr2 ]* ]
      [ HAVING cond ]
    |
    LETTING alias = expr [, alias2 = expr2 ]*
    group by clause

    Arguments

    expr

    [At least one is required] String or expression representing the aggregate function or fields to group together.

    LETTING letting-clause

    [Optional] Stores the result of a sub-expression in order to use it in subsequent clauses. letting clause

    alias

    String or expression representing the name of the clause to be referred to.

    expr

    String or expression representing the value of the LETTING alias variable.

    HAVING having-clause

    [Optional] To return items where aggregate values meet the specified conditions. having clause

    cond

    String or expression representing the clause of aggregate values.

    Limitations

    GROUP BY works only on a group key or aggregate function.

    If a query selects a field which is MISSING in some documents, the optimizer will not be able to choose a secondary index which uses that field as a leading key.

    To enable the optimizer to choose the required index, you must use a WHERE clause of some kind to filter out any documents in which the required field is MISSING. The minimum filter you can use to do this is IS NOT MISSING. This is usually only necessary in queries which do not otherwise have a WHERE clause; for example, some GROUP BY and aggregate queries.

    Examples

    Example 1. Group the unique landmarks by city and list the top 4 cities with the most landmarks in descending order
    SELECT city City, COUNT(DISTINCT name) LandmarkCount
    FROM `travel-sample`
    WHERE type = "landmark"
    GROUP BY city
    ORDER BY LandmarkCount DESC
    LIMIT 4;
    Results
    [
      {
        "City": "San Francisco",
        "LandmarkCount": 797
      },
      {
        "City": "London",
        "LandmarkCount": 443
      },
      {
        "City": "Los Angeles",
        "LandmarkCount": 284
      },
      {
        "City": "San Diego",
        "LandmarkCount": 197
      }
    ]
    Example 2. Use LETTING to find cities that have a minimum number of things to see
    SELECT city City, COUNT(DISTINCT name) LandmarkCount
    FROM `travel-sample`
    WHERE type = "landmark"
    GROUP BY city
    LETTING MinimumThingsToSee = 400
    HAVING COUNT(DISTINCT name) > MinimumThingsToSee;
    Results
    [
      {
        "City": "London",
        "LandmarkCount": 443
      },
      {
        "City": "San Francisco",
        "LandmarkCount": 797
      }
    ]
    Example 3. Use HAVING to specify cities that have more than 180 landmarks
    SELECT city City, COUNT(DISTINCT name) LandmarkCount
    FROM `travel-sample`
    WHERE type = "landmark"
    GROUP BY city
    HAVING COUNT(DISTINCT name) > 180;
    Results
    [
      {
        "City": "London",
        "LandmarkCount": 443
      },
      {
        "City": "Los Angeles",
        "LandmarkCount": 284
      },
      {
        "City": "San Francisco",
        "LandmarkCount": 797
      },
      {
        "City": "San Diego",
        "LandmarkCount": 197
      }
    ]
    The above HAVING clause must use the aggregate function COUNT instead of its alias LandmarkCount.
    Example 4. Use HAVING to specify landmarks that begin with an "S" or higher
    SELECT city City, COUNT(DISTINCT name) LandmarkCount
    FROM `travel-sample`
    WHERE type = "landmark"
    GROUP BY city
    HAVING city > "S";
    138 Results in 150ms
    [
      {
        "City": "Santa Barbara",
        "LandmarkCount": 53
      },
      {
        "City": "San Francisco",
        "LandmarkCount": 797
      },
      {
        "City": "Stable Yd",
        "LandmarkCount": 1
      },
      {
        "City": "Wembley",
        "LandmarkCount": 1
      },
    ...
    Example 5. Using WHERE yields the same results as HAVING, however, WHERE is faster
    SELECT city City, COUNT(DISTINCT name) LandmarkCount
    FROM `travel-sample`
    WHERE type = "landmark"
    AND city > "S"
    GROUP BY city
    138 Results in 94ms
    [
      {
        "City": "San Luis Obispo",
        "LandmarkCount": 1
      },
      {
        "City": "Twentynine Palms",
        "LandmarkCount": 1
      },
      {
        "City": "Westlake Village",
        "LandmarkCount": 1
      },
      {
        "City": "Surrey",
        "LandmarkCount": 1
      },
    ...
    The WHERE clause is faster because WHERE gets processed before any GROUP BY and doesn’t have access to aggregated values. HAVING gets processed after GROUP BY and is used to constrain the resultset to only those with aggregated values.

    For further examples, refer to Group By and Aggregate Performance.