GROUP BY Clause

      +
      The GROUP BY clause arranges aggregate values into groups, based on one or more fields.

      Purpose

      Use the GROUP BY clause to arrange aggregate values into groups of one or more fields. This GROUP BY clause follows the WHERE clause and precedes the optional LETTING, HAVING, and ORDER BY clauses.

      Syntax

      group-by-clause ::= 'GROUP' 'BY' group-term ( ',' group-term )*
                          letting-clause? having-clause? | letting-clause
      Syntax diagram
      group-term

      Group Term

      letting-clause

      LETTING Clause

      having-clause

      HAVING Clause

      Group Term

      group-term ::= expr ( ('AS')? alias )?
      Syntax diagram

      At least one group term is required.

      expr

      String or expression representing an aggregate function or field to group together.

      alias

      Assigns another name to the group term. For details, see AS Clause.

      Assigning an alias to the group term is optional. If you assign an alias, the AS keyword may be omitted.

      LETTING Clause

      letting-clause ::= 'LETTING' alias '=' expr ( ',' alias '=' expr )*
      Syntax diagram

      [Optional] Stores the result of a sub-expression in order to use it in subsequent clauses.

      alias

      String or expression representing the name of the clause to be referred to.

      expr

      String or expression representing the value of the LETTING alias variable.

      HAVING Clause

      having-clause ::= 'HAVING' cond
      Syntax diagram

      [Optional] To return items where aggregate values meet the specified conditions.

      cond

      String or expression representing the clause of aggregate values.

      Limitations

      GROUP BY works only on a group key or aggregate function.

      A query needs a predicate on a leading index key to ensure that the optimizer can select a secondary index for the query. Without a matching predicate, the query will use the primary index. The simplest predicate is WHERE leading-index-key IS NOT MISSING. This is usually only necessary in queries which do not otherwise have a WHERE clause; for example, some GROUP BY and aggregate queries. For more details, refer to Index Selection.

      Examples

      Example 1. Group the unique landmarks by city and list the top 4 cities with the most landmarks in descending order
      SELECT city City, COUNT(DISTINCT name) LandmarkCount
      FROM `travel-sample`.inventory.landmark
      GROUP BY city
      ORDER BY LandmarkCount DESC
      LIMIT 4;
      Results
      [
        {
          "City": "San Francisco",
          "LandmarkCount": 797
        },
        {
          "City": "London",
          "LandmarkCount": 443
        },
        {
          "City": "Los Angeles",
          "LandmarkCount": 284
        },
        {
          "City": "San Diego",
          "LandmarkCount": 197
        }
      ]
      Example 2. Use LETTING to find cities that have a minimum number of things to see
      SELECT city City, COUNT(DISTINCT name) LandmarkCount
      FROM `travel-sample`.inventory.landmark
      GROUP BY city
      LETTING MinimumThingsToSee = 400
      HAVING COUNT(DISTINCT name) > MinimumThingsToSee;
      Results
      [
        {
          "City": "London",
          "LandmarkCount": 443
        },
        {
          "City": "San Francisco",
          "LandmarkCount": 797
        }
      ]
      Example 3. Use HAVING to specify cities that have more than 180 landmarks
      SELECT city City, COUNT(DISTINCT name) LandmarkCount
      FROM `travel-sample`.inventory.landmark
      GROUP BY city
      HAVING COUNT(DISTINCT name) > 180;
      Results
      [
        {
          "City": "London",
          "LandmarkCount": 443
        },
        {
          "City": "Los Angeles",
          "LandmarkCount": 284
        },
        {
          "City": "San Francisco",
          "LandmarkCount": 797
        },
        {
          "City": "San Diego",
          "LandmarkCount": 197
        }
      ]
      The above HAVING clause must use the aggregate function COUNT instead of its alias LandmarkCount.
      Example 4. Use HAVING to specify landmarks that begin with an "S" or higher
      SELECT city City, COUNT(DISTINCT name) LandmarkCount
      FROM `travel-sample`.inventory.landmark
      GROUP BY city
      HAVING city > "S"
      ORDER BY city;
      Results
      [
        {
          "City": "Sacramento",
          "LandmarkCount": 2
        },
        {
          "City": "Saint Albans",
          "LandmarkCount": 5
        },
        {
          "City": "Saint Andrews",
          "LandmarkCount": 13
        },
        {
          "City": "Saint Annes Head",
          "LandmarkCount": 1
        },
      // ...

      (execution: 1s docs: 138)

      Example 5. Using WHERE yields the same results as HAVING, however, WHERE is faster
      SELECT city City, COUNT(DISTINCT name) LandmarkCount
      FROM `travel-sample`.inventory.landmark
      WHERE city > "S"
      GROUP BY city
      ORDER BY city;
      Results
      [
        {
          "City": "Sacramento",
          "LandmarkCount": 2
        },
        {
          "City": "Saint Albans",
          "LandmarkCount": 5
        },
        {
          "City": "Saint Andrews",
          "LandmarkCount": 13
        },
        {
          "City": "Saint Annes Head",
          "LandmarkCount": 1
        },
      // ...

      (execution: 480.2ms docs: 138)

      The WHERE clause is faster because WHERE gets processed before any GROUP BY and doesn’t have access to aggregated values. HAVING gets processed after GROUP BY and is used to constrain the resultset to only those with aggregated values.
      Example 6. Using an alias for a group term
      SELECT Hemisphere, COUNT(DISTINCT name) AS LandmarkCount
      FROM `travel-sample`.inventory.landmark AS l
      GROUP BY CASE
        WHEN l.geo.lon <0 THEN "West"
        ELSE "East"
      END AS Hemisphere;
      Results
      [
        {
          "Hemisphere": "East",
          "LandmarkCount": 459
        },
        {
          "Hemisphere": "West",
          "LandmarkCount": 3885
        }
      ]
      The CASE expression categorizes each landmark into the Western hemisphere if its longitude is negative, or the Eastern hemisphere otherwise. The alias in the GROUP BY clause enables you to refer to the CASE expression in the SELECT clause.