Simulate Partial Rollup with Rollup and Grouping Sets

Paul Bares
3 min readSep 3, 2023

--

Not all SQL databases support the Partial Rollup SQL syntax. For the record, Partial Rollup reduces the number of subtotals produced by the regular Rollup syntax.

For instance, with Rollup:

select category, subcategory, sum(sales) as `Annual Sales` from order group by rollup(category, subcategory)
+------------------+------------------+--------------+
| category | subcategory | Annual Sales |
+------------------+------------------+--------------+
| null | null | 36200.0 |
| Appliances | null | 4600.0 |
| Appliances | Large Appliances | 3000.0 |
| Appliances | Small Appliances | 1600.0 |
| Garden & Outdoor | null | 12000.0 |
| Garden & Outdoor | Garden | 7000.0 |
| Garden & Outdoor | Outdoor | 5000.0 |
| Home & Kitchen | null | 15000.0 |
| Home & Kitchen | Home | 5000.0 |
| Home & Kitchen | Kitchen | 10000.0 |
+------------------+------------------+--------------+

You might not be interested in seeing the Grand Total i.e the first row of the table. This is the goal of Partial Rollup:

select category, subcategory, sum(sales) as `Annual Sales` from order group by group by `category`, rollup(`subcategory`)

Note that depending on the SQL database, the syntax may differ a little.

+------------------+------------------+--------------+
| category | subcategory | Annual Sales |
+------------------+------------------+--------------+
| Appliances | null | 4600.0 |
| Appliances | Large Appliances | 3000.0 |
| Appliances | Small Appliances | 1600.0 |
| Garden & Outdoor | null | 12000.0 |
| Garden & Outdoor | Garden | 7000.0 |
| Garden & Outdoor | Outdoor | 5000.0 |
| Home & Kitchen | null | 15000.0 |
| Home & Kitchen | Home | 5000.0 |
| Home & Kitchen | Kitchen | 10000.0 |
+------------------+------------------+--------------+

It is supported by DuckDB, Apache Spark and Snowflake.

Partial Rollup not supported but Grouping Sets is available

This is the case for instance with ClickHouse or BigQuery.

In Grouping Sets function, you define the list of group of aggregates you want. In the below example, it means you want to see “aggregates by category and subcategory” and “aggregate by category”. The Grand Total is indicated with () in grouping sets. The below example will produce the same result as above because () (Grand Total is not in the grouping sets)

group by grouping sets ((category, subcategory), (category))
select category, subcategory, sum(sales) as `Annual Sales` from order group by grouping sets ((category, subcategory),(category))

To not have to think about partial rollup, grouping sets are supported or not, you can try SquashQL. It will generate the right SQL query depending on the underlying database. The corresponding SQL query written with SquashQL TypeScript SQL-like query builder looks like this and it will work with Apache Spark, ClickHouse, BigQuery, Snowflake and DuckDB.

const query = from("order")
.select(["category", "subcategory"], [], [sum("Annual Sales", "sales")])
.rollup("subcategory"])
.build()
+------------------+------------------+--------------+
| category | subcategory | Annual Sales |
+------------------+------------------+--------------+
| Appliances | Total | 4600.0 |
| Appliances | Large Appliances | 3000.0 |
| Appliances | Small Appliances | 1600.0 |
| Garden & Outdoor | Total | 12000.0 |
| Garden & Outdoor | Garden | 7000.0 |
| Garden & Outdoor | Outdoor | 5000.0 |
| Home & Kitchen | Total | 15000.0 |
| Home & Kitchen | Home | 5000.0 |
| Home & Kitchen | Kitchen | 10000.0 |
+------------------+------------------+--------------+

null values corresponding to totals/subtotals are replaced with Total values. See this article to know how it works.

I built a small UI with Next.js and DuckDB Wasm to see the SQL equivalence between Partial Rollup and Grouping Sets

https://squashql.github.io/squashql-showcase/duckdb

Do you want to try it? Check out https://github.com/squashql/squashql and follow the tutorial.

Claps, shares and stars ⭐ are very much appreciated!

--

--

Paul Bares
Paul Bares

Written by Paul Bares

I'm an enthusiast in computer hardware and programming. I specialize in high performance and parallel computing. Co-Creator of SquashQL Github: squashql

No responses yet