GROUP BY
GROUP BY turns detail rows into summary rows. You use it for reporting, dashboards, and data quality checks.
Concept Flow
flowchart LR
A[Raw rows] --> B[Filter with WHERE]
B --> C[Group rows]
C --> D[Aggregate values]
D --> E[Sort and present]
Syntax Patterns
| Pattern | When to use | Example |
|---|---|---|
| Single key grouping | One summary per category | GROUP BY department_id |
| Multi-key grouping | Hierarchical summaries | GROUP BY country, city |
| Group + aggregate | Compute metrics per group | COUNT(*), SUM(amount) |
SELECT department_id, COUNT(*) AS employee_count
FROM employees
GROUP BY department_id;
Practical Example
Sales by Region
SELECT region,
COUNT(*) AS orders,
SUM(total_amount) AS revenue,
AVG(total_amount) AS avg_order
FROM orders
WHERE order_date >= '2026-01-01'
GROUP BY region
ORDER BY revenue DESC;
| region | orders | revenue | avg_order |
|---|---|---|---|
| APAC | 210 | 129450.00 | 616.43 |
| EU | 175 | 102300.00 | 584.57 |
| US | 320 | 189100.00 | 590.94 |
Common Pitfalls
| Pitfall | Consequence | Prevention |
|---|---|---|
| Selecting non-grouped, non-aggregated columns | Query fails in strict SQL mode or returns ambiguous data | Select only grouped columns or aggregate functions |
| Grouping too many columns | Fragmented result with tiny groups | Start with reporting key, then add columns intentionally |
Missing pre-filter with WHERE | Slower scans and noisy metrics | Restrict input rows before grouping |
Quick Reference
SELECT key_col, COUNT(*) FROM t GROUP BY key_col;
SELECT key_col, SUM(amount) FROM t WHERE status = 'paid' GROUP BY key_col;
SELECT key_a, key_b, AVG(score) FROM t GROUP BY key_a, key_b;