Aggregate Query Optimization: Performance
Module: Query Optimization & Performance
**GROUP BY Performance:**
No index (sort-based grouping):
SELECT department, COUNT(*) FROM employees GROUP BY department;
- Algorithm: Full scan + sort + aggregate
- Rows: 10M
- Comparisons: 10M × log(10M) = 230M
- Memory: 500MB for sort
- Time: 12 seconds
With index (index-based grouping):
CREATE INDEX idx_dept ON employees(department);
- Algorithm: Sequential index scan + aggregate
- Rows: 10M (but pre-sorted)
- Comparisons: 10M (linear)
- Memory: Minimal (streaming)
- Time: 0.8 seconds (15x faster)
**Covering Index Impact:**
Partial index:
CREATE INDEX idx_dept ON employees(department);
SELECT department, AVG(salary) FROM employees GROUP BY department;
- Index scan: 10M rows
- Table lookups: 10M (to get salary)
- I/O: 10M random reads
- Time: 8 seconds
Covering index:
CREATE INDEX idx_dept_salary ON employees(department, salary);
- Index-only scan: 10M rows
- Table lookups: 0
- I/O: Sequential index read
- Time: 0.5 seconds (16x faster)
**WHERE Filtering Impact:**
| Rows Before | Rows After WHERE | Aggregate Time | Speedup |
|-------------|------------------|----------------|----------|
| 10M | 10M (no filter) | 8s | 1x |
| 10M | 5M (50% filter) | 4s | 2x |
| 10M | 1M (90% filter) | 0.8s | 10x |
| 10M | 100K (99% filter) | 0.08s | 100x |
Rule: Every 10x reduction in rows = 10x faster aggregate.
**Materialized View Trade-offs:**
Direct query:
- Compute time: 25 seconds per request