Materialized CTEs: Performance
Module: Subqueries & CTEs
**Benchmarks (50M row orders table):**
| Scenario | Without MATERIALIZED | With MATERIALIZED | Improvement |
|----------|---------------------|-------------------|-------------|
| CTE used 1 time | 5s | 5.2s | -4% (overhead) |
| CTE used 2 times | 10s | 5.5s | 45% faster |
| CTE used 3 times | 15s | 6s | 60% faster |
| CTE used 5 times | 25s | 7s | 72% faster |
**Key Insights:**
1. **Materialization has overhead**: Storing results takes time/memory
2. **Break-even point**: Usually 2 references
3. **More references = bigger win**: 5 references = 72% faster
**Memory Considerations:**
Materialized CTEs store results in memory (or temp disk if too large). Monitor:
- work_mem (PostgreSQL): Increase if CTE results spill to disk
- tempdb (SQL Server): Watch for tempdb contention
**Best Practices:**
1. **Profile first**: Use EXPLAIN ANALYZE to measure actual impact
2. **Start with MATERIALIZED**: If CTE used 2+ times, default to MATERIALIZED
3. **Monitor memory**: Large materialized CTEs can cause memory pressure
4. **Test both ways**: Sometimes optimizer surprises you
5. **Document decisions**: Comment why you materialized or not
**Anti-Patterns:**
1. **Materializing single-use CTEs**: Adds overhead for no benefit
2. **Not checking execution plans**: Assumptions can be wrong
3. **Materializing simple lookups**: SELECT * FROM table WHERE id = 1 is cheap
4. **Ignoring memory limits**: 10GB materialized CTE can crash query
Break-even point: Usually 2 references (1 reference = overhead)
More references = bigger win (5 references = 72% faster)
Materialization stores results in work_mem (PostgreSQL)
Increase work_mem if materialized CTE spills to disk
Monitor tempdb usage (SQL Server) for materialization overhead
Large materialized CTEs (>1GB) can cause memory pressure
Benchmark: 50M rows, 3 uses = 60% faster with MATERIALIZED
Use indexes on base tables to speed up CTE computation
Consider partitioning for very large base tables
Cache frequently-used materialized results in application layer
Assuming CTEs always materialize (PostgreSQL 12+ inlines by default)
Not checking execution plan (assumptions can be wrong)
Materializing single-use CTEs (adds overhead for no benefit)
Materializing simple lookups (SELECT * FROM table WHERE id = 1)