Recursive CTEs for Hierarchical Data: Performance
Module: Subqueries & CTEs
**Indexing Strategy:**
```sql
-- Critical: Index the parent column
CREATE INDEX idx_parent ON employees(manager_id);
CREATE INDEX idx_parent_id ON categories(parent_id);
-- Composite index for filtered hierarchies
CREATE INDEX idx_parent_status ON employees(manager_id, status)
WHERE status = 'active';
```
**Performance Tips:**
1. **Always limit recursion depth**: Add WHERE level < max_depth to prevent runaway queries
2. **Index parent columns**: The recursive JOIN uses parent_id heavily
3. **Filter early**: Apply WHERE conditions in base case when possible
4. **Materialized views**: For static hierarchies, pre-compute and cache results
5. **Partition large tables**: Partition by department or region to reduce scan size
**Benchmarks (1M employee records, 10 levels deep):**
- Without index on manager_id: 45 seconds
- With index on manager_id: 0.8 seconds
- Materialized view (refreshed daily): 0.02 seconds
**When NOT to use recursive CTEs:**
- Very deep hierarchies (>20 levels): Consider closure table pattern
- Frequent updates: Materialized path or nested sets may be faster
- Simple parent lookup: Regular JOIN is sufficient
CREATE INDEX idx_parent ON table(parent_id) - critical for performance
Limit recursion depth to prevent runaway queries
Use materialized views for hierarchies that change infrequently
Partition large tables by department or region
Filter early in base case to reduce recursion scope
Avoid SELECT * - only select needed columns
Consider closure table pattern for very deep hierarchies (>20 levels)
Use EXPLAIN ANALYZE to identify bottlenecks
Cache results in application layer for frequently accessed hierarchies
Benchmark: 1M rows with index = 0.8s, without = 45s
Forgetting WHERE parent_id IS NULL in base case (returns empty)
Using UNION instead of UNION ALL (removes duplicates, breaks recursion)
No recursion limit causes infinite loops and timeouts
Missing index on parent column causes full table scans
Circular references (A→B→C→A) cause infinite recursion
Path column too short causes truncation (use CAST)
Incorrect JOIN condition joins wrong parent-child pairs