SQL Practice Logo

SQLPractice Online

Recursive CTEs for Hierarchical Data: Performance

Module: Subqueries & CTEs

**Indexing Strategy:**

```sql

-- Critical: Index the parent column

CREATE INDEX idx_parent ON employees(manager_id);

CREATE INDEX idx_parent_id ON categories(parent_id);

-- Composite index for filtered hierarchies

CREATE INDEX idx_parent_status ON employees(manager_id, status)

WHERE status = 'active';

```

**Performance Tips:**

1. **Always limit recursion depth**: Add WHERE level < max_depth to prevent runaway queries

2. **Index parent columns**: The recursive JOIN uses parent_id heavily

3. **Filter early**: Apply WHERE conditions in base case when possible

4. **Materialized views**: For static hierarchies, pre-compute and cache results

5. **Partition large tables**: Partition by department or region to reduce scan size

**Benchmarks (1M employee records, 10 levels deep):**

- Without index on manager_id: 45 seconds

- With index on manager_id: 0.8 seconds

- Materialized view (refreshed daily): 0.02 seconds

**When NOT to use recursive CTEs:**

- Very deep hierarchies (>20 levels): Consider closure table pattern

- Frequent updates: Materialized path or nested sets may be faster

- Simple parent lookup: Regular JOIN is sufficient

CREATE INDEX idx_parent ON table(parent_id) - critical for performance

Limit recursion depth to prevent runaway queries

Use materialized views for hierarchies that change infrequently

Partition large tables by department or region

Filter early in base case to reduce recursion scope

Avoid SELECT * - only select needed columns

Consider closure table pattern for very deep hierarchies (>20 levels)

Use EXPLAIN ANALYZE to identify bottlenecks

Cache results in application layer for frequently accessed hierarchies

Benchmark: 1M rows with index = 0.8s, without = 45s

Forgetting WHERE parent_id IS NULL in base case (returns empty)

Using UNION instead of UNION ALL (removes duplicates, breaks recursion)

No recursion limit causes infinite loops and timeouts

Missing index on parent column causes full table scans

Circular references (A→B→C→A) cause infinite recursion

Path column too short causes truncation (use CAST)

Incorrect JOIN condition joins wrong parent-child pairs