SQL Practice Logo

SQLPractice Online

Subquery Optimization: Performance

Module: Query Optimization & Performance

**Correlated Subquery Cost:**

Example: 1M customers, 10M orders

Correlated in SELECT:

SELECT c.*, (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) FROM customers c;

- Executions: 1M subquery runs

- Scans: 1M full table scans of orders (10M rows each)

- Comparisons: 10 trillion

- Time: 45-60 seconds

JOIN alternative:

SELECT c.*, COUNT(o.id) FROM customers c LEFT JOIN orders o ON c.id = o.customer_id GROUP BY c.id;

- Executions: 1 join

- Scans: 1 scan of each table

- Comparisons: 11M

- Time: 0.5 seconds (90-120x faster)

**IN vs EXISTS vs JOIN:**

| Approach | Time | Rows Processed | Short-Circuit | NULL Safe |

|----------|------|----------------|---------------|------------|

| IN (small) | 0.5s | All subquery rows | No | No |

| IN (large) | 8s | All subquery rows | No | No |

| EXISTS | 1.2s | Until first match | Yes | Yes |

| JOIN | 0.8s | All matching rows | No | Yes |

Rule: EXISTS for existence, JOIN for data retrieval

**Subquery Materialization:**

Modern optimizers can materialize subqueries:

- Execute subquery once

- Store results in temp table

- Use temp table in outer query

But: Not all subqueries can be materialized

- Correlated subqueries: NO

- Large result sets: May spill to disk

- Complex subqueries: May not optimize

**Index Impact:**

Correlated subquery without index:

WHERE EXISTS (SELECT 1 FROM orders WHERE customer_id = c.id)

- Full table scan per customer

- 1M customers × 10M orders = 10 trillion comparisons

- Time: 60+ seconds

With index on orders(customer_id):

- Index lookup per customer

- 1M customers × log(10M) ≈ 23M comparisons