SQL Practice Logo

SQLPractice Online

Subquery Optimization: Concept

Module: Query Optimization & Performance

Subquery optimization is about making queries-within-queries fast. Subqueries come in two types: non-correlated (runs once) and correlated (runs for every row - expensive!). Most slow subqueries can be rewritten as JOINs, CTEs, or window functions for 10-1000x speedup.

Think of it like looking up information. Bad approach: for each customer, search through all orders to find theirs (correlated subquery). Good approach: join customers with orders once, then filter (JOIN). The difference: 1 million lookups vs 1 join.

**Two Types of Subqueries:**

1. **Non-Correlated Subquery** - Independent, executes once

- Example: WHERE customer_id IN (SELECT id FROM customers WHERE country = 'US')

- Performance: Executes once, result cached

- Cost: O(n) for subquery + O(m) for outer query

- When OK: Small result sets (<1000 rows)

2. **Correlated Subquery** - References outer query, executes per row

- Example: WHERE EXISTS (SELECT 1 FROM orders WHERE customer_id = c.id)

- Performance: Executes for EVERY outer row

- Cost: O(n * m) - quadratic time

- Problem: 1000 customers = 1000 subquery executions

**Performance Impact:**

Correlated subquery in SELECT (worst case):

SELECT c.name, (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) FROM customers c;

- 1M customers = 1M subquery executions

- Each subquery scans orders table

- Total: 1M table scans

- Time: 45+ seconds

JOIN alternative:

SELECT c.name, COUNT(o.id) FROM customers c LEFT JOIN orders o ON c.id = o.customer_id GROUP BY c.id;

- 1 join operation

- Time: 0.5 seconds (90x faster)

**Subquery Locations:**

1. **SELECT clause** (scalar subquery) - WORST for performance

- Executes per row

- Use window functions or JOINs instead

2. **WHERE clause** - Common, can be optimized

- IN: OK for small lists, slow for large

- EXISTS: Better than IN, short-circuits

- JOIN: Usually fastest

3. **FROM clause** (derived table) - Usually OK

- Executes once

- Modern optimizers handle well

- CTEs more readable

**Optimization Strategies:**

1. **Convert to JOIN**

- Correlated subquery → JOIN with GROUP BY

- IN subquery → INNER JOIN