Common mistake in Correlated Subqueries Deep Dive

A correlated subquery that computes a running total has O(N²) complexity: for each row, it re-sums all earlier rows. For 10,000 orders that is 1 + 2 + 3 + ... + 10,000, or about 50 million row additions. A window function does the same job in a single pass over the ordered data, carrying the running sum forward, so the summing itself is O(N). For 1,000 orders: the correlated version takes 2500ms, the window version takes 18ms (about 140x faster). For 10,000 orders: correlated takes 25 seconds, window takes 180ms. For 100,000 orders: correlated times out, window takes 1.8 seconds.
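The two approaches compared above might look like the following. This is only a sketch: the orders table and its order_id, order_date, and amount columns are assumed for illustration and do not come from the original lesson, and order_id is assumed to be unique and increasing.

```sql
-- Assumed (hypothetical) schema: orders(order_id, order_date, amount)

-- Correlated subquery: for each outer row, re-sum every earlier row
SELECT
    o1.order_id,
    o1.order_date,
    o1.amount,
    (SELECT SUM(o2.amount)
     FROM orders o2
     WHERE o2.order_id <= o1.order_id) AS running_total
FROM orders o1
ORDER BY o1.order_id;

-- Window function: one ordered pass that carries the sum forward
SELECT
    order_id,
    order_date,
    amount,
    SUM(amount) OVER (
        ORDER BY order_id
        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
    ) AS running_total
FROM orders
ORDER BY order_id;
```

Both queries should return the same running_total per row; only the amount of work differs. The explicit ROWS frame is spelled out so the window covers exactly the rows up to the current one.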

Each correlated subquery executes once per outer row, so three subqueries cost roughly three times as much as one. For 5,000 customers, that is 15,000 subquery executions. All three query the same table (orders) with the same filter (customer_id), so it is much more efficient to JOIN once and compute all three aggregates in a single pass. A LEFT JOIN ensures that customers with no orders are still included, and COALESCE replaces the NULL aggregates those customers would otherwise get.
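A sketch of the rewrite described above. The customers and orders schemas, and the choice of COUNT, SUM, and MAX as the three aggregates, are assumptions made for illustration; the original lesson's query is not shown here.

```sql
-- Assumed (hypothetical) schema:
--   customers(customer_id, name)
--   orders(order_id, customer_id, amount, order_date)

-- Three correlated subqueries: orders is probed three times per customer
SELECT
    c.customer_id,
    c.name,
    (SELECT COUNT(*)          FROM orders o WHERE o.customer_id = c.customer_id) AS order_count,
    (SELECT SUM(o.amount)     FROM orders o WHERE o.customer_id = c.customer_id) AS total_spent,
    (SELECT MAX(o.order_date) FROM orders o WHERE o.customer_id = c.customer_id) AS last_order_date
FROM customers c;

-- One LEFT JOIN and one aggregation pass
SELECT
    c.customer_id,
    c.name,
    COUNT(o.order_id)          AS order_count,     -- 0 when there are no orders
    COALESCE(SUM(o.amount), 0) AS total_spent,     -- NULL -> 0 when there are no orders
    MAX(o.order_date)          AS last_order_date  -- stays NULL when there are no orders
FROM customers c
LEFT JOIN orders o
       ON o.customer_id = c.customer_id
GROUP BY c.customer_id, c.name;
```

Note that COUNT(o.order_id) rather than COUNT(*) is used in the joined version, so a customer with no orders is counted as 0 instead of 1. The first query returns NULL for total_spent in that case, while the second returns 0 because of COALESCE.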

A correlated subquery executes once per outer row. If the outer query returns 100,000 rows but you only need 5,000, you are paying for 95,000 unnecessary subquery executions. Filter the outer query first to reduce the execution count. A WHERE clause on the outer query is cheap by comparison, especially when it can use an index, whereas each subquery execution is expensive. Cutting the outer rows by 95% removes 95% of the subquery executions, and the subquery cost drops roughly in proportion.
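As an illustration of filtering the outer query first, the sketch below assumes a hypothetical customers.status column as the filter; the table and column names are not from the original lesson.

```sql
-- Assumed (hypothetical) schema:
--   customers(customer_id, name, status)
--   orders(order_id, customer_id, amount)

-- Unfiltered: the subquery runs for every customer,
-- even if the caller discards most of the rows afterwards
SELECT
    c.customer_id,
    c.name,
    (SELECT SUM(o.amount)
     FROM orders o
     WHERE o.customer_id = c.customer_id) AS total_spent
FROM customers c;

-- Filtered: the subquery runs only for the customers that are needed
SELECT
    c.customer_id,
    c.name,
    (SELECT SUM(o.amount)
     FROM orders o
     WHERE o.customer_id = c.customer_id) AS total_spent
FROM customers c
WHERE c.status = 'active';
```

An index on the filter column (here, the assumed status column) helps the outer WHERE clause stay cheap when the filter is selective.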

Correlated Subqueries Deep Dive: Mistakes

Module: Subqueries & CTEs

-- Correlated subquery without index - DISASTER!
SELECT
    e1.name,
    e1.salary,
    (SELECT AVG(e2.salary)
     FROM employees e2
     WHERE e2.department = e1.department) AS dept_avg
FROM employees e1;
-- No index on (department, salary)
-- For 10,000 employees:
--   10,000 full table scans
--   100 million row reads
--   Query time: 8000ms (8 seconds!)

-- Add composite index first
CREATE INDEX idx_employees_dept_salary
    ON employees(department, salary);

-- Now query is much faster
SELECT
    e1.name,
    e1.salary,
    (SELECT AVG(e2.salary)
     FROM employees e2
     WHERE e2.department = e1.department) AS dept_avg
FROM employees e1;
-- Query time: 450ms (18x faster!)

-- Or use window function (best)
SELECT
    name,
    salary,
    AVG(salary) OVER (PARTITION BY department) AS dept_avg
FROM employees;
-- Query time: 95ms (84x faster than no index!)

A correlated subquery executes once per outer row. Without an index, each execution does a full table scan. For 10,000 employees, that is 10,000 full scans of 10,000 rows each, or 100 million row reads. A composite index on (department, salary) enables an index seek, so each execution reads only the ~50 rows for that department instead of all 10,000. Even better, a window function scans the table once (10,000 rows total) instead of 10,000 times.

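ALWAYS create a composite index on (foreign_key, aggregate_column) for correlated subqueries, that is, the column used in the correlation predicate followed by the column being aggregated. Run EXPLAIN to verify that the index is actually used. Consider a window function as a faster alternative.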
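One way to run the check recommended above is sketched below, reusing the employees query and the idx_employees_dept_salary index from this section. The exact EXPLAIN syntax and the wording of the plan depend on the database engine, so treat this as a starting point rather than a fixed recipe.

```sql
-- PostgreSQL and MySQL accept plain EXPLAIN;
-- SQLite uses EXPLAIN QUERY PLAN instead.
EXPLAIN
SELECT
    e1.name,
    e1.salary,
    (SELECT AVG(e2.salary)
     FROM employees e2
     WHERE e2.department = e1.department) AS dept_avg
FROM employees e1;

-- In the plan for the subquery, look for a reference to
-- idx_employees_dept_salary (an index scan or seek) rather than
-- a sequential / full scan of employees.
```

If the plan still shows a full scan of employees inside the subquery, the index is not being used and the per-row cost described above still applies.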

Critical

Extremely slow query due to missing index on correlated columns

graph TB

A["10,000 Employees"] --> B["No Index"]

B --> C["10,000 × Full Scan<br/>100M row reads<br/>8000ms ❌"]

A --> D["With Index"]