Subquery Optimization: Mistakes

Module: Query Optimization & Performance

SELECT c.*, (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) FROM customers c

SELECT c.*, COUNT(o.id) FROM customers c LEFT JOIN orders o ON c.id = o.customer_id GROUP BY c.id

Correlated subquery in SELECT executes for every customer. 1M customers = 1M subquery executions. JOIN executes once.

Never use correlated subqueries in SELECT

Critical

Correlated subquery executes per row

SELECT * FROM products WHERE category_id IN (SELECT id FROM categories WHERE parent_id = 5)

SELECT p.* FROM products p JOIN categories c ON p.category_id = c.id WHERE c.parent_id = 5

IN processes entire subquery result and checks each product against all values. JOIN is single efficient operation.

Use JOIN instead of IN for large subqueries

High

IN with large subquery (10K rows)

SELECT * FROM customers WHERE id NOT IN (SELECT customer_id FROM orders)

SELECT c.* FROM customers c LEFT JOIN orders o ON c.id = o.customer_id WHERE o.customer_id IS NULL

If any order has NULL customer_id, NOT IN returns empty result. LEFT JOIN with NULL check handles NULLs correctly.

Use LEFT JOIN with NULL check instead of NOT IN

High

NOT IN returns no rows if subquery has NULL

SELECT o.*, (SELECT AVG(total) FROM orders o2 WHERE o2.customer_id = o1.customer_id) FROM orders o1

SELECT o.*, AVG(total) OVER (PARTITION BY customer_id) FROM orders o

Correlated subquery executes per order (1M times). Window function processes data in single pass.

Use window functions instead of scalar subqueries

High

Correlated subquery for aggregation

WHERE EXISTS (SELECT 1 FROM orders WHERE customer_id = c.id) without index

CREATE INDEX idx_orders_customer_id ON orders(customer_id); then use EXISTS

Without index, EXISTS scans entire orders table for each customer. Index provides 50-100x speedup.

Always index columns used in subquery conditions

Medium

Full table scan per customer