Subquery Optimization: Mistakes
Module: Query Optimization & Performance
SELECT c.*, (SELECT COUNT(*) FROM orders WHERE customer_id = c.id) FROM customers c
SELECT c.*, COUNT(o.id) FROM customers c LEFT JOIN orders o ON c.id = o.customer_id GROUP BY c.id
Correlated subquery in SELECT executes for every customer. 1M customers = 1M subquery executions. JOIN executes once.
Never use correlated subqueries in SELECT
Critical
Correlated subquery executes per row
SELECT * FROM products WHERE category_id IN (SELECT id FROM categories WHERE parent_id = 5)
SELECT p.* FROM products p JOIN categories c ON p.category_id = c.id WHERE c.parent_id = 5
IN processes entire subquery result and checks each product against all values. JOIN is single efficient operation.
Use JOIN instead of IN for large subqueries
High
IN with large subquery (10K rows)
SELECT * FROM customers WHERE id NOT IN (SELECT customer_id FROM orders)
SELECT c.* FROM customers c LEFT JOIN orders o ON c.id = o.customer_id WHERE o.customer_id IS NULL
If any order has NULL customer_id, NOT IN returns empty result. LEFT JOIN with NULL check handles NULLs correctly.
Use LEFT JOIN with NULL check instead of NOT IN
High
NOT IN returns no rows if subquery has NULL
SELECT o.*, (SELECT AVG(total) FROM orders o2 WHERE o2.customer_id = o1.customer_id) FROM orders o1
SELECT o.*, AVG(total) OVER (PARTITION BY customer_id) FROM orders o
Correlated subquery executes per order (1M times). Window function processes data in single pass.
Use window functions instead of scalar subqueries
High
Correlated subquery for aggregation
WHERE EXISTS (SELECT 1 FROM orders WHERE customer_id = c.id) without index
CREATE INDEX idx_orders_customer_id ON orders(customer_id); then use EXISTS
Without index, EXISTS scans entire orders table for each customer. Index provides 50-100x speedup.
Always index columns used in subquery conditions
Medium
Full table scan per customer