Join Optimization Strategies: Mistakes
Module: Query Optimization & Performance
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id
CREATE INDEX idx_orders_customer_id ON orders(customer_id); then run query
Without index on join column, database scans entire customers table for each order row. 1000 orders × 100K customers = 100M comparisons.
Always index foreign keys first
Critical
Full table scan on customers for each order
SELECT * FROM orders o JOIN customers c ON o.customer_id = c.id WHERE o.order_date > '2024-01-01'
CREATE INDEX idx_orders_date_customer ON orders(order_date, customer_id)
Composite index allows filtering by date first (10M to 500K rows), then joining. Single-column index joins all 10M rows first.
Composite index: filter columns first, join columns second
High
Index only on customer_id, filter applied after join
FROM order_items oi JOIN orders o ... WHERE o.order_date > '2024-01-01'
FROM orders o JOIN order_items oi ... WHERE o.order_date > '2024-01-01'
Starting with orders allows filtering first (10M to 500K), then joining to order_items. Starting with order_items joins all 50M rows.
Start with table that has WHERE filter
High
Starts with largest table (50M rows) before filtering
Query slow after adding data, indexes exist
ANALYZE orders; ANALYZE customers; (update statistics)
Optimizer uses statistics to estimate row counts. If stats say 1K rows but table has 1M, it picks nested loop instead of hash join.
Run ANALYZE weekly on large tables
Medium
Outdated statistics, optimizer chooses wrong algorithm
CREATE INDEX ON orders(customer_id); CREATE INDEX ON orders(order_date); CREATE INDEX ON orders(status); ... (10 indexes)
CREATE INDEX ON orders(customer_id, order_date, status); (1 composite index)
Each index slows writes by 10-20%. 10 indexes = 100-200% overhead. One composite index covers multiple query patterns.
Limit to 5-7 indexes per table
Medium
Too many indexes, 100%+ write overhead