SQL Practice Logo

SQLPractice Online

Statistics & Query Planning: Examples

Module: Query Optimization & Performance

Stale Statistics Causing Nested Loop Disaster

advanced

After bulk data load, query suddenly takes 10 minutes instead of 5 seconds. EXPLAIN shows nested loop join instead of hash join. Statistics are outdated - optimizer thinks table has 100K rows but it actually has 10M rows.

-- Initial state: 100K orders, statistics current

SELECT o.order_id, c.name, o.total

FROM orders o

JOIN customers c ON o.customer_id = c.id

WHERE o.status = 'pending';

-- EXPLAIN shows hash join (good for 100K rows)

-- Execution time: 5 seconds

-- Bulk load: Insert 9.9M more orders (now 10M total)

INSERT INTO orders SELECT ... FROM staging_orders;

-- 9.9M rows inserted

-- Run same query again (statistics NOT updated)

SELECT o.order_id, c.name, o.total

FROM orders o

JOIN customers c ON o.customer_id = c.id

WHERE o.status = 'pending';

-- EXPLAIN shows nested loop join (disaster!)

-- Optimizer thinks: 100K orders (old stats)

-- Actual: 10M orders

-- Execution time: 10 minutes (120x slower)

-- Fix: Update statistics

ANALYZE orders;

-- Run query again

SELECT o.order_id, c.name, o.total

FROM orders o

JOIN customers c ON o.customer_id = c.id

WHERE o.status = 'pending';

-- EXPLAIN shows hash join again (correct plan)

-- Execution time: 8 seconds (75x faster than broken plan)

Before ANALYZE (stale statistics):

------------------------------------------------------------

Hash Join (cost=2500..5000 rows=10000)

-> Seq Scan on customers (cost=0..1000 rows=50000)

-> Hash (cost=1000..1000 rows=10000) -- WRONG estimate

-> Seq Scan on orders (cost=0..1000 rows=10000)

Filter: (status = 'pending')

-- Optimizer thinks 10K rows, actually 8M rows

Actual execution: