Query Rewriting Patterns: Performance
Module: Query Optimization & Performance
**OR to UNION ALL Performance:**
OR query:
SELECT * FROM orders WHERE status = 'pending' OR priority = 'urgent';
- Execution: Full table scan (can't use both indexes)
- Rows scanned: 10M
- Time: 15 seconds
UNION ALL query:
SELECT * FROM orders WHERE status = 'pending'
UNION ALL
SELECT * FROM orders WHERE priority = 'urgent' AND status != 'pending';
- Execution: Two index scans
- Rows scanned: 500K + 200K = 700K
- Time: 0.5 seconds (30x faster)
When it helps:
- OR on different columns with indexes
- Each condition is selective (<10% of rows)
- Columns have good indexes
When it doesn't help:
- OR on same column: WHERE status IN ('A', 'B') is fine
- Conditions not selective (>50% of rows)
- No indexes on columns
**Subquery to JOIN Performance:**
Correlated subquery:
SELECT c.name, (SELECT COUNT(*) FROM orders WHERE customer_id = c.id)
FROM customers c;
- Executions: 100K (once per customer)
- Scans: 100K full scans of orders (10M rows each)
- Comparisons: 1 trillion
- Time: 45 seconds
JOIN:
SELECT c.name, COUNT(o.id)
FROM customers c LEFT JOIN orders o ON c.id = o.customer_id
GROUP BY c.id, c.name;
- Executions: 1 join
- Scans: 1 scan of each table
- Comparisons: 10.1M
- Time: 2 seconds (22x faster)
**NOT IN vs LEFT JOIN Performance:**
NOT IN:
SELECT * FROM customers WHERE id NOT IN (SELECT customer_id FROM orders);