Query Rewriting Techniques: Interview
Module: Query Optimization & Performance
Explain predicate pushdown and why it improves performance. Provide an example.
Predicate pushdown moves WHERE filter conditions as close to the data source as possible, reducing the amount of data processed by expensive operations like joins and aggregations.
**Why It Works:**
Filtering early reduces data volume:
- 10M rows filtered to 1M = 90% reduction
- Subsequent joins process 1M rows instead of 10M
- 10x less data = ~10x faster
**Example:**
Before (filter after join):
SELECT * FROM (
SELECT o.*, c.name
FROM orders o
JOIN customers c ON o.customer_id = c.id
) t
WHERE t.created_at >= '2024-01-01';
Execution:
1. Join 10M orders with 1M customers = 10M rows
2. Filter to January orders = 1M rows
3. Processed 10M rows to get 1M
4. Time: 45 seconds
After (filter before join):
SELECT o.*, c.name
FROM (
SELECT * FROM orders
WHERE created_at >= '2024-01-01'
) o
JOIN customers c ON o.customer_id = c.id;
Execution:
1. Filter to 1M January orders
2. Join 1M orders with 1M customers = 1M rows
3. Processed 2M rows to get 1M
4. Time: 5 seconds (9x faster)
**Rule:**
Push WHERE conditions to innermost subquery. Filter before joins, aggregations, and sorts.
Optimize this slow query using advanced rewriting techniques. Explain your approach.
**Original Query (2 minutes):**
SELECT
c.customer_name,
(SELECT COUNT(*) FROM orders WHERE customer_id = c.id) as order_count,
(SELECT SUM(total) FROM orders WHERE customer_id = c.id) as total_spent,