Aggregate Query Optimization: Real-World
Module: Query Optimization & Performance
Analytics dashboards with GROUP BY queries taking 45 seconds - optimize to under 1 second. Sales reports aggregating millions of transactions - reduce from 5 minutes to 3 seconds. Real-time metrics with COUNT/SUM/AVG - handle 10x more traffic.
Shopify: Product Analytics Dashboard Optimization
Shopify's merchant dashboard shows product performance metrics: sales by product, revenue trends, inventory levels. Original query aggregated 500M order line items and took 2 minutes. Merchants complained dashboard was unusable. Needed sub-second response for 100K merchants.
**Problem Query (120 seconds):**
```sql
SELECT
p.product_id,
p.product_name,
COUNT(DISTINCT o.order_id) as order_count,
SUM(ol.quantity) as units_sold,
SUM(ol.quantity * ol.price) as revenue,
AVG(ol.price) as avg_price
FROM products p
LEFT JOIN order_lines ol ON p.product_id = ol.product_id
LEFT JOIN orders o ON ol.order_id = o.order_id
WHERE o.created_at >= CURRENT_DATE - INTERVAL '30 days'
AND p.merchant_id = :merchant_id
GROUP BY p.product_id, p.product_name
ORDER BY revenue DESC
LIMIT 100;
```
Problems:
- Joins 500M order_lines with 200M orders
- COUNT(DISTINCT) requires deduplication
- No indexes on join columns
- Aggregates per merchant on every page load
**Solution 1: Covering Indexes (10x faster)**
```sql
-- Index for order_lines join and aggregation
CREATE INDEX idx_order_lines_product ON order_lines(
product_id,
order_id,
quantity,
price
);
-- Index for orders filtering
CREATE INDEX idx_orders_merchant_date ON orders(
merchant_id,
created_at,
order_id