SQL Practice Logo

SQLPractice Online

Query Rewriting Techniques: Real-World

Module: Query Optimization & Performance

Complex report query with nested subqueries taking 5 minutes - apply predicate pushdown reducing to 30 seconds. Dashboard with 10 joins where 3 are unnecessary - eliminate them for 5x speedup. Recursive CTE causing performance issues - rewrite with iterative approach for 20x improvement.

Google: Predicate Pushdown in Analytics Pipeline

Google Analytics processing pipeline had complex query with filters applied after multiple joins. Query processed 100B rows taking 2 hours. Applied predicate pushdown, reducing to 10 minutes (12x faster).

Original query joined events (100B rows) with users (1B), sessions (10B), and pages (5B) before filtering to last 7 days. Pushed date filter to events table first, reducing to 5B rows. Then applied user segment filter, reducing to 500M rows. Finally joined with other tables. Result: 100B rows processed → 5B rows processed. Time: 2 hours → 10 minutes (12x faster). Saved $50K/day in compute costs.

SELECT * FROM (SELECT * FROM events WHERE date >= CURRENT_DATE - 7) e JOIN users u ON e.user_id = u.id WHERE u.segment = "premium";

12x faster queries (2 hours → 10 minutes)

Reduced data processed from 100B to 5B rows

Saved $50K/day in compute costs

Enabled real-time analytics (was batch-only)

Reduced pipeline latency from 4 hours to 20 minutes

PostgreSQL

Facebook: Join Elimination in News Feed Query

Facebook news feed query joined with 8 tables but only used columns from 3. Removed 5 unnecessary joins, improving query time from 500ms to 100ms (5x faster) for 2B daily requests.

Original query joined posts with users, friends, likes, comments, shares, groups, pages, and events. Analysis showed only posts, users, and friends columns were used. Removed 5 joins (likes, comments, shares, groups, pages, events). Each join added 50-100ms overhead. Result: 500ms → 100ms (5x faster). At 2B requests/day, saved 800K CPU hours/day.

SELECT p.*, u.name FROM posts p JOIN users u ON p.user_id = u.id JOIN friends f ON u.id = f.user_id WHERE f.friend_id = :current_user;

5x faster queries (500ms → 100ms)

Eliminated 5 unnecessary joins

Saved 800K CPU hours/day

Reduced database load by 60%

Improved user experience (faster feed loading)

MySQL

Netflix: Window Functions Replace Self-Joins

Netflix recommendation query used self-joins to rank movies per genre. Query took 5 minutes with O(n²) complexity. Rewrote with window functions, reducing to 10 seconds (30x faster).

Original query self-joined movies table to calculate rank within each genre: SELECT m1.* FROM movies m1 JOIN movies m2 ON m1.genre = m2.genre WHERE m2.rating >= m1.rating GROUP BY m1.id HAVING COUNT(*) <= 10. O(n²) complexity: 100K movies = 10B comparisons. Rewrote with window function: SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY genre ORDER BY rating DESC) as rank FROM movies) WHERE rank <= 10. O(n log n) complexity: 100K movies = 1.6M comparisons. Time: 5 minutes → 10 seconds (30x faster).

SELECT * FROM (SELECT *, ROW_NUMBER() OVER (PARTITION BY genre ORDER BY rating DESC) as rank FROM movies) WHERE rank <= 10;

30x faster queries (5 minutes → 10 seconds)

Reduced complexity from O(n²) to O(n log n)

Enabled real-time recommendations

Reduced database CPU from 90% to 30%

Saved $1M/year in infrastructure costs

PostgreSQL