Window Functions vs CTEs: Interview
Module: Subqueries & CTEs
When should you use window functions vs CTEs? Explain the fundamental difference.
Fundamental difference: Window functions are about WHAT you calculate (row-level analytics), CTEs are about HOW you structure your query (organization).
Use Window Functions when:
1. You need row-level calculations while keeping all rows (rankings, running totals)
2. You want to compare current row with other rows (LAG, LEAD)
3. You need moving averages or cumulative calculations
4. You want percentiles or statistical functions
5. Performance is critical for analytics (100x faster than correlated subqueries)
Use CTEs when:
1. You need to break complex query into readable steps
2. You want to reuse a subquery multiple times
3. You need recursion (hierarchical data like org charts)
4. You want to improve query maintainability and debugging
5. You're doing multi-step data transformations
Key insight: They're complementary, not alternatives. Best practice is combining both - CTE for data preparation, window functions for analytics.
Example:
WITH monthly_sales AS (SELECT ... GROUP BY month) -- CTE: Prep
SELECT *, RANK() OVER (ORDER BY total DESC) FROM monthly_sales; -- Window: Analytics
Why are window functions 100x faster than CTEs with correlated subqueries for running totals?
Complexity difference:
Window Function: O(N log N)
- Sort data once: O(N log N)
- Accumulate in single pass: O(N)
- Total: O(N log N)
- For 1M rows: ~20 million operations
- Time: 0.4 seconds
CTE with Correlated Subquery: O(N²)
- For each row (N rows)
- Scan all previous rows (average N/2 rows)
- Total: N × N/2 = O(N²)
- For 1M rows: ~500 billion operations
- Time: 45 seconds
Why window function is optimized:
1. Database knows you're calculating running total
2. Uses specialized algorithm (sort + accumulate)
3. Single pass through sorted data
4. No repeated scans
Why correlated subquery is slow:
1. Database doesn't recognize pattern
2. Executes subquery for every row