Window Frames (ROWS vs RANGE): Mistakes
Module: Window Functions
AVG(sales) OVER (ORDER BY date RANGE BETWEEN 6 PRECEDING AND CURRENT ROW) -- Expecting exactly 7 rows
AVG(sales) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) -- Gets exactly 7 rows
RANGE includes all rows with tied ORDER BY values. If multiple sales occur on the same date, RANGE includes all of them, giving more than 7 rows. ROWS counts exactly 7 physical rows regardless of date ties.
Use ROWS for exact counts, RANGE only when you need to group tied values
High
Variable number of rows included due to tied dates, incorrect moving average
SUM(amount) OVER (ORDER BY timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -- On data with multiple transactions per timestamp
SUM(amount) OVER (ORDER BY timestamp RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -- Groups tied timestamps
When business logic requires that all transactions at the same timestamp be treated together, ROWS can arbitrarily split them based on physical row order. RANGE ensures all tied timestamps are grouped logically.
Use RANGE when tied ORDER BY values must be treated as a logical group
Medium
Arbitrary splitting of transactions that occurred at the same time
AVG(price) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) -- Without secondary sort column
AVG(price) OVER (ORDER BY date, product_id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) -- Deterministic order
ROWS depends on physical row order. Without a secondary sort column to break ties, the database can return rows in any order for tied dates, causing inconsistent results across query executions.
Always include secondary sort columns with ROWS to ensure deterministic results
High
Non-deterministic results when dates are tied, different results on each run
COUNT(*) OVER (ORDER BY high_cardinality_id RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) -- On unique IDs
COUNT(*) OVER (ORDER BY high_cardinality_id ROWS BETWEEN 10 PRECEDING AND CURRENT ROW) -- No ties expected
RANGE adds processing overhead to check for tied values. When ORDER BY columns are unique (like IDs), there are no ties to handle, making RANGE slower than ROWS with no benefit.
Use ROWS when ORDER BY columns are unique or ties are rare
Medium
Unnecessary performance overhead from tie-checking on unique values
Using RANGE without testing on production data with actual tie patterns
Test both ROWS and RANGE with representative data to understand tie behavior
RANGE behavior depends entirely on the tie patterns in your data. Test data with few ties might work fine, but production data with many ties can give dramatically different results.
Always test frame behavior with production-like data before deployment
High
Unexpected results in production when ties occur differently than in test data
MAX(value) OVER (ORDER BY date RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -- On large dataset with many daily ties
Pre-aggregate daily, then use ROWS, or optimize with proper indexing
RANGE with many ties on large datasets can be very slow. Each row must check for all tied values, creating O(n²) behavior in worst cases. Pre-aggregation or ROWS on processed data is often faster.
Consider pre-aggregation or ROWS alternatives for performance-critical RANGE operations
Medium
Poor performance due to extensive tie processing on large datasets