Window Frames (ROWS vs RANGE): Mistakes

Module: Window Functions

AVG(sales) OVER (ORDER BY date RANGE BETWEEN 6 PRECEDING AND CURRENT ROW) -- Expecting exactly 7 rows

AVG(sales) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) -- Gets exactly 7 rows

RANGE includes all rows with tied ORDER BY values. If multiple sales occur on the same date, RANGE includes all of them, giving more than 7 rows. ROWS counts exactly 7 physical rows regardless of date ties.

Use ROWS for exact counts, RANGE only when you need to group tied values

High

Variable number of rows included due to tied dates, incorrect moving average

SUM(amount) OVER (ORDER BY timestamp ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -- On data with multiple transactions per timestamp

SUM(amount) OVER (ORDER BY timestamp RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -- Groups tied timestamps

When business logic requires that all transactions at the same timestamp be treated together, ROWS can arbitrarily split them based on physical row order. RANGE ensures all tied timestamps are grouped logically.

Use RANGE when tied ORDER BY values must be treated as a logical group

Medium

Arbitrary splitting of transactions that occurred at the same time

AVG(price) OVER (ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) -- Without secondary sort column

AVG(price) OVER (ORDER BY date, product_id ROWS BETWEEN 2 PRECEDING AND CURRENT ROW) -- Deterministic order

ROWS depends on physical row order. Without a secondary sort column to break ties, the database can return rows in any order for tied dates, causing inconsistent results across query executions.

Always include secondary sort columns with ROWS to ensure deterministic results

High

Non-deterministic results when dates are tied, different results on each run

COUNT(*) OVER (ORDER BY high_cardinality_id RANGE BETWEEN 10 PRECEDING AND CURRENT ROW) -- On unique IDs

COUNT(*) OVER (ORDER BY high_cardinality_id ROWS BETWEEN 10 PRECEDING AND CURRENT ROW) -- No ties expected

RANGE adds processing overhead to check for tied values. When ORDER BY columns are unique (like IDs), there are no ties to handle, making RANGE slower than ROWS with no benefit.

Use ROWS when ORDER BY columns are unique or ties are rare

Medium

Unnecessary performance overhead from tie-checking on unique values

Using RANGE without testing on production data with actual tie patterns

Test both ROWS and RANGE with representative data to understand tie behavior

RANGE behavior depends entirely on the tie patterns in your data. Test data with few ties might work fine, but production data with many ties can give dramatically different results.

Always test frame behavior with production-like data before deployment

High

Unexpected results in production when ties occur differently than in test data

MAX(value) OVER (ORDER BY date RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) -- On large dataset with many daily ties

Pre-aggregate daily, then use ROWS, or optimize with proper indexing

RANGE with many ties on large datasets can be very slow. Each row must check for all tied values, creating O(n²) behavior in worst cases. Pre-aggregation or ROWS on processed data is often faster.

Consider pre-aggregation or ROWS alternatives for performance-critical RANGE operations

Medium

Poor performance due to extensive tie processing on large datasets