Window Functions Performance Optimization: Mistakes

Module: Window Functions

SELECT name, salary, AVG(salary) OVER (PARTITION BY dept ORDER BY hire_date) FROM employees;

CREATE INDEX idx_dept_hire ON employees(dept, hire_date); SELECT name, salary, AVG(salary) OVER (PARTITION BY dept ORDER BY hire_date) FROM employees;

Window functions require indexes on both PARTITION BY and ORDER BY columns. Without proper indexing, the database must scan the entire table for each partition, leading to O(n²) performance.

Always create composite indexes: (partition_column, order_column)

Critical

Full table scan for each partition, extremely slow on large tables

SELECT *, SUM(amount) OVER (PARTITION BY customer_id ORDER BY date RANGE BETWEEN INTERVAL '30 days' PRECEDING AND CURRENT ROW) FROM orders;

SELECT *, SUM(amount) OVER (PARTITION BY customer_id ORDER BY date ROWS BETWEEN 30 PRECEDING AND CURRENT ROW) FROM orders;

RANGE frames require complex date arithmetic for each row. ROWS frames use simple row counting and are much faster. For time-based analysis, consider pre-aggregating data or using approximate row counts.

Use ROWS frames instead of RANGE when possible, especially with date calculations

High

RANGE frames with date intervals are computationally expensive

SELECT *, COUNT(*) OVER (PARTITION BY user_id) as total_orders FROM orders WHERE order_date >= '2024-01-01';

WITH filtered_orders AS (SELECT * FROM orders WHERE order_date >= '2024-01-01') SELECT *, COUNT(*) OVER (PARTITION BY user_id) as total_orders FROM filtered_orders;

Window functions are applied after WHERE clause filtering, but the partitioning still considers the full dataset. Use CTEs or subqueries to pre-filter data and reduce the working set size.

Pre-filter data with CTEs before applying window functions to reduce processing overhead

Medium

Window function processes all rows before WHERE clause filtering

SELECT *, AVG(salary) OVER (PARTITION BY dept ORDER BY salary ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) FROM employees;

SELECT *, AVG(salary) OVER (PARTITION BY dept) FROM employees;

When using the entire partition (UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING), the ORDER BY clause is unnecessary and adds sorting overhead. Simple PARTITION BY is sufficient for aggregate calculations over the entire partition.

Remove ORDER BY when calculating over the entire partition

Medium

Unnecessary ORDER BY clause with full partition frame wastes resources

SELECT user_id, (SELECT COUNT(*) FROM orders o2 WHERE o2.user_id = o1.user_id) as order_count FROM orders o1;

SELECT user_id, COUNT(*) OVER (PARTITION BY user_id) as order_count FROM orders;

Correlated subqueries execute once per row, creating massive performance overhead. Window functions process all rows in a single pass and are orders of magnitude faster for this type of calculation.

Replace correlated subqueries with window functions for better performance

High

Correlated subquery creates N+1 query problem