SQL Practice Logo

SQLPractice Online

Advanced Analytics Functions: Mistakes

Module: Window Functions

SELECT name, PERCENT_RANK() OVER (ORDER BY salary) * 100 as percentile FROM employees;

SELECT name, ROUND(PERCENT_RANK() OVER (ORDER BY salary) * 100, 1) as percentile FROM employees;

PERCENT_RANK returns values between 0 and 1. Multiply by 100 for percentage and use ROUND for clean display.

Always format PERCENT_RANK output for readability

Low

PERCENT_RANK returns decimal values with many decimal places

SELECT CORR(sales, marketing_spend) FROM monthly_data WHERE sales IS NULL OR marketing_spend IS NULL;

SELECT CORR(sales, marketing_spend) FROM monthly_data WHERE sales IS NOT NULL AND marketing_spend IS NOT NULL;

CORR function requires both values to be non-NULL. Filter out NULL values before correlation analysis.

Always handle NULL values in statistical functions

Medium

NULL values in correlation calculations return NULL result

SELECT department, STDDEV(salary) OVER (PARTITION BY department ORDER BY hire_date) FROM employees;

SELECT department, STDDEV(salary) OVER (PARTITION BY department) FROM employees;

Statistical functions like STDDEV typically need the full partition. ORDER BY creates a running window which may not be intended.

Avoid ORDER BY in statistical window functions unless you need running statistics

High

Using ORDER BY with statistical functions creates running calculations

Complex nested: PERCENT_RANK() OVER (ORDER BY (salary + AVG(salary) OVER (PARTITION BY dept)))

WITH dept_avg AS (SELECT *, AVG(salary) OVER (PARTITION BY dept) as avg_sal FROM employees) SELECT *, PERCENT_RANK() OVER (ORDER BY (salary + avg_sal)) FROM dept_avg;

Break complex calculations into CTEs for better readability, maintainability, and performance.

Use CTEs to simplify complex statistical calculations

Medium

Nested window functions are hard to read and debug

SELECT PERCENTILE_CONT(0.5) OVER (ORDER BY salary) as median FROM employees;

SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) OVER () as median FROM employees;

PERCENTILE_CONT uses WITHIN GROUP syntax, not standard window function syntax.

Remember PERCENTILE_CONT uses WITHIN GROUP (ORDER BY) syntax

High

Incorrect syntax for PERCENTILE_CONT function