Advanced Analytics Functions: Mistakes
Module: Window Functions
SELECT name, PERCENT_RANK() OVER (ORDER BY salary) * 100 as percentile FROM employees;
SELECT name, ROUND(PERCENT_RANK() OVER (ORDER BY salary) * 100, 1) as percentile FROM employees;
PERCENT_RANK returns values between 0 and 1. Multiply by 100 for percentage and use ROUND for clean display.
Always format PERCENT_RANK output for readability
Low
PERCENT_RANK returns decimal values with many decimal places
SELECT CORR(sales, marketing_spend) FROM monthly_data WHERE sales IS NULL OR marketing_spend IS NULL;
SELECT CORR(sales, marketing_spend) FROM monthly_data WHERE sales IS NOT NULL AND marketing_spend IS NOT NULL;
CORR function requires both values to be non-NULL. Filter out NULL values before correlation analysis.
Always handle NULL values in statistical functions
Medium
NULL values in correlation calculations return NULL result
SELECT department, STDDEV(salary) OVER (PARTITION BY department ORDER BY hire_date) FROM employees;
SELECT department, STDDEV(salary) OVER (PARTITION BY department) FROM employees;
Statistical functions like STDDEV typically need the full partition. ORDER BY creates a running window which may not be intended.
Avoid ORDER BY in statistical window functions unless you need running statistics
High
Using ORDER BY with statistical functions creates running calculations
Complex nested: PERCENT_RANK() OVER (ORDER BY (salary + AVG(salary) OVER (PARTITION BY dept)))
WITH dept_avg AS (SELECT *, AVG(salary) OVER (PARTITION BY dept) as avg_sal FROM employees) SELECT *, PERCENT_RANK() OVER (ORDER BY (salary + avg_sal)) FROM dept_avg;
Break complex calculations into CTEs for better readability, maintainability, and performance.
Use CTEs to simplify complex statistical calculations
Medium
Nested window functions are hard to read and debug
SELECT PERCENTILE_CONT(0.5) OVER (ORDER BY salary) as median FROM employees;
SELECT PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY salary) OVER () as median FROM employees;
PERCENTILE_CONT uses WITHIN GROUP syntax, not standard window function syntax.
Remember PERCENTILE_CONT uses WITHIN GROUP (ORDER BY) syntax
High
Incorrect syntax for PERCENTILE_CONT function