Statistical & Advanced Aggregate Functions: Interview
Module: Aggregate Functions & Grouping
What does standard deviation measure and when would you use it?
Standard deviation measures data spread around the mean. Use it to identify variance in salaries (pay equity), quality control (product consistency), or performance analysis (employee distribution). Higher values indicate more spread.
What is the difference between PERCENTILE_CONT and PERCENTILE_DISC?
PERCENTILE_CONT interpolates between values for continuous distribution. PERCENTILE_DISC returns actual data value (discrete). Use CONT for smooth percentiles, DISC when you need actual observed values.
When should you use MEDIAN instead of AVG?
Use MEDIAN when data has outliers that would skew the average. Median is the 50th percentile, resistant to extreme values. Common in salary analysis, real estate pricing, and performance metrics with outliers.
Calculate salary quartiles and standard deviation per department
SELECT
department,
PERCENTILE_CONT(0.25) WITHIN GROUP (ORDER BY salary) AS p25,
PERCENTILE_CONT(0.50) WITHIN GROUP (ORDER BY salary) AS median,
PERCENTILE_CONT(0.75) WITHIN GROUP (ORDER BY salary) AS p75,
STDDEV(salary) AS std_dev
FROM employees
GROUP BY department;
Combines percentiles for distribution analysis with standard deviation for spread measurement. Provides comprehensive salary analysis per department.