SQL Practice Logo

SQLPractice Online

Advanced Analytics Functions: Concept

Module: Window Functions

Think of advanced analytics functions as your statistical toolkit for SQL. While basic window functions help you rank and sum, these advanced functions unlock the power of data science directly in your database. PERCENT_RANK tells you "Alice is in the top 10% of performers" - no complex calculations needed. CORR reveals hidden relationships like "customer satisfaction correlates 0.85 with repeat purchases." STDDEV measures variability - "our sales have high volatility this quarter." These functions transform raw numbers into business insights, letting you answer questions like "What percentile is this customer in?" or "How consistent are our results?" Instead of exporting data to statistical software, you get professional-grade analytics right in SQL.

**Statistical Functions - The Data Science Toolkit:**

**STDDEV (Standard Deviation):**

- Measures how spread out your data is

- Low STDDEV = consistent values (good for quality control)

- High STDDEV = high variability (risk indicator)

- Formula: √(variance) but SQL calculates it for you

**VARIANCE:**

- Measures average squared differences from mean

- Used in finance for volatility calculations

- Foundation for many statistical measures

- VARIANCE = STDDEV²

**CORR (Correlation):**

- Measures relationship between two variables (-1 to +1)

- +1 = perfect positive correlation

- -1 = perfect negative correlation

- 0 = no correlation

- Essential for recommendation systems and predictive analytics

**Distribution Functions - Understanding Position:**

**PERCENT_RANK:**

- Shows relative position as percentage (0 to 1)

- "This customer is in the 85th percentile"

- Perfect for performance benchmarking

- Formula: (rank - 1) / (total rows - 1)

**CUME_DIST (Cumulative Distribution):**

- Shows percentage of values ≤ current value

- "40% of customers spend less than this amount"

- Used for threshold analysis

- Formula: (number of rows ≤ current) / total rows

**PERCENTILE_CONT:**

- Finds exact value at specific percentile

- "What salary represents the 75th percentile?"

- Interpolates between values for precision

- Essential for setting benchmarks and targets

**Advanced Patterns:**

**Multi-Function Analysis:**

```sql

WITH stats AS (

SELECT

employee_id,