Window Functions Fundamentals: Concept
Module: Window Functions
Window functions are like having a calculator that can see the entire spreadsheet while working on each row. Unlike GROUP BY which collapses rows into summary groups, window functions keep every row and add calculated columns based on related rows. Think of analyzing sales data: GROUP BY gives you total sales per month (12 rows for a year), but window functions give you each individual sale with running totals, rankings, and comparisons (keeping all original rows). This makes them perfect for dashboards, reports, and analytics where you need both detail and context.
**
**The OVER Clause - Your Window Definition:**
Every window function needs an OVER clause that defines which rows to include in the calculation:
```sql
-- Empty OVER() = entire result set
SELECT name, salary, AVG(salary) OVER() as company_avg
FROM employees;
-- Every row shows the same company average
-- PARTITION BY = separate calculation per group
SELECT name, department, salary,
AVG(salary) OVER(PARTITION BY department) as dept_avg
FROM employees;
-- Each department has its own average
-- ORDER BY = cumulative/sequential calculations
SELECT name, hire_date, salary,
SUM(salary) OVER(ORDER BY hire_date) as running_payroll
FROM employees;
-- Running total of payroll as employees were hired
```
**Window Functions vs GROUP BY - The Key Difference:**
| Aspect | GROUP BY | Window Functions |
|--------|----------|------------------|
| **Rows Returned** | Collapsed (fewer rows) | Preserved (same count) |
| **Use Case** | Summarize data | Analyze + preserve detail |
| **Example** | "Total sales per month" | "Each sale + monthly total" |
| **Performance** | Generally faster | Slower but more flexible |
**Real Production Example - E-commerce Analytics:**
```sql
-- Business need: Show each order with context
SELECT
order_id,
customer_id,
order_date,
amount,
-- Window functions add context without collapsing rows
AVG(amount) OVER() as overall_avg_order,
SUM(amount) OVER(PARTITION BY customer_id) as customer_lifetime_value,
ROW_NUMBER() OVER(PARTITION BY customer_id ORDER BY order_date) as customer_order_sequence,
SUM(amount) OVER(ORDER BY order_date) as running_revenue