SQL Cheat Sheet: The Complete One-Page Reference for SELECT, JOIN, GROUP BY, Window Functions, CTEs and More
A reference cheat sheet you can actually use at work — every SQL clause, function family and pattern with one-liner explanations and copy-pasteable examples. Bookmark this page; it covers SELECT through recursive CTEs, links out to deep-dive lessons for every section, and notes dialect differences across PostgreSQL, MySQL and SQL Server. Print-friendly, no login, no PDF gate.
Last updated · SQL Practice Online team
The logical execution order of a SELECT statement
You write a SELECT in one order, but the database executes it in a different order. Knowing the execution order explains every "why does this not work?" question — why aliases defined in SELECT cannot be used in WHERE, why GROUP BY columns must appear in SELECT, why DISTINCT happens before ORDER BY but after window functions.
| Step | Clause | What it does | Visible to next step |
|---|---|---|---|
| 1 | FROM / JOIN | Build the working result set from source tables. | All source columns |
| 2 | WHERE | Filter rows before grouping. | All source columns |
| 3 | GROUP BY | Collapse rows into groups. | Grouping keys + aggregates only |
| 4 | HAVING | Filter groups (post-aggregate). | Grouping keys + aggregates only |
| 5 | SELECT | Project columns / compute expressions. | Aliases now exist |
| 6 | Window functions | Compute OVER (...) results per row. | SELECT-list aliases |
| 7 | DISTINCT | Drop duplicate rows. | Final projection |
| 8 | ORDER BY | Sort the result. | SELECT-list aliases allowed |
| 9 | LIMIT / OFFSET | Trim to the requested page. | — |
SELECT, projection, filtering and sorting
SELECT DISTINCT col1, col2, expr AS alias
FROM table_a a
JOIN table_b b ON a.id = b.a_id
WHERE col1 > 100 AND col2 IN ('x', 'y')
GROUP BY col1, col2
HAVING COUNT(*) > 5
ORDER BY col1 DESC, col2 ASC NULLS LAST
LIMIT 50 OFFSET 100;| Operator | Meaning | Example |
|---|---|---|
| =, <>, !=, <, <=, >, >= | Standard comparison | WHERE salary > 80000 |
| IN (...) | Match any of a list | WHERE country IN ('US','UK') |
| BETWEEN x AND y | Inclusive range | WHERE date BETWEEN '2026-01-01' AND '2026-12-31' |
| LIKE / ILIKE | Pattern (case-insensitive in PG) | WHERE name LIKE 'A%' |
| IS NULL / IS NOT NULL | NULL test (= NULL never works) | WHERE deleted_at IS NULL |
| EXISTS (subquery) | Row exists in correlated subquery | WHERE EXISTS (SELECT 1 FROM ...) |
| ~ / !~ (PostgreSQL) | POSIX regex match / not match | WHERE email ~ '^[a-z]+@' |
Joins
| Join | Returns | Common use |
|---|---|---|
| INNER JOIN | Rows where match exists in both tables. | Most common — orders + customers. |
| LEFT [OUTER] JOIN | All left rows + matched right rows (NULL where no match). | "...with their X if any" pattern. |
| RIGHT [OUTER] JOIN | All right rows + matched left rows. | Rare — usually flip to LEFT JOIN. |
| FULL [OUTER] JOIN | All rows from both sides; NULL where unmatched. | Reconciliation between two systems. |
| CROSS JOIN | Cartesian product (every left × every right). | Date spines, calendar tables. |
| SELF JOIN | Table joined to itself. | Manager chains, sibling rows. |
| ANTI-JOIN (LEFT JOIN ... WHERE ... IS NULL) | Rows in A not in B. | "Customers who never ordered." |
| SEMI-JOIN (EXISTS) | Rows in A where match exists in B (no duplicates). | Often faster than JOIN + DISTINCT. |
-- INNER
SELECT * FROM orders o
JOIN customers c ON c.id = o.customer_id;
-- LEFT (keep all customers, even those with no orders)
SELECT * FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id;
-- ANTI (customers who never ordered)
SELECT c.* FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
WHERE o.id IS NULL;
-- SEMI (customers who ordered at least once, no dupes)
SELECT * FROM customers c
WHERE EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);Deep dive: see /learn/sql-joins for the complete guide with worked examples, performance notes, and 15 hands-on lessons.
Aggregates and GROUP BY
| Function | Returns | NULL handling |
|---|---|---|
| COUNT(*) | Row count, including NULLs. | Counts all rows. |
| COUNT(col) | Count of non-NULL values in col. | Skips NULLs. |
| COUNT(DISTINCT col) | Count of distinct non-NULL values. | Skips NULLs. |
| SUM, AVG | Sum / average of non-NULLs. | Skips NULLs; returns NULL if no rows. |
| MIN, MAX | Min / max of non-NULLs. | Works on numbers, dates, strings. |
| STRING_AGG / GROUP_CONCAT / LISTAGG | Concat strings within a group. | Postgres / MySQL / Oracle naming. |
| ARRAY_AGG / JSON_AGG | Aggregate into array / JSON. | PostgreSQL. |
| STDDEV, VAR_POP, PERCENTILE_CONT | Statistical aggregates. | Useful for analytics. |
Window functions
fn() OVER (
PARTITION BY <reset-cols>
ORDER BY <sequence-cols>
ROWS BETWEEN <start> AND <end>
)| Family | Functions | Use it for |
|---|---|---|
| Ranking | ROW_NUMBER, RANK, DENSE_RANK, NTILE | Top-N per group, dedup, percentiles. |
| Offset | LAG, LEAD, FIRST_VALUE, LAST_VALUE, NTH_VALUE | Period-over-period, prev/next row. |
| Aggregate as window | SUM, AVG, COUNT, MIN, MAX with OVER() | Running totals, moving averages. |
| Distribution | CUME_DIST, PERCENT_RANK, PERCENTILE_CONT | Percentile scoring. |
-- Top 3 per group
SELECT * FROM (
SELECT *, RANK() OVER (PARTITION BY group_id ORDER BY score DESC) r
FROM t
) x WHERE r <= 3;
-- Period-over-period
SELECT month, revenue,
revenue - LAG(revenue) OVER (ORDER BY month) AS mom_delta
FROM monthly;
-- Running total
SELECT date, amount,
SUM(amount) OVER (ORDER BY date
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM tx;
-- 7-day moving average
SELECT date, AVG(value) OVER (
ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW
) FROM daily;
-- Dedupe to latest
SELECT * FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY key ORDER BY updated_at DESC) rn
FROM t
) x WHERE rn = 1;Deep dive: /learn/window-functions covers OVER, PARTITION BY, frames (ROWS vs RANGE), all 12 standard window functions, and dialect notes for PostgreSQL, MySQL 8+, SQL Server and Oracle.
CTEs and subqueries
WITH active_users AS (
SELECT id FROM users WHERE last_seen_at > NOW() - INTERVAL '30 days'
),
recent_orders AS (
SELECT * FROM orders
WHERE created_at > NOW() - INTERVAL '30 days'
)
SELECT u.id, COUNT(o.id) AS order_count
FROM active_users u
LEFT JOIN recent_orders o ON o.user_id = u.id
GROUP BY u.id;WITH RECURSIVE chain AS (
SELECT employee_id, manager_id, 1 AS depth
FROM employees
WHERE employee_id = :start_id
UNION ALL
SELECT e.employee_id, e.manager_id, c.depth + 1
FROM employees e
JOIN chain c ON e.employee_id = c.manager_id
)
SELECT * FROM chain;String, date and number functions
| Function | Returns | Example |
|---|---|---|
| LENGTH(s) / CHAR_LENGTH(s) | Character count. | LENGTH('abc') = 3 |
| LOWER(s) / UPPER(s) | Case conversion. | LOWER('SQL') = 'sql' |
| TRIM(s) / LTRIM / RTRIM | Strip whitespace. | TRIM(' x ') = 'x' |
| SUBSTRING(s, start, len) | Slice. | SUBSTRING('abcdef', 2, 3) = 'bcd' |
| POSITION(sub IN s) / INSTR | 1-based index of first match (0 if absent). | POSITION('@' IN email) |
| REPLACE(s, from, to) | Replace all occurrences. | REPLACE('a,b,c', ',', '|') |
| SPLIT_PART(s, sep, n) / SUBSTRING_INDEX | Get the n-th separated chunk. | SPLIT_PART(email, '@', 2) |
| CONCAT(a,b,c) or a || b | Concatenate (|| is SQL-standard). | first || ' ' || last |
| REGEXP_REPLACE / REGEXP_MATCHES | POSIX regex. | REGEXP_REPLACE(s, '[0-9]+', '#') |
| Function | Returns | Example |
|---|---|---|
| NOW() / CURRENT_TIMESTAMP | Current timestamp with timezone. | INSERT ... created_at = NOW() |
| CURRENT_DATE | Today (no time). | WHERE order_date = CURRENT_DATE |
| DATE_TRUNC(unit, ts) | Round down to unit. | DATE_TRUNC('month', order_date) |
| EXTRACT(field FROM ts) | Pull year, month, hour, dow, etc. | EXTRACT(YEAR FROM ts) |
| ts + INTERVAL '7 days' | Date arithmetic. | WHERE ts >= NOW() - INTERVAL '7 days' |
| AGE(a, b) (Postgres) | Symbolic interval between dates. | AGE(NOW(), birthdate) |
| DATEDIFF(a, b) (MySQL/SQLServer) | Integer day difference. | DATEDIFF(end_date, start_date) |
| TO_CHAR(ts, fmt) | Format as string. | TO_CHAR(ts, 'YYYY-MM-DD') |
INSERT, UPDATE, DELETE, MERGE
-- INSERT (single + multi-row)
INSERT INTO users (email, name) VALUES ('a@b', 'A'), ('c@d', 'C');
-- INSERT ... SELECT
INSERT INTO archive (id, payload)
SELECT id, payload FROM events WHERE created_at < '2025-01-01';
-- UPSERT (PostgreSQL)
INSERT INTO daily (date, count) VALUES (CURRENT_DATE, 1)
ON CONFLICT (date) DO UPDATE SET count = daily.count + 1;
-- UPDATE with FROM (Postgres) / JOIN (MySQL/SQL Server)
UPDATE orders o
SET status = 'fulfilled'
FROM shipments s
WHERE s.order_id = o.id AND s.shipped_at IS NOT NULL;
-- DELETE
DELETE FROM sessions WHERE expires_at < NOW();Indexes, EXPLAIN and performance basics
- B-tree index: the default; supports =, <, >, BETWEEN, ORDER BY. Covers prefix of multi-column index.
- Hash index: equality only; rarely chosen by planner over B-tree.
- GIN / GiST (PostgreSQL): full-text, JSONB containment, geospatial.
- Partial index: WHERE clause inside CREATE INDEX — index only the rows that match a predicate.
- Covering index (INCLUDE): adds non-key columns so the planner does not need a heap lookup.
-- Standard composite index — leftmost prefix matters
CREATE INDEX idx_orders_cust_date ON orders (customer_id, order_date DESC);
-- Partial index — only active rows
CREATE INDEX idx_users_active ON users (last_seen_at)
WHERE deleted_at IS NULL;
-- Covering index (PostgreSQL 11+)
CREATE INDEX idx_orders_cover
ON orders (customer_id) INCLUDE (status, total);
-- Inspect the plan
EXPLAIN (ANALYZE, BUFFERS) SELECT ... ;Transactions, locks and isolation levels
| Level | Dirty read | Non-repeatable read | Phantom read | Default in |
|---|---|---|---|---|
| READ UNCOMMITTED | Allowed | Allowed | Allowed | Rare; SQL Server NOLOCK hint. |
| READ COMMITTED | Prevented | Allowed | Allowed | PostgreSQL, SQL Server, Oracle |
| REPEATABLE READ | Prevented | Prevented | Allowed (Postgres prevents in practice) | MySQL InnoDB |
| SERIALIZABLE | Prevented | Prevented | Prevented | Strictest; uses SSI in Postgres. |
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
-- ... read + write ...
COMMIT; -- or ROLLBACK; on retryable serialization errorDialect quick reference: PostgreSQL vs MySQL vs SQL Server
| Need | PostgreSQL | MySQL 8+ | SQL Server |
|---|---|---|---|
| Limit rows | LIMIT n | LIMIT n | TOP n / OFFSET FETCH |
| String concat | a || b | CONCAT(a, b) | a + b / CONCAT |
| Current date | CURRENT_DATE | CURDATE() | CAST(GETDATE() AS DATE) |
| Date diff (days) | (a - b) | DATEDIFF(a, b) | DATEDIFF(day, b, a) |
| Auto-increment | GENERATED ... AS IDENTITY / SERIAL | AUTO_INCREMENT | IDENTITY(1,1) |
| Upsert | ON CONFLICT DO UPDATE | ON DUPLICATE KEY UPDATE | MERGE |
| Boolean | BOOLEAN (true/false) | TINYINT(1) (0/1) | BIT (0/1) |
| Group concat | STRING_AGG(s, ',') | GROUP_CONCAT(s) | STRING_AGG(s, ',') |
| Regex match | s ~ 'pattern' | REGEXP | — (no regex; use LIKE / CLR) |
Where to go next
This page is a quick reference; for any topic where you want depth instead of breadth, follow the deep-dive links: /learn/sql-joins for joins, /learn/window-functions for OVER and frames, /learn/sql-interview-questions-guide for curated interview problems mapped to every section above. The full Learning Mode is at /learning, and 200+ runnable practice problems are at /sql-interview-questions, /scenario and /practice.
Practice this in the editor
- Browse all SQL practice problems/sql-interview-questions
- SELECT scenarios (15)/scenario/select-statements
- WHERE clause scenarios (15)/scenario/where-clause
- Aggregations scenarios (20)/scenario/aggregations
- Joins scenarios (25)/scenario/joins
- Ranking & nth-value (15)/scenario/ranking-nth-value
- CTEs & window functions (10)/scenario/ctes-window
- Self-joins & hierarchical (15)/scenario/self-joins-hierarchical
- Date operations (15)/scenario/date-operations
- Subqueries (12)/scenario/subqueries
- Practice on HR schema/practice/hr
- Practice on E-commerce schema/practice/ecommerce
- Practice on Banking schema/practice/banking
- Practice on School schema/practice/school
- Full Learning Mode curriculum/learning
Frequently asked questions
What is the logical execution order of a SELECT statement?
FROM/JOIN → WHERE → GROUP BY → HAVING → SELECT → window functions → DISTINCT → ORDER BY → LIMIT. This is why aliases defined in SELECT cannot be used in WHERE (WHERE runs first), why GROUP BY columns must appear in SELECT, and why DISTINCT happens after window functions but before ORDER BY.
What is the difference between WHERE and HAVING?
WHERE filters rows before grouping; HAVING filters groups after aggregation. You cannot use aggregate functions (SUM, COUNT, etc.) in WHERE — they have not been computed yet — and you cannot reference non-grouped, non-aggregate columns in HAVING. If you find yourself writing HAVING without GROUP BY, you almost always want WHERE instead.
When should I use a CTE vs a subquery?
Use a CTE (WITH clause) when the query has more than one transformation step, when the same subquery is referenced more than once, or when readability matters (interviews, code review). Use an inline subquery for a one-shot filter (WHERE id IN (SELECT ...)) where promoting it to a CTE would add noise without clarity. Modern planners (PostgreSQL 12+, SQL Server, MySQL 8) inline non-recursive CTEs, so there is no performance difference in most cases.
Why does = NULL not work in SQL?
NULL means "unknown", and an unknown value compared to anything (including itself) returns NULL, which is treated as not-true by WHERE. Use IS NULL and IS NOT NULL for null tests. The same trap hits NOT IN: if the subquery contains any NULL, the whole NOT IN evaluates to NULL and returns zero rows — use NOT EXISTS or filter NULLs out.
What is the difference between RANK and DENSE_RANK?
Both assign the same rank to tied rows, but they handle the next rank differently. RANK skips numbers after a tie (1, 2, 2, 4) — "Olympic" ranking. DENSE_RANK does not skip (1, 2, 2, 3) — compact ranking. Use ROW_NUMBER if you need a unique sequence with ties broken arbitrarily.
How do I find duplicates in a SQL table?
GROUP BY the dedup key with HAVING COUNT(*) > 1 to find duplicate keys. To delete duplicates keeping the most recent record, use ROW_NUMBER() OVER (PARTITION BY dedup_key ORDER BY updated_at DESC) AS rn inside a CTE, then DELETE WHERE rn > 1.