SQL CTEs and the WITH Clause: The Complete Guide to Common Table Expressions, Recursive Queries and Query Decomposition
A Common Table Expression — the WITH clause — is the single most important readability tool in modern SQL. It lets you name an intermediate result, reuse it, and decompose a 200-line query into half a dozen labelled steps that read top-to-bottom like prose. With the RECURSIVE variant, CTEs also become the standard way to traverse hierarchies, walk graphs, and generate sequences without a loop. This guide covers WITH syntax, multiple and chained CTEs, recursive CTEs, materialization behaviour across PostgreSQL/MySQL/SQL Server/Oracle, and when a CTE is the wrong choice (spoiler: when you need an index on the intermediate result).
Last updated · SQL Practice Online team
What is a CTE?
A Common Table Expression (CTE) is a named, temporary result set that exists for the duration of a single statement. You introduce it with the WITH keyword, give it a name, and then refer to that name like a table in the query that follows. CTEs do not create database objects, do not persist beyond the statement, and are not visible to other sessions — they exist purely to give intermediate results a name.
The mental model is "let-binding for SQL". Instead of nesting subqueries three layers deep, you hoist each step out, name it, and reference the name. The query that consumed five subqueries written inline becomes a top-to-bottom story: WITH active_customers AS (...), monthly_totals AS (...), ranked AS (...) SELECT * FROM ranked WHERE rn = 1.
WITH clause: anatomy of a CTE
The WITH clause comes immediately before the SELECT (or INSERT / UPDATE / DELETE) it modifies. It contains one or more comma-separated CTE definitions, each with a name, an optional column list, and a parenthesised query body.
-- Single CTE
WITH recent_orders AS (
SELECT customer_id, order_id, total
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT customer_id, SUM(total) AS spend_30d
FROM recent_orders
GROUP BY customer_id
ORDER BY spend_30d DESC;
-- Multiple CTEs (comma-separated, no second WITH)
WITH active_customers AS (
SELECT customer_id FROM customers WHERE status = 'active'
),
recent_orders AS (
SELECT customer_id, total
FROM orders
WHERE order_date >= CURRENT_DATE - INTERVAL '30 days'
)
SELECT ac.customer_id, SUM(ro.total) AS spend_30d
FROM active_customers ac
JOIN recent_orders ro USING (customer_id)
GROUP BY ac.customer_id;- A WITH block can define any number of CTEs separated by commas — only the first one carries the WITH keyword.
- A later CTE may reference earlier ones (forward references are not allowed).
- The optional column list AS (col1, col2, ...) renames the output columns; useful when the inner SELECT uses computed expressions.
- A CTE is referenced in the outer query exactly like a table or view: in FROM, in JOIN, in EXISTS, anywhere a relation is allowed.
Multiple, chained, and reused CTEs
The point of a CTE is composition. You can reference an earlier CTE from a later one, you can reference the same CTE multiple times in the outer query, and you can build a pipeline that reads top-to-bottom as a sequence of named steps.
WITH
active_customers AS (
SELECT customer_id, country
FROM customers
WHERE status = 'active'
),
customer_totals AS (
-- references active_customers
SELECT ac.customer_id, ac.country, COALESCE(SUM(o.total), 0) AS lifetime_value
FROM active_customers ac
LEFT JOIN orders o USING (customer_id)
GROUP BY ac.customer_id, ac.country
),
ranked AS (
-- references customer_totals
SELECT *, RANK() OVER (PARTITION BY country ORDER BY lifetime_value DESC) AS rnk
FROM customer_totals
)
SELECT country, customer_id, lifetime_value
FROM ranked
WHERE rnk <= 3
ORDER BY country, rnk;Recursive CTEs: traversing hierarchies and generating sequences
A recursive CTE is the SQL standard way to express anything that needs iteration: walking an org chart, expanding a bill-of-materials, finding the path between two nodes, generating a date series, or doing graph traversal. The syntax adds the RECURSIVE keyword (mandatory in PostgreSQL, optional in SQL Server and Oracle, required in MySQL 8) and a UNION ALL between two halves: the anchor (base case) and the recursive term (which references the CTE itself).
-- Org chart: every employee with their depth from the CEO
WITH RECURSIVE org_tree AS (
-- Anchor: the root (CEO has no manager)
SELECT employee_id, name, manager_id, 1 AS depth
FROM employees
WHERE manager_id IS NULL
UNION ALL
-- Recursive term: join the previous level back to employees
SELECT e.employee_id, e.name, e.manager_id, ot.depth + 1
FROM employees e
JOIN org_tree ot ON e.manager_id = ot.employee_id
)
SELECT depth, name FROM org_tree ORDER BY depth, name;
-- Generate a date series (Postgres syntax — many DBs have a built-in for this)
WITH RECURSIVE dates AS (
SELECT DATE '2026-01-01' AS d
UNION ALL
SELECT d + INTERVAL '1 day' FROM dates WHERE d < DATE '2026-01-31'
)
SELECT * FROM dates;- The anchor query runs first, exactly once. Its rows form the initial working set.
- The recursive term runs repeatedly. On each iteration it joins to the previous iteration's rows, producing a new set of rows; the CTE accumulates the union of all iterations.
- Recursion stops when the recursive term returns zero rows. If it never does, you have an infinite loop — guard with a depth column or LIMIT.
- Use UNION ALL, not UNION, in the recursive term — UNION (with deduplication) is much slower and rarely necessary.
- Cycles in the data (A → B → A) cause infinite loops. Track visited nodes in a path array (PostgreSQL) or with a depth limit.
CTE vs subquery vs derived table
Logically, a non-recursive CTE, an inline view (derived table) and a correlated subquery can all express the same thing. The choice is about readability, reuse, and — sometimes — optimiser behaviour.
- CTE — best for query decomposition, multi-step logic, and any time the intermediate result will be referenced more than once or named for documentation. Cost: in some engines an "optimisation fence" that prevents predicate pushdown (see materialization section).
- Derived table (subquery in FROM) — equivalent to a CTE referenced exactly once. Slightly more compact for one-off use; loses the named-step readability of a CTE.
- Correlated subquery (in SELECT or WHERE) — best for single-value lookups (EXISTS, IN, scalar subqueries). For anything returning a relation, prefer a CTE or join.
- Temporary table (CREATE TEMP TABLE) — only when you need an index on the intermediate result, multiple statements need to see it, or the dataset is huge and you want explicit control over materialisation.
Materialization: when does the optimiser inline a CTE?
Materialisation means computing the CTE once into a temporary structure, then reading from that structure for every reference. Inlining means substituting the CTE body wherever the CTE is referenced, allowing the planner to push predicates through and merge with the outer query. Inlining is almost always faster; materialisation matters only when a CTE has side effects, is non-deterministic, or is so expensive that recomputing it is worse than caching it.
- PostgreSQL ≤ 11: CTEs are always materialised (an "optimisation fence"). PostgreSQL 12+: CTEs are inlined by default if (a) referenced once and (b) side-effect free. Use WITH foo AS MATERIALIZED (...) or AS NOT MATERIALIZED to override.
- MySQL 8: non-recursive CTEs are inlined transparently. Recursive CTEs are always materialised.
- SQL Server: CTEs are always inlined (the engine treats them as inline view definitions). To force materialisation, use a temp table or table variable.
- Oracle: CTEs are inlined unless the planner decides otherwise; the /*+ MATERIALIZE */ and /*+ INLINE */ hints let you override.
-- PostgreSQL 12+: explicitly force materialisation when a CTE is expensive
-- and referenced multiple times
WITH expensive AS MATERIALIZED (
SELECT customer_id, complicated_calculation(...) AS score
FROM customers
)
SELECT * FROM expensive WHERE score > 100
UNION ALL
SELECT * FROM expensive WHERE score < -100;
-- Force inlining (PostgreSQL 12+) even when referenced multiple times
WITH cheap AS NOT MATERIALIZED (
SELECT customer_id, country FROM customers WHERE status = 'active'
)
SELECT ... FROM cheap c1 JOIN cheap c2 ON c1.country = c2.country;Performance: where CTEs help and where they hurt
- CTEs do not get indexes. If you need to look up rows in the intermediate result by key, the engine will scan the CTE every time — a temp table with an index is faster for large intermediates referenced many times.
- Predicate pushdown only works when the CTE is inlined. If your engine materialises (PostgreSQL ≤ 11, recursive CTEs everywhere), a WHERE in the outer query will not be pushed into the CTE body — write the filter inside the CTE itself.
- Recursive CTEs are O(N × depth × recursive-term cost). Always have an index on the join column used by the recursive term, or recursion becomes quadratic.
- A CTE referenced ten times in the outer query (with default inlining) is computed ten times. If the body is expensive, force materialisation or use a temp table.
- EXPLAIN ANALYZE (Postgres) / EXPLAIN FORMAT=JSON (MySQL) / actual execution plans (SQL Server) will show whether the CTE is materialised or inlined. Trust the plan, not folklore.
Five patterns where a CTE is always the right answer
- Top-N per group — compute a ROW_NUMBER() in a CTE, filter to rn ≤ N in the outer query. Window functions cannot be filtered in WHERE, so the CTE is structurally required.
- Hierarchy traversal — recursive CTE walks the parent → child relation. Standard pattern for org charts, threaded comments, file trees, BOMs.
- Sequence / date generation — recursive CTE produces a dense series of dates or integers when no built-in generator exists.
- Multi-step aggregation — break "first filter, then aggregate, then rank, then join back" into named CTEs that read like a recipe.
- Reusable expressions — same complex expression needed in multiple SELECTs of a UNION? Define it once in a CTE and reference it from each branch.
Dialect notes: PostgreSQL, MySQL, SQL Server, Oracle, SQLite
- PostgreSQL: full support since 8.4 (2009). RECURSIVE keyword is mandatory. MATERIALIZED / NOT MATERIALIZED hints since 12 (2019).
- MySQL: CTEs added in 8.0 (2018). MySQL 5.7 has no CTE support — emulate with derived tables. RECURSIVE keyword is mandatory. cte_max_recursion_depth defaults to 1000.
- SQL Server: CTEs since 2005. RECURSIVE keyword is optional (and not commonly used). MAXRECURSION default is 100; set OPTION (MAXRECURSION 0) for unlimited.
- Oracle: WITH clause since 9i; recursive WITH since 11gR2 (2009). RECURSIVE keyword is optional. Hints: /*+ MATERIALIZE */ and /*+ INLINE */.
- SQLite: full CTE support including RECURSIVE since 3.8.3 (2014). No materialisation hints.
Practice: 6 hands-on lessons in the CTEs module
Each topic above maps to a hands-on lesson in the Learning Mode curriculum. Lessons include runnable PostgreSQL examples, exercises against the HR, E-commerce and Banking schemas, and progressive difficulty from intermediate to expert.
Practice this in the editor
Frequently asked questions
What is a CTE in SQL?
A Common Table Expression (CTE) is a named, temporary result set defined with the WITH clause. It exists only for the duration of a single statement and is referenced like a table in the query that follows. CTEs do not create database objects and are invisible to other sessions; they exist to give intermediate results a name and decompose complex queries into readable, top-to-bottom steps.
What is the difference between a CTE and a subquery?
Logically, a non-recursive CTE and a derived table (subquery in FROM) can express the same thing. The differences are practical: a CTE has a name and can be referenced multiple times in the outer query without rewriting; a subquery is anonymous and only available where it is written. On modern engines (PostgreSQL 12+, MySQL 8, SQL Server, Oracle) the planner inlines most non-recursive CTEs, so performance is identical — pick the form that reads clearest.
When should I use a recursive CTE?
Use a recursive CTE whenever you need iteration that the SQL standard does not provide elsewhere: walking a parent/child hierarchy (org chart, comment thread, file tree, bill of materials), generating a sequence of dates or integers, computing transitive closure on a graph, or producing the path between two nodes. The pattern is always: an anchor query that returns the base rows, UNION ALL, and a recursive term that joins the CTE back to the source table.
Are CTEs slower than subqueries in PostgreSQL?
They were in PostgreSQL 11 and earlier, where CTEs acted as an "optimisation fence" — predicates in the outer query could not be pushed into the CTE body. From PostgreSQL 12 onwards, non-recursive CTEs are inlined by default when referenced once and side-effect free, so a CTE and a subquery produce identical plans. You can override the default with WITH foo AS MATERIALIZED (...) or AS NOT MATERIALIZED.
Does MySQL support CTEs?
MySQL added CTEs in version 8.0 (2018), including RECURSIVE. MySQL 5.7 and earlier have no CTE support; the standard workaround is to write derived tables (SELECT in the FROM clause) or to materialise intermediate results into temporary tables. The cte_max_recursion_depth system variable defaults to 1000 in MySQL 8.
Can a CTE reference itself?
Yes — that is exactly what a recursive CTE is. The WITH RECURSIVE form (RECURSIVE keyword required in PostgreSQL and MySQL, optional in SQL Server and Oracle) consists of an anchor query, UNION ALL, and a recursive term that references the CTE name. The recursive term runs repeatedly, joining its previous output back to the source until it returns zero rows.
How do I prevent infinite loops in a recursive CTE?
Three defences: (1) make sure the recursive term has a strict termination condition — for hierarchies that means the data has no cycles; (2) carry a depth column and add a WHERE depth < N predicate; (3) for graphs that may contain cycles, accumulate a path array and exclude already-visited nodes. SQL Server limits recursion to 100 by default (override with OPTION (MAXRECURSION N)); MySQL limits it via cte_max_recursion_depth. PostgreSQL has no built-in limit so guards in the query are essential.
When should I use a temp table instead of a CTE?
Use a temporary table when you need an index on the intermediate result, when multiple statements (not just one) need to read it, or when the dataset is so large that controlling materialisation explicitly is worth the extra DDL. CTEs cannot be indexed; if your outer query repeatedly looks up rows in the CTE by key, a temp table with the right index will be much faster.