SQL Practice Logo

SQLPractice Online

CROSS JOIN: Interview

Module: Joins & Relationships

What is CROSS JOIN and how does it differ from INNER JOIN?

CROSS JOIN creates a cartesian product - every row from the left table is paired with every row from the right table, resulting in N × M rows. It has NO ON clause because there's no matching condition. INNER JOIN, on the other hand, requires an ON clause and only returns rows where the join condition is met (matching rows). CROSS JOIN is for generating all possible combinations, while INNER JOIN is for combining related data based on a relationship.

When would you intentionally use CROSS JOIN in a real application?

CROSS JOIN is intentionally used when you need to generate all possible combinations: 1) Product variants - combining products with sizes, colors, materials to create SKUs. 2) Calendar tables - generating date-shift combinations for scheduling systems. 3) Report dimensions - creating complete matrices (regions × quarters) even when some cells have no data. 4) Test data generation - creating comprehensive test case matrices (users × scenarios). 5) Gap analysis - generating all possible combinations, then finding which ones are missing from actual data using LEFT JOIN with WHERE IS NULL.

How do you calculate the result size of a CROSS JOIN, and why is this important?

Result size = N × M, where N is the number of rows in the left table and M is the number of rows in the right table. For multiple CROSS JOINs: N × M × P × Q. This calculation is CRITICAL before running the query because CROSS JOIN can create massive result sets that crash databases. Examples: 10×10=100 (safe), 1,000×1,000=1,000,000 (caution), 10,000×10,000=100,000,000 (danger - will likely crash). Always calculate first, and if the result is > 100,000 rows, reconsider the approach or add aggressive WHERE filtering.

What is the difference between an intentional CROSS JOIN and an accidental cartesian product?

An intentional CROSS JOIN uses explicit CROSS JOIN syntax and is used deliberately to generate all combinations for a specific purpose (product variants, calendar tables, etc.). An accidental cartesian product happens when you forget the ON clause in an INNER JOIN, creating an unintended CROSS JOIN. Both produce the same result (N × M rows), but the first is correct and purposeful, while the second is a bug that can crash the database. The key difference is intent and awareness. Always use explicit CROSS JOIN syntax when you want a cartesian product to make your intent clear.

How can you optimize a CROSS JOIN query that is running slowly?

Optimization strategies: 1) Add WHERE clause to filter tables BEFORE the cartesian product (reduce N and M). 2) Select only needed columns instead of SELECT * (reduce data transfer). 3) Ensure both tables are small (< 1,000 rows each ideally). 4) Test on small datasets first before scaling up. 5) Consider alternatives: generate combinations in application code, use recursive CTEs, or pre-compute and materialize results in a table. 6) For repeated queries, create a materialized view or permanent table with the combinations. 7) If filtering reduces one table significantly, apply that filter in a subquery or CTE first.

Explain the pattern of using CROSS JOIN with LEFT JOIN to find missing combinations.

This is a powerful gap analysis pattern: Step 1) Use CROSS JOIN to generate ALL possible combinations (the "universe" of what should exist). Step 2) Use LEFT JOIN to match against actual data. Step 3) Use WHERE right_table.id IS NULL to find combinations that don't exist in actual data. Example: CROSS JOIN products and sizes to get all possible variants, LEFT JOIN with inventory, WHERE inventory.id IS NULL finds missing stock. This pattern is essential for finding gaps in data - missing product variants, unscheduled time slots, incomplete test coverage, etc.

Write a query to generate all possible product-size-color combinations for products in the "Clothing" category.

SELECT

p.id AS product_id,

p.name AS product_name,

s.id AS size_id,

s.name AS size_name,

c.id AS color_id,

c.name AS color_name,

CONCAT(p.name, ' - ', s.name, ' - ', c.name) AS variant_description

FROM products p

CROSS JOIN sizes s

CROSS JOIN colors c

WHERE p.category = 'Clothing'

AND p.active = true

ORDER BY p.name, s.sort_order, c.name;

This query chains two CROSS JOINs to create three-way combinations. The WHERE clause filters to only Clothing category and active products, reducing the result size. The CONCAT creates a readable variant description. ORDER BY ensures logical sorting. If products has 50 clothing items, sizes has 5 options, and colors has 8 options: 50 × 5 × 8 = 2,000 variants.

Create a query to find which product-size combinations exist in the products and sizes tables but are missing from the inventory table.

SELECT

p.name AS product_name,

s.name AS size_name,

p.category,

'MISSING FROM INVENTORY' AS status

FROM products p

CROSS JOIN sizes s

LEFT JOIN inventory i

ON p.id = i.product_id

AND s.id = i.size_id

WHERE i.id IS NULL