DISTINCT & ALL: Interview

Module: Aggregate Functions & Grouping

What is the difference between COUNT(*) and COUNT(DISTINCT column)?

COUNT(*) counts all rows including duplicates. COUNT(DISTINCT column) counts only unique values in that column, removing duplicates before counting.

When should you use DISTINCT vs GROUP BY?

Use DISTINCT to remove duplicate rows from simple SELECT. Use GROUP BY when you need aggregates (COUNT, SUM, AVG) per group. GROUP BY is more powerful for analytical queries.

What are the performance implications of DISTINCT?

DISTINCT requires sorting or hashing to identify duplicates, which is expensive on large datasets. Index columns used in DISTINCT. Consider if duplicates indicate data quality issues.

Count total orders and unique customers

SELECT COUNT(*) AS total_orders, COUNT(DISTINCT customer_id) AS unique_customers FROM orders;

Shows difference between total rows and unique values. Reveals how many orders per customer on average.

Find unique product categories

SELECT DISTINCT category FROM products ORDER BY category;

Returns deduplicated list of categories. Useful for dropdown lists and category navigation.