SELECT & Data Retrieval: Mistakes
Module: SQL Fundamentals
SELECT first_name last_name FROM employees;
SELECT first_name, last_name FROM employees;
The comma between the column names is missing, so the SQL parser treats "last_name" as an alias for "first_name" and returns one column instead of two. Always separate column names with commas.
Format one column per line to easily spot missing commas
High
No error in most databases: the query runs and returns a single column of first_name values labeled "last_name"
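The alias behaviour described above can be checked with a small sketch. It assumes Python's built-in sqlite3 module and a throwaway in-memory employees table with one made-up row; other databases should behave the same way for this query, but that is not verified here.

```python
import sqlite3

# Hypothetical in-memory table, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT, last_name TEXT)")
conn.execute("INSERT INTO employees VALUES ('Ada', 'Lovelace')")

# Missing comma: last_name is parsed as an alias for first_name.
cur = conn.execute("SELECT first_name last_name FROM employees")
buggy_columns = [d[0] for d in cur.description]
buggy_rows = cur.fetchall()

# With the comma: two separate columns.
cur = conn.execute("SELECT first_name, last_name FROM employees")
fixed_columns = [d[0] for d in cur.description]
fixed_rows = cur.fetchall()

print(buggy_columns, buggy_rows)
print(fixed_columns, fixed_rows)
```

The first query silently returns the first name under the wrong label, which is why this mistake is easy to miss in review.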
SELECT * FROM employees WHERE salary > 50000;
SELECT employee_id, first_name, last_name, salary, department FROM employees WHERE salary > 50000;
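SELECT * is risky in production: (1) It returns columns the caller does not need, wasting I/O, memory, and network bandwidth (2) It breaks code that depends on column order or count when the table structure changes (3) It can prevent index-only scans, because the query asks for columns that no index covers (4) It can expose sensitive columns by accident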
Always explicitly list columns. Your future self will thank you.
Critical
No error, but bad practice
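Points (2) and (4) can be illustrated with a short sketch. It assumes Python's sqlite3 module, a simplified employees table, and a hypothetical national_id column added later; none of these names beyond employees, employee_id, first_name, and salary come from the original.

```python
import sqlite3

# Simplified, hypothetical schema for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, first_name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employees VALUES (1, 'Ada', 60000)")

star_sql = "SELECT * FROM employees WHERE salary > 50000"
explicit_sql = (
    "SELECT employee_id, first_name, salary "
    "FROM employees WHERE salary > 50000"
)

star_before = conn.execute(star_sql).fetchone()
explicit_before = conn.execute(explicit_sql).fetchone()

# A later schema change adds a sensitive column (hypothetical name).
conn.execute("ALTER TABLE employees ADD COLUMN national_id TEXT DEFAULT 'secret'")

star_after = conn.execute(star_sql).fetchone()
explicit_after = conn.execute(explicit_sql).fetchone()

print(star_before, star_after)          # row shape changes
print(explicit_before, explicit_after)  # row shape is stable
```

Code that unpacks the SELECT * row into three variables would now fail, and the new column is returned to every caller whether or not it should see it.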
SELECT DISTINCT first_name, COUNT(*) FROM employees;
SELECT first_name, COUNT(*) FROM employees GROUP BY first_name;
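The problem is not DISTINCT itself but selecting a non-aggregated column (first_name) next to an aggregate (COUNT, SUM, etc.) without GROUP BY. DISTINCT only removes duplicate rows AFTER the select list has been evaluated, so it cannot produce per-name counts. Use GROUP BY to get one aggregated row per group.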
Use GROUP BY for aggregations, DISTINCT for simple duplicate removal
High
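Error in most databases (first_name is neither aggregated nor in GROUP BY); SQLite, and MySQL with ONLY_FULL_GROUP_BY disabled, instead return a single row with an arbitrary first_name and the total count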
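The "incorrect results" case can be reproduced with a minimal sketch, assuming Python's sqlite3 module and three made-up rows. SQLite is used here precisely because it accepts the faulty query; stricter databases would raise an error instead.

```python
import sqlite3

# Hypothetical sample data: two employees named Ada, one named Bob.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (first_name TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?)", [("Ada",), ("Ada",), ("Bob",)]
)

# Faulty query: SQLite collapses the whole table into one row, pairing
# the total count with an arbitrarily chosen first_name.
mixed_rows = conn.execute(
    "SELECT DISTINCT first_name, COUNT(*) FROM employees"
).fetchall()

# Corrected query: one row per name. ORDER BY is added here only to
# make the printed output stable.
grouped_rows = conn.execute(
    "SELECT first_name, COUNT(*) FROM employees "
    "GROUP BY first_name ORDER BY first_name"
).fetchall()

print(mixed_rows)
print(grouped_rows)
```

The faulty query reports a count of 3 next to a single name, which looks plausible at a glance but answers a different question.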
SELECT * FROM employees WHERE UPPER(department) = 'SALES';
SELECT * FROM employees WHERE department = 'Sales';
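Wrapping a column in a function in the WHERE clause (UPPER, LOWER, YEAR, etc.) usually prevents the database from using an ordinary index on that column. It must evaluate the function for EVERY row, which means a full table scan. Solution: normalize case at INSERT time, then compare the column directly. Illustrative figures for 1M rows: about 5000ms for the scan vs 50ms with the index.
Avoid wrapping indexed columns in functions in WHERE. Normalize data at insert time, not at query time; in the figures above that is roughly a 100x difference. If the data cannot be normalized, some databases support expression (function-based) indexes as an alternative.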
Critical
No error, but a full table scan, roughly 100x slower in the example figures
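The difference in access path can be inspected without timing anything. The sketch below assumes Python's sqlite3 module, a simplified employees table, and a hypothetical index named idx_employees_department; it uses SQLite's EXPLAIN QUERY PLAN, whose wording varies between SQLite versions and differs from other databases' plan output.

```python
import sqlite3

# Simplified, hypothetical schema with an ordinary index on department.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employees ("
    "employee_id INTEGER PRIMARY KEY, first_name TEXT, department TEXT)"
)
conn.execute("CREATE INDEX idx_employees_department ON employees (department)")


def query_plan(sql):
    """Return SQLite's plan description for a query as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable detail is the last field of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_with_function = query_plan(
    "SELECT * FROM employees WHERE UPPER(department) = 'SALES'"
)
plan_direct = query_plan(
    "SELECT * FROM employees WHERE department = 'Sales'"
)

print(plan_with_function)  # a SCAN of the table
print(plan_direct)         # a SEARCH using the index
```

Checking the plan this way is a cheap habit before shipping a query against a large table.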
SELECT * FROM products LIMIT 10;
SELECT * FROM products ORDER BY product_id LIMIT 10;
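LIMIT without ORDER BY returns arbitrary rows, and the result can change between runs of the same query. The database returns whichever 10 rows it reaches first; this often matches physical storage order, but that order is not guaranteed and can shift after updates or a change of query plan. Always pair LIMIT with ORDER BY for consistent, predictable "top N" results. This is critical for pagination.
Always ORDER BY an indexed column before LIMIT, ideally a unique one such as the primary key, since ties in the sort column leave the order arbitrary again. This ensures consistent results and enables efficient pagination.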
High
No error, but unpredictable results
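A minimal pagination sketch, assuming Python's sqlite3 module, a products table with made-up rows inserted out of order, and a hypothetical fetch_page helper. It uses LIMIT with OFFSET, which is one common approach and not the only one.

```python
import sqlite3

# Hypothetical sample data, deliberately inserted out of id order.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(5, "e"), (2, "b"), (6, "f"), (1, "a"), (4, "d"), (3, "c")],
)


def fetch_page(page, page_size=3):
    """Return the product_ids on a 1-based page, ordered by the unique key."""
    offset = (page - 1) * page_size
    rows = conn.execute(
        "SELECT product_id FROM products "
        "ORDER BY product_id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]


page_one = fetch_page(1)
page_two = fetch_page(2)
print(page_one, page_two)
```

Because product_id is unique, every row has a fixed position in the ordering, so pages never overlap or skip rows as long as the data does not change between requests.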