SQL Practice Logo

SQLPractice Online

Normalization (1NF to BCNF): Concept

Module: Schema Design & Advanced DDL

Normalization organizes database tables to reduce redundancy and prevent anomalies. Think of normalization like organizing a messy closet: unnormalized is clothes thrown everywhere (duplicates, hard to find), normalized is clothes sorted by type in labeled drawers (no duplicates, easy to find). Each normal form (1NF, 2NF, 3NF, BCNF) removes specific type of redundancy. The goal: single source of truth for each piece of data. Trade-off: normalized tables require joins to reconstruct data, but updates are simple (change once, not in 100 places).

Normalization Deep Dive:

Functional Dependencies:

Before understanding normal forms, understand functional dependencies. A functional dependency X → Y means: if you know X, you can determine Y uniquely.

Example: student_id → student_name

If you know student_id = 123, you can determine student_name = "John" uniquely. One student_id maps to exactly one student_name.

Example: order_id → customer_id, order_date

If you know order_id = 456, you can determine customer_id and order_date uniquely.

Determinant: The left side of functional dependency (X in X → Y). The attribute that determines other attributes.

Candidate Key: Minimal set of attributes that uniquely identifies a row. Can have multiple candidate keys, choose one as primary key.

Example: employees table

- employee_id → name, email, department (employee_id is candidate key)

- email → employee_id, name, department (email is also candidate key)

- Both employee_id and email can uniquely identify employee

First Normal Form (1NF):

Definition: Table is in 1NF if:

1. Each cell contains atomic (single) value (no lists, no arrays)

2. No repeating groups (no product1, product2, product3 columns)

3. Each row is unique (has primary key)

Unnormalized Example (violates 1NF):

orders table:

| order_id | customer_name | products | prices |

|----------|---------------|----------------------------|---------------|

| 1 | John | Laptop, Mouse, Keyboard | 1000, 20, 50 |

| 2 | Jane | Phone, Case | 800, 30 |

Problems:

- products column contains multiple values (not atomic)

- prices column contains multiple values (not atomic)

- Cannot query "all orders containing Laptop" easily

- Cannot add product without order (insertion anomaly)

Convert to 1NF (atomic values):

orders table:

| order_id | customer_name | product | price |

|----------|---------------|----------|-------|

| 1 | John | Laptop | 1000 |

| 1 | John | Mouse | 20 |

| 1 | John | Keyboard | 50 |

| 2 | Jane | Phone | 800 |

| 2 | Jane | Case | 30 |

Now in 1NF: