Normalization (1NF to BCNF): Performance
Module: Schema Design & Advanced DDL
Normalization Performance:
Write Performance (Improved):
- Unnormalized: Update customer email in 100 order rows (100 writes)
- Normalized: Update customer email in 1 customer row (1 write)
- Normalization improves write performance (fewer rows to update)
Read Performance (Degraded):
- Unnormalized: SELECT * FROM orders (1 table scan)
- Normalized: SELECT * FROM orders JOIN customers JOIN order_items JOIN products (4 table scans + 3 joins)
- Normalization degrades read performance (more joins)
Storage (Improved):
- Unnormalized: Customer name stored in every order row (100 orders = 100 copies)
- Normalized: Customer name stored once in customers table (100 orders = 1 copy)
- Normalization saves storage (no redundancy)
Consistency (Improved):
- Unnormalized: Update customer email, might miss some order rows (inconsistent)
- Normalized: Update customer email once, all orders see new email (consistent)
- Normalization improves consistency (single source of truth)
Query Complexity (Increased):
- Unnormalized: Simple queries (SELECT * FROM orders)
- Normalized: Complex queries (JOIN multiple tables)
- Normalization increases query complexity
Indexing:
- Normalized tables: Index foreign keys for fast joins
- Example: CREATE INDEX idx_orders_customer_id ON orders(customer_id);
- Without index: JOIN scans entire table (slow)
- With index: JOIN uses index seek (fast)
Real-world Performance:
- Stripe: Normalized to 3NF, indexes all foreign keys, 30K+ transactions/sec
- Amazon: Normalized to 3NF for orders, denormalized for product catalog (read-heavy)
- Netflix: Normalized to 3NF for user data, denormalized for recommendations (read-heavy)
Optimization Strategies:
1. Normalize to 3NF by default
2. Identify slow queries (> 100ms)
3. Selectively denormalize hot paths
4. Use materialized views for expensive joins
5. Cache frequently accessed data
6. Monitor query performance continuously
Index foreign keys for fast joins: CREATE INDEX idx_orders_customer_id ON orders(customer_id)
Use EXPLAIN to analyze joins: Identify slow joins, add indexes or denormalize
Materialize expensive joins: CREATE MATERIALIZED VIEW for frequently joined tables