Normalization (1NF to BCNF): Real-World
Module: Schema Design & Advanced DDL
Stripe normalizes customer data to 3NF: customers table (id, name, email), addresses table (id, customer_id, street, city), payment_methods table (id, customer_id, type, last4). Prevents update anomalies (change email once, not in 100 orders). Amazon normalizes product catalog: products table, categories table, product_categories junction table. Prevents deletion anomalies (delete product, keep category). Netflix normalizes viewing history: users table, movies table, viewing_history table (user_id, movie_id, watched_at). Prevents insertion anomalies (add movie without user watching it). Banks normalize account data to BCNF for regulatory compliance. E-commerce sites normalize to 3NF for inventory management.
Stripe: Normalize Customer Data to 3NF for Payment Processing
Stripe processes 1 billion payment transactions daily. Initial design stored customer data (name, email, address) in payments table. Problem: Customer data duplicated in millions of payment rows, update customer email requires updating millions of rows (update anomaly), cannot add customer without payment (insertion anomaly). Solution: Normalize to 3NF - separate customers, addresses, payment_methods, payments tables.
Stripe normalization: (1) customers table (id, name, email, created_at), (2) addresses table (id, customer_id, street, city, state, zip, country), (3) payment_methods table (id, customer_id, type, last4, exp_month, exp_year), (4) payments table (id, customer_id, payment_method_id, amount, status, created_at). Benefits: Update customer email once (not in millions of payments), add customer without payment, delete payment but keep customer data. Performance: Index customer_id in payments table for fast joins (< 10ms). Real-world: Stripe processes 30K+ payments per second with normalized schema.
-- Stripe normalized schema (simplified)
CREATE TABLE customers (
customer_id VARCHAR(50) PRIMARY KEY,
email VARCHAR(100) UNIQUE NOT NULL,
name VARCHAR(100),
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE addresses (
address_id VARCHAR(50) PRIMARY KEY,
customer_id VARCHAR(50) REFERENCES customers(customer_id) ON DELETE CASCADE,
street VARCHAR(200),
city VARCHAR(100),
state VARCHAR(50),
zip VARCHAR(20),
country VARCHAR(50),
is_default BOOLEAN DEFAULT false
);
CREATE TABLE payment_methods (
payment_method_id VARCHAR(50) PRIMARY KEY,
customer_id VARCHAR(50) REFERENCES customers(customer_id) ON DELETE CASCADE,
type VARCHAR(20), -- card, bank_account
last4 VARCHAR(4),
exp_month INT,
exp_year INT,
is_default BOOLEAN DEFAULT false
);
CREATE TABLE payments (
payment_id VARCHAR(50) PRIMARY KEY,
customer_id VARCHAR(50) REFERENCES customers(customer_id),
payment_method_id VARCHAR(50) REFERENCES payment_methods(payment_method_id),
amount DECIMAL(10,2) NOT NULL,
currency VARCHAR(3) DEFAULT 'USD',
status VARCHAR(20), -- succeeded, failed, pending
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
-- Indexes for performance