Denormalization Strategies: Real-World

Module: Schema Design & Advanced DDL

You're at an e-commerce company where the product listing page loads in 3 seconds because every query joins 5 tables to show product name, category, brand, and inventory count. Your normalized schema is textbook perfect, but customers are leaving before the page loads.

After denormalizing by adding redundant category_name and brand_name columns to the products table, the same page loads in 200ms. You eliminated 2 joins. But now when a category name changes, you need to update potentially 10,000 product rows instead of 1 category row.

This is the denormalization trade-off: faster reads, slower writes, more storage, and consistency challenges. Companies like Amazon make this trade-off thousands of times across their systems. A product detail page might denormalize 20 different attributes to avoid joins, because showing a product fast matters more than the extra work when updating product data.

Stripe denormalizes customer email addresses into their transactions table. Every transaction stores the customer's email at the time of purchase. This means they can query transactions without joining to customers, and they have a historical record even if the customer changes their email later. The trade-off: 50 million extra email strings stored, but transaction queries run 10x faster.

Netflix denormalizes viewing history with show titles and thumbnail URLs. When you see "Continue Watching," that query hits one table, not three. When a show's title changes (rare), they update millions of viewing history rows. The trade-off is worth it because viewing history queries happen billions of times per day.

Stripe: Denormalizing Customer Data in Transactions

Stripe processes billions of payment transactions. Every transaction needs customer email for receipts and dispute resolution. Originally, transactions table had customer_id and joined to customers table. With 1B transactions and 100M customers, this join was expensive. Transaction queries took 800ms on average.

Stripe denormalized by storing customer email, name, and address directly in the transactions table at the time of transaction. This creates a historical snapshot. Even if a customer changes their email later, the transaction record shows the email used at purchase time. This is critical for compliance and dispute resolution.

-- Denormalized transactions table

CREATE TABLE transactions (

transaction_id VARCHAR(50) PRIMARY KEY,

customer_id VARCHAR(50),

-- Denormalized customer data (snapshot at transaction time)

customer_email VARCHAR(255),

customer_name VARCHAR(200),

customer_address TEXT,

amount DECIMAL(12,2),

currency VARCHAR(3),

created_at TIMESTAMP

);

-- Insert transaction with customer snapshot

INSERT INTO transactions (

transaction_id, customer_id, customer_email, customer_name,

customer_address, amount, currency, created_at

)

SELECT

'txn_' || gen_random_uuid(),

c.customer_id,

c.email,

c.name,

c.billing_address,

100.00,

'USD',

CURRENT_TIMESTAMP

FROM customers c

WHERE c.customer_id = 'cus_123';

-- Query transactions without join

SELECT

transaction_id,

customer_email,