SQL Practice Logo

SQLPractice Online

Real-World Database Selection: Real-World

Module: Database-Specific Features

Database selection is the most critical technical decision for any project. Wrong choice = expensive migration later. Real examples: (1) Startups: Choose PostgreSQL or MySQL (proven, cheap, good ecosystem). Instagram started with PostgreSQL, scaled to 2B users. (2) Analytics: Choose PostgreSQL (window functions, CTEs) or specialized databases (Redshift, BigQuery). Airbnb uses PostgreSQL for analytics. (3) Mobile apps: Choose SQLite (embedded, no server, offline-first). WhatsApp uses SQLite for 2B users. (4) Enterprise: Choose SQL Server (Windows integration, .NET ecosystem) or Oracle (mission-critical, 24/7 support). Banks use Oracle. (5) Microservices: Choose PostgreSQL or MySQL (one database per service). Uber uses PostgreSQL for 50+ microservices. Trade-offs: PostgreSQL has better features (JSONB, arrays, full-text search) but MySQL is simpler. SQL Server has great tooling but expensive licensing. Oracle is most powerful but costs $47K/core/year. Decision factors: Budget (PostgreSQL free, Oracle $47K/core), team skills (MySQL easier than PostgreSQL), features needed (PostgreSQL for complex queries), scale (both handle billions of rows).

Instagram: PostgreSQL for 2 Billion Users

Instagram is a photo-sharing app with 2B users, 100M photos/day, 4.2B likes/day. Challenge: Choose database that scales to billions of users while maintaining fast response times. Requirements: High write throughput, high read throughput, ACID transactions, complex queries. Decision: PostgreSQL.

Instagram chose PostgreSQL over MySQL for MVCC (non-blocking reads during writes). Architecture: 1000+ PostgreSQL servers with sharding (2000 shards by user_id), read replicas (3-5 per shard), PgBouncer connection pooling (50K connections to 100 database connections), Redis caching (90% hit rate). Performance: 1.2K photos/sec writes, billions of reads/day, response time under 100ms for 99% of requests.

-- Instagram sharding by user_id

CREATE TABLE photos (

photo_id BIGSERIAL PRIMARY KEY,

user_id BIGINT,

image_url TEXT,

created_at TIMESTAMP DEFAULT NOW()

);

CREATE INDEX idx_photos_user_id ON photos(user_id);

-- Feed query with MVCC (non-blocking)

SELECT p.photo_id, p.image_url, u.username

FROM photos p

JOIN follows f ON p.user_id = f.following_id

JOIN users u ON p.user_id = u.user_id

WHERE f.follower_id = 123

ORDER BY p.created_at DESC LIMIT 20;

2B users, 100M photos/day, 4.2B likes/day

PostgreSQL MVCC for non-blocking reads

1000+ servers with sharding, 3-5 read replicas per shard

PgBouncer: 50K connections to 100 database connections

Redis: 90% cache hit rate

Performance: under 100ms for 99% of requests

PostgreSQL

Uber: MySQL to PostgreSQL Migration

Uber migrated from MySQL to PostgreSQL in 2016. Problem: MySQL locking caused 100-500ms delays in ride matching. Solution: PostgreSQL MVCC provides non-blocking reads. Result: 10x faster ride matching (10-50ms).

Uber migration process: Schema conversion (AUTO_INCREMENT to SERIAL), pgloader for data migration (1TB in 4 hours), dual-write to both databases, gradual cutover service by service, kept MySQL as backup for 2 weeks. Duration: 6 months. Result: 50K writes/sec without blocking reads, 10x faster ride matching.

-- MySQL problem: locking blocks reads

UPDATE drivers SET latitude = 37.7749 WHERE driver_id = 123;

-- Blocks: SELECT * FROM drivers WHERE status = 'available';

-- PostgreSQL solution: MVCC non-blocking

UPDATE drivers SET latitude = 37.7749 WHERE driver_id = 123;

-- Doesn't block: SELECT * FROM drivers WHERE status = 'available';

100M users, 15M trips/day, 50K writes/sec

Problem: MySQL locking caused 100-500ms delays

Solution: PostgreSQL MVCC

Migration: 6 months with dual-write

Result: 10x faster (10-50ms vs 100-500ms)