Partitioning & Sharding: Examples

Module: Schema Design & Advanced DDL

Range Partitioning by Date

intermediate

Orders table with 500M rows. Queries filtering by date scan entire table (2 minutes). Partition by year to enable partition pruning.

CREATE TABLE orders (

order_id BIGSERIAL,

order_date DATE NOT NULL,

customer_id BIGINT,

total DECIMAL(10,2)

) PARTITION BY RANGE (order_date);

CREATE TABLE orders_2022 PARTITION OF orders

FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');

CREATE TABLE orders_2023 PARTITION OF orders

FOR VALUES FROM ('2023-01-01') TO ('2024-01-01');

CREATE TABLE orders_2024 PARTITION OF orders

FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE INDEX ON orders_2022(customer_id);

CREATE INDEX ON orders_2023(customer_id);

CREATE INDEX ON orders_2024(customer_id);

SELECT * FROM orders

WHERE order_date >= '2024-01-01' AND order_date < '2024-02-01';

EXPLAIN SELECT * FROM orders WHERE order_date >= '2024-01-01';

Query scans only orders_2024 partition (50M rows instead of 500M). Query time: 2 minutes -> 10 seconds (12x faster). Index size per partition: 5GB instead of 50GB.

Partition pruning automatically skips orders_2022 and orders_2023. Database only scans orders_2024. Smaller partition means smaller index, faster scans.

PostgreSQL

graph TD

A[Query: WHERE order_date >= 2024-01-01] --> B{Partition Pruning}

B -->|Skip| C[orders_2022: 200M rows]

B -->|Skip| D[orders_2023: 200M rows]

B -->|Scan| E[orders_2024: 100M rows]

E --> F[Result in 10 seconds]

G[Without Partitioning] --> H[Scan all 500M rows]

H --> I[Result in 2 minutes]

style F fill:#90EE90

style I fill:#FFB6C6

List Partitioning by Region

intermediate

Sales data for multiple regions. Queries typically filter by region. Partition by region for faster queries and easier regional data management.

CREATE TABLE sales (

sale_id BIGSERIAL,

region VARCHAR(20) NOT NULL,