Table Partitioning Strategies: Real-World
Module: Query Optimization & Performance
Orders table (1B rows, 500GB) - queries scan entire table, take 5 minutes. Partition by month (12 partitions). Query for last month scans 1 partition (8% of data), takes 20 seconds (15x faster). Archive old partitions by dropping them (instant vs hours of DELETE). Logs table (10B rows) - partition by day, drop old partitions daily, keep table size constant.
Uber: Trips Table Partitioning - 20x Query Speedup
Uber trips table (10B rows, 5TB) with queries filtering by trip date. Non-partitioned table scans all 10B rows, queries take 10 minutes. Partition by month (36 partitions for 3 years). Query for last month scans 1 partition (3% of data), takes 30 seconds (20x faster). Archive old partitions by dropping them (instant vs 2 days DELETE).
Implement range partitioning by trip_date with monthly partitions. Create partitions for last 36 months + next 3 months. Automate partition creation with cron job (1st of each month). Automate partition archiving - drop partitions older than 3 years. Queries now include date range in WHERE clause, enabling partition pruning. Result: 20x faster queries, instant archiving, table size stays constant.
-- Problem: Non-partitioned trips table
-- 10B rows, 5TB, queries scan entire table
-- Query performance before partitioning
EXPLAIN ANALYZE
SELECT driver_id, COUNT(*) as trip_count, SUM(fare) as total_fare
FROM trips
WHERE trip_date >= '2024-01-01' AND trip_date < '2024-02-01'
GROUP BY driver_id;
-- Seq Scan on trips (10B rows)
-- Time: 600 seconds (10 minutes)
-- Solution: Partition by month
CREATE TABLE trips_partitioned (
trip_id BIGSERIAL,
driver_id BIGINT NOT NULL,
rider_id BIGINT NOT NULL,
trip_date DATE NOT NULL,
fare DECIMAL(10,2),
PRIMARY KEY (trip_id, trip_date)
) PARTITION BY RANGE (trip_date);
-- Create 36 monthly partitions (3 years)
-- Automated script:
DO $$
DECLARE
start_date DATE;
end_date DATE;
partition_name TEXT;
BEGIN
FOR i IN 0..35 LOOP
start_date := DATE_TRUNC('month', CURRENT_DATE - INTERVAL '36 months') + (i || ' months')::INTERVAL;
end_date := start_date + INTERVAL '1 month';
partition_name := 'trips_' || TO_CHAR(start_date, 'YYYY_MM');
EXECUTE format(
'CREATE TABLE %I PARTITION OF trips_partitioned FOR VALUES FROM (%L) TO (%L)',
partition_name, start_date, end_date
);
END LOOP;