SQL Practice Logo

SQLPractice Online

Table Partitioning Strategies: Real-World

Module: Query Optimization & Performance

Orders table (1B rows, 500GB) - queries scan entire table, take 5 minutes. Partition by month (12 partitions). Query for last month scans 1 partition (8% of data), takes 20 seconds (15x faster). Archive old partitions by dropping them (instant vs hours of DELETE). Logs table (10B rows) - partition by day, drop old partitions daily, keep table size constant.

Uber: Trips Table Partitioning - 20x Query Speedup

Uber trips table (10B rows, 5TB) with queries filtering by trip date. Non-partitioned table scans all 10B rows, queries take 10 minutes. Partition by month (36 partitions for 3 years). Query for last month scans 1 partition (3% of data), takes 30 seconds (20x faster). Archive old partitions by dropping them (instant vs 2 days DELETE).

Implement range partitioning by trip_date with monthly partitions. Create partitions for last 36 months + next 3 months. Automate partition creation with cron job (1st of each month). Automate partition archiving - drop partitions older than 3 years. Queries now include date range in WHERE clause, enabling partition pruning. Result: 20x faster queries, instant archiving, table size stays constant.

-- Problem: Non-partitioned trips table

-- 10B rows, 5TB, queries scan entire table

-- Query performance before partitioning

EXPLAIN ANALYZE

SELECT driver_id, COUNT(*) as trip_count, SUM(fare) as total_fare

FROM trips

WHERE trip_date >= '2024-01-01' AND trip_date < '2024-02-01'

GROUP BY driver_id;

-- Seq Scan on trips (10B rows)

-- Time: 600 seconds (10 minutes)

-- Solution: Partition by month

CREATE TABLE trips_partitioned (

trip_id BIGSERIAL,

driver_id BIGINT NOT NULL,

rider_id BIGINT NOT NULL,

trip_date DATE NOT NULL,

fare DECIMAL(10,2),

PRIMARY KEY (trip_id, trip_date)

) PARTITION BY RANGE (trip_date);

-- Create 36 monthly partitions (3 years)

-- Automated script:

DO $$

DECLARE

start_date DATE;

end_date DATE;

partition_name TEXT;

BEGIN

FOR i IN 0..35 LOOP

start_date := DATE_TRUNC('month', CURRENT_DATE - INTERVAL '36 months') + (i || ' months')::INTERVAL;

end_date := start_date + INTERVAL '1 month';

partition_name := 'trips_' || TO_CHAR(start_date, 'YYYY_MM');

EXECUTE format(

'CREATE TABLE %I PARTITION OF trips_partitioned FOR VALUES FROM (%L) TO (%L)',

partition_name, start_date, end_date

);

END LOOP;