SQL Practice Logo

SQLPractice Online

Index Types Deep Dive: Real-World

Module: Query Optimization & Performance

At Uber, choosing the right index type reduced geospatial query time from 800ms to 12ms. At Reddit, GIN indexes made full-text search 40x faster. Index type selection is the difference between a slow app and a fast one.

Reddit - Full-Text Search Optimization

Full-text search across 50M posts and comments was taking 5-10 seconds. Users were frustrated with slow search results. Challenge: Migrate from B-tree LIKE queries to GIN indexes.

Migrated from B-tree with LIKE queries to GIN indexes with PostgreSQL full-text search (to_tsvector/to_tsquery). Added partial indexes for recent content (last 90 days) which covers 80% of searches.

-- Before: B-tree with LIKE (slow)

CREATE INDEX idx_content_old ON posts(content);

SELECT * FROM posts WHERE content LIKE '%postgresql%';

-- Time: 5000ms (full index scan)

-- After: GIN with full-text search (fast)

CREATE INDEX idx_content_fts ON posts

USING gin(to_tsvector('english', title || ' ' || content));

-- Partial index for recent posts (80% of queries)

CREATE INDEX idx_content_recent ON posts

USING gin(to_tsvector('english', title || ' ' || content))

WHERE created_at > NOW() - INTERVAL '90 days';

-- Query with ranking

SELECT

post_id,

title,

ts_rank(to_tsvector('english', title || ' ' || content),

to_tsquery('english', 'postgresql & performance')) as rank

FROM posts

WHERE to_tsvector('english', title || ' ' || content)

@@ to_tsquery('english', 'postgresql & performance')

AND created_at > NOW() - INTERVAL '90 days'

ORDER BY rank DESC

LIMIT 50;

-- Time: 120ms (40x faster!)

40x performance improvement (5000ms to 120ms)

Search relevance improved with ranking

Partial index reduced storage by 75%

User satisfaction increased significantly

Lessons: GIN essential for full-text, partial indexes save storage, ranking improves relevance, VACUUM regularly

PostgreSQL

Uber - Geospatial Driver Matching

Driver-rider matching was taking 800ms for "find drivers within 5km" query. With 10K requests/sec during peak, this was causing timeouts and poor user experience. Challenge: Optimize location-based queries.

Implemented GiST spatial indexes with PostGIS for geospatial queries. Added covering indexes to avoid table lookups. Used partial indexes for online drivers only.

-- Before: B-tree on lat/lon (very slow)

CREATE INDEX idx_lat ON drivers(latitude);

CREATE INDEX idx_lon ON drivers(longitude);