Statistics & Query Planning: Performance
Module: Query Optimization & Performance
**Statistics Update Cost:**
Small table (1K rows):
- ANALYZE time: 10ms
- Impact: negligible
- Update frequency: after every change
Medium table (1M rows):
- ANALYZE time: 500ms
- Impact: low
- Update frequency: after 10K+ row changes
Large table (100M rows):
- ANALYZE time: 30 seconds
- Impact: moderate
- Update frequency: after 1M+ row changes or daily
Very large table (1B+ rows):
- ANALYZE time: 5+ minutes
- Impact: high
- Update frequency: weekly or after major ETL
- Consider: ANALYZE with sampling
**Sampling Trade-offs:**
Full scan (ANALYZE with FULLSCAN):
- Pros: Most accurate statistics
- Cons: Slow on large tables
- Use: Critical tables, after major changes
Sampling (default):
- Pros: Fast, good enough for most cases
- Cons: May miss rare values
- Use: Regular maintenance
PostgreSQL default: 300 * default_statistics_target pages
- default_statistics_target = 100
- Samples 30,000 pages (~240MB)
**Impact of Stale Statistics:**
Scenario: 10-table join query
- Each table has 10% cardinality error
- Compounding error: 1.1^10 = 2.6x total error
- Result: Wrong join order, 10x slower
Scenario: Memory allocation
- Estimate: 1M rows → allocate 100MB
- Actual: 10M rows → need 1GB
- Result: Memory spill to disk, 100x slower
**When to Update Statistics:**