Statistics & Cardinality: Performance
Module: Query Optimization & Performance
**Impact of Estimation Errors:**
**Scenario 1: 10x Underestimate**
Estimate: 10K rows → choose nested loop
Actual: 100K rows
Nested loop cost:
- Estimated: 10K × 100 = 1M operations
- Actual: 100K × 100 = 10M operations
- 10x slower than expected
Should have chosen hash join:
- Cost: 100K + 100K = 200K operations
- 50x faster than nested loop
**Scenario 2: 100x Overestimate**
Estimate: 10M rows → allocate 1GB memory
Actual: 100K rows → only need 10MB
Impact:
- Wasted 990MB memory
- Other queries starved for memory
- May cause memory spills in other queries
**Scenario 3: Compounding Errors in 5-Table Join**
Each table has 2x estimation error:
- Total error: 2^5 = 32x
- Estimated: 100K rows
- Actual: 3.2M rows
- Wrong join order chosen
- Query 100x slower
**When to Investigate:**
- Estimation error >10x: Fix immediately
- Estimation error 2-10x: Investigate if query is slow
- Estimation error <2x: Acceptable
**Cost of Statistics Updates:**
Small table (10K rows):
- ANALYZE time: 5ms
- Update frequency: After every bulk change
Medium table (1M rows):
- ANALYZE time: 500ms
- Update frequency: After 100K+ row changes
Large table (100M rows):
- ANALYZE time: 30 seconds
- Update frequency: Daily or after 10M+ row changes
Very large table (10B rows):