Execution Plans & EXPLAIN: Performance
Module: Query Optimization & Performance
**Advanced Performance Analysis Techniques:**
1. **Cost Model Tuning for Your Hardware**
- Default costs assume spinning disks
- For SSDs: Set random_page_cost = 1.1 (vs default 4.0)
- For NVMe: Set random_page_cost = 1.0
- Impact: Can change optimizer from table scan to index scan
2. **Parallel Execution Optimization**
- Minimum table size for parallelism: Usually 8MB
- Worker overhead: ~10ms per worker startup
- Sweet spot: 2-4 workers for most queries
- Diminishing returns: >8 workers rarely helps
- Memory per worker: work_mem * num_workers
3. **Statistics Quality Assessment**
- Check n_distinct accuracy: SELECT n_distinct FROM pg_stats
- Histogram buckets: Default 100, increase for skewed data
- Correlation: -1 to 1, affects index vs scan choice
- Multi-column statistics: For correlated columns
4. **Plan Stability and Regression Detection**
- Monitor plan changes in production
- Alert on sudden cost increases (>2x)
- Track query fingerprints and plan hashes
- Use Query Store (SQL Server) or pg_stat_statements
5. **Memory Pressure Indicators**
- PostgreSQL: "Disk" in buffers output
- MySQL: "Using temporary" with large row counts
- SQL Server: "Sort Warnings" or "Hash Warnings"
- Fix: Increase work_mem, sort_buffer_size, or add indexes
6. **Join Order Optimization**
- Optimizer tries limited permutations (usually 12 tables max)
- For >12 tables, may not find optimal order
- Use STRAIGHT_JOIN (MySQL) or join_collapse_limit (PostgreSQL)
- Manually order: smallest filtered table first
7. **Subquery Materialization**
- Optimizer may materialize subqueries
- Look for "Materialize" or "Subquery Scan" nodes
- Cost: One-time materialization + scan cost
- Alternative: Rewrite as JOIN or CTE
8. **Index-Only Scans**
- Best performance: All columns in index
- PostgreSQL: Requires VACUUM for visibility map