MySQL Features Deep Dive: Mistakes

Module: Database-Specific Features

Using MyISAM storage engine for transactional data (orders, payments)

Use InnoDB: ENGINE=InnoDB (default), supports transactions, foreign keys, row-level locking

MyISAM does not support transactions or foreign keys. Problem: Cannot rollback failed transactions (data inconsistency). Cannot enforce referential integrity (orphaned records). Table-level locking (poor concurrency). Example: Order processing. MyISAM: INSERT order, INSERT order_items fails, order exists without items (inconsistent). InnoDB: START TRANSACTION, INSERT order, INSERT order_items, COMMIT (both succeed) or ROLLBACK (both fail, consistent). Performance: MyISAM: Table-level locking (one write at a time). InnoDB: Row-level locking (multiple writes simultaneously). Real-world: WordPress migrated from MyISAM to InnoDB (better consistency, concurrency). Lesson: Always use InnoDB for transactional data.

Always use InnoDB (default since MySQL 5.5). Only use MyISAM for read-only data (logs, archives). Check engine: SHOW TABLE STATUS.

Critical

No transactions (cannot rollback), no foreign keys (no referential integrity), table-level locking (poor concurrency)

Not configuring InnoDB buffer pool size (default 128MB)

Set buffer pool to 70-80% of RAM: innodb_buffer_pool_size = 8G in my.cnf

InnoDB buffer pool caches data and indexes in memory. Default is 128MB (too small). Problem: Frequent disk I/O (slow), poor query performance. Solution: Set to 70-80% of RAM. Example: Server with 16GB RAM. Default: 128MB buffer pool (0.8% of RAM), frequent disk I/O. Optimized: 12GB buffer pool (75% of RAM), most data in memory (fast). Performance: Default: 1000ms query time (disk I/O). Optimized: 10ms query time (memory, 100x faster). Configuration: innodb_buffer_pool_size = 12G, innodb_buffer_pool_instances = 8 (multiple instances for concurrency). Real-world: Facebook sets buffer pool to 70% of RAM (billions of queries/day). Lesson: Always configure buffer pool.

Set buffer pool to 70-80% of RAM. Check usage: SHOW STATUS LIKE 'Innodb_buffer_pool%'. Monitor hit ratio (should be > 99%).

High

Poor performance (frequent disk I/O), slow queries, high CPU usage

Not using replication for read-heavy workloads (90% reads, 10% writes)

Use master-slave replication: 1 master for writes, N slaves for reads, load balance reads

Read-heavy workloads (90% reads) overload single database. Problem: All queries to one database (overloaded), slow queries, poor UX. Solution: Master-slave replication. Master handles writes (10%), slaves handle reads (90%). Example: Web app with 1000 queries/sec (900 reads, 100 writes). Single database: 1000 queries/sec (overloaded). Master-slave: Master 100 writes/sec, Slave 1 450 reads/sec, Slave 2 450 reads/sec (load balanced, not overloaded). Performance: Single: 500ms query time (overloaded). Master-slave: 10ms query time (load balanced, 50x faster). Real-world: Facebook uses 1 master + 100+ slaves for read scaling. Lesson: Use replication for read-heavy workloads.

Use replication for read-heavy workloads. Route writes to master, reads to slaves. Use GTID for easy failover. Monitor lag: SHOW SLAVE STATUS.

High

Single database overloaded, slow queries, poor user experience

Not indexing JSON columns (querying JSON without generated columns)

Use generated columns + indexes: ADD COLUMN brand AS (attributes->>'$.brand') STORED, CREATE INDEX

JSON columns cannot be indexed directly. Problem: Sequential scan (slow), poor performance. Solution: Generated columns + indexes. Example: Query products by brand. Without index: SELECT * FROM products WHERE attributes->>'$.brand' = 'Apple' (sequential scan, 1000ms for 100K rows). With generated column + index: ADD COLUMN brand AS (attributes->>'$.brand') STORED, CREATE INDEX idx_brand ON products(brand), SELECT * FROM products WHERE brand = 'Apple' (index scan, 10ms, 100x faster). Real-world: Shopify uses generated columns for indexing JSON product attributes. Lesson: Always use generated columns + indexes for JSON queries.

Use generated columns for indexing JSON. STORED for indexes (pre-computed), VIRTUAL for computed values. Create index on generated column.

High

Slow queries (sequential scan), poor performance on large tables

Over-indexing tables (creating indexes on every column)

Index only frequently queried columns: Analyze queries with EXPLAIN, create indexes on WHERE/JOIN columns

Too many indexes slow down writes. Problem: Every INSERT/UPDATE must update all indexes (slow), wasted storage. Solution: Index only frequently queried columns. Example: Table with 10 columns, 10 indexes. INSERT: Must update 10 indexes (slow, 100ms). Optimized: 3 indexes on frequently queried columns. INSERT: Update 3 indexes (fast, 10ms, 10x faster). Rule: Index columns used in WHERE, JOIN, ORDER BY. Avoid indexing rarely queried columns. Real-world: Twitter indexes only frequently queried columns (user_id, created_at), not all columns. Lesson: Index strategically, not excessively.

Index only frequently queried columns. Use EXPLAIN to analyze queries. Monitor index usage: SHOW INDEX FROM table. Remove unused indexes.

Medium

Slow writes (must update all indexes), wasted storage, poor performance