MySQL Features Deep Dive: Interview

Module: Database-Specific Features

Explain the difference between InnoDB and MyISAM storage engines. When would you use each?

InnoDB vs MyISAM difference is transactions and locking. InnoDB: ACID compliance (transactions with COMMIT/ROLLBACK), foreign keys (referential integrity), row-level locking (better concurrency), crash recovery (automatic). MyISAM: No transactions (cannot rollback), no foreign keys, table-level locking (poor concurrency), faster reads (no transaction overhead). Use InnoDB when: (1) Transactional data (orders, payments, users), (2) Need foreign keys (referential integrity), (3) High concurrency (multiple writes), (4) Need crash recovery. Use MyISAM when: (1) Read-only data (logs, archives), (2) No transactions needed, (3) Single-threaded writes, (4) Legacy systems (rare). Performance: InnoDB: Row-level locking (multiple writes simultaneously). MyISAM: Table-level locking (one write at a time). Real-world: WordPress uses InnoDB for all tables (posts, comments, users). MyISAM is legacy (deprecated). Lesson: Always use InnoDB (default since MySQL 5.5).

How does MySQL replication work? What are the benefits and trade-offs?

MySQL replication copies data from master to slaves for read scaling. How it works: (1) Master writes to binary log (binlog), (2) Slave reads binlog (I/O thread), (3) Slave applies changes (SQL thread), (4) Asynchronous (slave may lag). Replication types: Asynchronous (default, fast, eventual consistency), Semi-synchronous (wait for 1 slave, slower, better consistency), GTID (Global Transaction ID, easier failover). Benefits: (1) Read scaling (1 master, N slaves), (2) High availability (failover to slave), (3) Backup (backup from slave, no impact on master), (4) Geographic distribution (slaves in different regions). Trade-offs: (1) Replication lag (slave behind master, eventual consistency), (2) Write scaling limited (single master), (3) Complexity (setup, monitoring, failover). Real-world: Facebook uses 1 master + 100+ slaves for read scaling. YouTube uses replication for billions of queries/day. Lesson: Use replication for read-heavy workloads (90% reads).

When would you use JSON columns vs normalized tables? What are the trade-offs?

JSON vs normalized tables decision depends on schema flexibility needs. Use JSON when: (1) Flexible schema (attributes vary by category), (2) Rapid development (no ALTER TABLE), (3) Semi-structured data (API responses), (4) Read-heavy workloads. Use normalized tables when: (1) Fixed schema (well-defined structure), (2) Complex queries (JOINs, aggregations), (3) Write-heavy workloads, (4) Referential integrity needed. Trade-offs: JSON: Flexible (no ALTER TABLE), but limited queries (no JOIN on nested data), requires generated columns for indexing. Normalized: Fixed schema (requires ALTER TABLE), but flexible queries (JOIN, aggregate), direct indexing. Performance: JSON: Fast reads (no JOIN), slow writes (update entire JSON). Normalized: Slower reads (JOIN), fast writes (update single column). Real-world: Shopify uses JSON for product attributes (flexible schema). Uses normalized tables for orders (fixed schema, complex queries). Lesson: JSON for flexible schemas, normalized for fixed schemas.

Design a product catalog schema using JSON columns for flexible attributes. Include generated columns and indexes for fast queries.

-- Design product catalog with JSON attributes

CREATE TABLE products (

product_id INT AUTO_INCREMENT PRIMARY KEY,

name VARCHAR(200) NOT NULL,

category VARCHAR(50) NOT NULL,

price DECIMAL(10, 2) NOT NULL,

-- JSON column for flexible attributes

attributes JSON,

created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,

updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,

-- Indexes

INDEX idx_category (category),

INDEX idx_price (price)

) ENGINE=InnoDB;

-- Add generated columns for frequently queried JSON fields

ALTER TABLE products

ADD COLUMN brand VARCHAR(50) AS (attributes->>'$.brand') STORED,

ADD COLUMN ram VARCHAR(20) AS (attributes->>'$.ram') STORED;

-- Create indexes on generated columns

CREATE INDEX idx_products_brand ON products(brand);

CREATE INDEX idx_products_ram ON products(ram);

-- Insert laptop

INSERT INTO products (name, category, price, attributes) VALUES

(

'MacBook Pro 16"',

'Laptops',

2499.00,

JSON_OBJECT(

'brand', 'Apple',

'ram', '16GB',

'storage', '512GB SSD',

'processor', 'M3 Pro',

'screen_size', '16 inch',