MySQL: JSON Functions: Mistakes
Module: Database-Specific Features
Not creating generated columns and indexes for frequently queried JSON paths
Create generated column: ADD COLUMN col AS (json->>'$.path') STORED, CREATE INDEX ON col
JSON columns cannot be indexed directly. Problem: Sequential scan (slow), 1000ms for 100K rows. Solution: Generated columns + indexes. Example: Query users by theme. Without index: SELECT * FROM users WHERE preferences->>'$.theme' = 'dark' (sequential scan, 1000ms). With generated column + index: ADD COLUMN theme AS (preferences->>'$.theme') STORED, CREATE INDEX idx_theme ON users(theme), SELECT * FROM users WHERE theme = 'dark' (index scan, 10ms, 100x faster). Real-world: Shopify uses generated columns for product brand/category. Lesson: Always use generated columns + indexes for frequently queried paths.
Use generated columns for indexing JSON. STORED for indexes (pre-computed). Create index on generated column. Automatic updates when JSON changes.
Critical
Slow queries (sequential scan), 100x slower than indexed queries
Using TEXT column instead of JSON type for storing JSON data
Use JSON type: CREATE TABLE table (data JSON), validates on INSERT
TEXT column does not validate JSON or provide JSON functions. Problem: Invalid JSON accepted (data corruption), no JSON functions (must parse in application), text storage (slower). Solution: JSON type. Example: TEXT: INSERT INTO table (data) VALUES ('{invalid json}') (success, invalid data). JSON: INSERT INTO table (data) VALUES ('{invalid json}') (ERROR: Invalid JSON text). Benefits: Validation on INSERT, JSON functions (JSON_EXTRACT, JSON_SET), binary storage (faster). Real-world: Slack uses JSON type for user preferences (validation, functions). Lesson: Always use JSON type, not TEXT.
Use JSON type for JSON data. Validates on INSERT. Provides JSON functions. Binary storage (faster than TEXT).
High
No validation (invalid JSON accepted), no JSON functions, text storage (slower)
Deep JSON nesting (3+ levels) for frequently queried data
Flatten JSON structure or use normalized tables for frequently queried nested data
Deep nesting (3+ levels) slows down queries. Problem: Multiple extractions (slow), complex syntax (preferences->'$.a'->'$.b'->'$.c'), hard to index (need multiple generated columns). Example: Deep: {"user": {"profile": {"settings": {"theme": "dark"}}}}. Query: preferences->'$.user'->'$.profile'->'$.settings'->'$.theme' (slow, complex). Flattened: {"user_profile_settings_theme": "dark"}. Query: preferences->>'$.user_profile_settings_theme' (faster, simpler). Alternative: Normalized tables for frequently queried nested data. Real-world: Slack flattens JSON for frequently queried fields. Lesson: Keep nesting shallow (1-2 levels), flatten or normalize for frequently queried data.
Keep JSON nesting shallow (1-2 levels). Flatten structure for frequently queried fields. Use normalized tables for complex nested data.
Medium
Slow queries (multiple extractions), complex syntax, hard to index
Forgetting quotes in JSON_CONTAINS() value parameter
Use quotes: JSON_CONTAINS(col, '"value"', path), not JSON_CONTAINS(col, 'value', path)
JSON_CONTAINS() expects JSON value (with quotes for strings). Problem: JSON_CONTAINS(col, 'value', path) fails (value is not valid JSON). Solution: JSON_CONTAINS(col, '"value"', path) ("value" is valid JSON string). Example: Wrong: JSON_CONTAINS(preferences, 'dark', '$.theme') (fails). Correct: JSON_CONTAINS(preferences, '"dark"', '$.theme') (works). Alternative: Use ->> operator: preferences->>'$.theme' = 'dark' (no quotes needed). Real-world: Common mistake in Shopify product queries. Lesson: Use quotes in JSON_CONTAINS() or use ->> operator.
JSON_CONTAINS() expects JSON value. Use quotes for strings: '"value"'. Or use ->> operator: col->>'$.path' = 'value'.
Medium
Query returns no results or error, JSON_CONTAINS() expects JSON value