SQL Practice Logo

SQLPractice Online

MySQL: JSON Functions: Mistakes

Module: Database-Specific Features

Not creating generated columns and indexes for frequently queried JSON paths

Create generated column: ADD COLUMN col AS (json->>'$.path') STORED, CREATE INDEX ON col

JSON columns cannot be indexed directly. Problem: Sequential scan (slow), 1000ms for 100K rows. Solution: Generated columns + indexes. Example: Query users by theme. Without index: SELECT * FROM users WHERE preferences->>'$.theme' = 'dark' (sequential scan, 1000ms). With generated column + index: ADD COLUMN theme AS (preferences->>'$.theme') STORED, CREATE INDEX idx_theme ON users(theme), SELECT * FROM users WHERE theme = 'dark' (index scan, 10ms, 100x faster). Real-world: Shopify uses generated columns for product brand/category. Lesson: Always use generated columns + indexes for frequently queried paths.

Use generated columns for indexing JSON. STORED for indexes (pre-computed). Create index on generated column. Automatic updates when JSON changes.

Critical

Slow queries (sequential scan), 100x slower than indexed queries

Using TEXT column instead of JSON type for storing JSON data

Use JSON type: CREATE TABLE table (data JSON), validates on INSERT

TEXT column does not validate JSON or provide JSON functions. Problem: Invalid JSON accepted (data corruption), no JSON functions (must parse in application), text storage (slower). Solution: JSON type. Example: TEXT: INSERT INTO table (data) VALUES ('{invalid json}') (success, invalid data). JSON: INSERT INTO table (data) VALUES ('{invalid json}') (ERROR: Invalid JSON text). Benefits: Validation on INSERT, JSON functions (JSON_EXTRACT, JSON_SET), binary storage (faster). Real-world: Slack uses JSON type for user preferences (validation, functions). Lesson: Always use JSON type, not TEXT.

Use JSON type for JSON data. Validates on INSERT. Provides JSON functions. Binary storage (faster than TEXT).

High

No validation (invalid JSON accepted), no JSON functions, text storage (slower)

Deep JSON nesting (3+ levels) for frequently queried data

Flatten JSON structure or use normalized tables for frequently queried nested data

Deep nesting (3+ levels) slows down queries. Problem: Multiple extractions (slow), complex syntax (preferences->'$.a'->'$.b'->'$.c'), hard to index (need multiple generated columns). Example: Deep: {"user": {"profile": {"settings": {"theme": "dark"}}}}. Query: preferences->'$.user'->'$.profile'->'$.settings'->'$.theme' (slow, complex). Flattened: {"user_profile_settings_theme": "dark"}. Query: preferences->>'$.user_profile_settings_theme' (faster, simpler). Alternative: Normalized tables for frequently queried nested data. Real-world: Slack flattens JSON for frequently queried fields. Lesson: Keep nesting shallow (1-2 levels), flatten or normalize for frequently queried data.

Keep JSON nesting shallow (1-2 levels). Flatten structure for frequently queried fields. Use normalized tables for complex nested data.

Medium

Slow queries (multiple extractions), complex syntax, hard to index

Forgetting quotes in JSON_CONTAINS() value parameter

Use quotes: JSON_CONTAINS(col, '"value"', path), not JSON_CONTAINS(col, 'value', path)

JSON_CONTAINS() expects JSON value (with quotes for strings). Problem: JSON_CONTAINS(col, 'value', path) fails (value is not valid JSON). Solution: JSON_CONTAINS(col, '"value"', path) ("value" is valid JSON string). Example: Wrong: JSON_CONTAINS(preferences, 'dark', '$.theme') (fails). Correct: JSON_CONTAINS(preferences, '"dark"', '$.theme') (works). Alternative: Use ->> operator: preferences->>'$.theme' = 'dark' (no quotes needed). Real-world: Common mistake in Shopify product queries. Lesson: Use quotes in JSON_CONTAINS() or use ->> operator.

JSON_CONTAINS() expects JSON value. Use quotes for strings: '"value"'. Or use ->> operator: col->>'$.path' = 'value'.

Medium

Query returns no results or error, JSON_CONTAINS() expects JSON value