ANSI SQL Standards & Database Differences: Interview
Module: Foundational Concepts
What is the SQL standard and what is the practical portability baseline today?
The SQL standard (ISO/IEC 9075) is the international specification for how SQL must behave. It has evolved through major versions: SQL-86, SQL-92, SQL:1999, SQL:2003, SQL:2008, SQL:2011, SQL:2016, SQL:2023.
The practical portability baseline is SQL-92. Every major production database (PostgreSQL, MySQL, SQL Server, Oracle, SQLite, Snowflake, BigQuery) supports SQL-92 core. This includes: SELECT/JOIN/WHERE/GROUP BY/HAVING, CASE expressions, CAST(), COALESCE(), NULLIF(), CURRENT_DATE, CURRENT_TIMESTAMP, UNION/INTERSECT/EXCEPT, standard string functions (UPPER, LOWER, SUBSTRING, TRIM).
SQL:2003 features (window functions, GENERATED AS IDENTITY) have broad but not universal support — MySQL 5.7 being the most notable holdout for window functions.
In practice: write SQL-92 for 80% of your queries and isolate vendor-specific syntax in a thin data access layer.
Name three SQL features that look the same across databases but behave differently.
1. String concatenation with ||: PostgreSQL and Oracle treat || as the ANSI concatenation operator. MySQL treats || as logical OR by default and returns 0/1. SQL Server doesn't support || at all (uses +). Looks like it works in dev (PostgreSQL) and silently breaks in prod (MySQL).
2. REPLACE INTO in MySQL: Looks like an upsert but is actually DELETE + INSERT. It fires DELETE triggers, cascades to foreign key children, and advances auto-increment sequences. True upserts use ON DUPLICATE KEY UPDATE (MySQL) or ON CONFLICT (PostgreSQL).
3. BOOLEAN type: PostgreSQL has a real BOOLEAN type with TRUE/FALSE literals. SQL Server has BIT with only 1/0 — TRUE/FALSE are syntax errors. MySQL accepts TRUE/FALSE but stores them as TINYINT(1). Code like WHERE is_active = TRUE works on PostgreSQL and MySQL but crashes on SQL Server.
How would you design a data access layer for a SaaS product that must support both PostgreSQL and MySQL?
Three-layer approach:
1. SQL-92 core layer (shared, ~80% of queries): All SELECT queries, JOINs, aggregations, CASE expressions, COALESCE, NULLIF, CURRENT_TIMESTAMP, window functions (assuming MySQL 8.0+). These run unchanged on both databases.
2. Thin adapter layer (per-database, ~20% of queries): Encapsulate the divergence points in named functions:
- paginate(query, limit, offset) → LIMIT n OFFSET m (MySQL) vs FETCH FIRST n ROWS ONLY (PostgreSQL)
- upsert(table, data, conflict_column) → ON DUPLICATE KEY UPDATE vs ON CONFLICT DO UPDATE
- concat(col1, col2) → already unified via CONCAT()
- current_date() → CURRENT_DATE works on both
3. Schema migration tool (Flyway, Liquibase, Alembic): Maintains database-specific DDL files for CREATE TABLE statements that use vendor syntax (AUTO_INCREMENT vs GENERATED AS IDENTITY).
Testing: Run your full test suite against both databases in CI. Schema differences often surface only at migration time.
What is the difference between REPLACE INTO and a true upsert in MySQL?
REPLACE INTO is NOT an upsert. It is a DELETE followed by an INSERT. The critical behavioral differences:
1. Triggers: Both DELETE and INSERT triggers fire. If you have audit log triggers, you get a spurious delete event in your audit trail.
2. Foreign keys: If any child table has a CASCADE DELETE foreign key to your table, REPLACE INTO will silently delete all child rows before reinserting the parent. This is data loss.
3. Auto-increment: The sequence advances with every REPLACE even when the row already exists. After 1,000 REPLACE INTO calls on 1 row, your auto-increment counter is at 1,001.
4. Race conditions: Two concurrent REPLACE INTO calls can cause deadlocks that a true upsert (ON DUPLICATE KEY UPDATE) handles gracefully.
True upsert in MySQL is INSERT ... ON DUPLICATE KEY UPDATE. It performs an atomic check-and-update at the row level without deleting. In PostgreSQL, INSERT ... ON CONFLICT DO UPDATE is even cleaner. Both are safe for concurrent access.