Database Migration Strategies: Real-World
Module: Database-Specific Features
Database migrations are common and critical: (1) Technology upgrade: MySQL 5.7 → 8.0 for better performance (window functions, CTEs). (2) Database switch: MySQL → PostgreSQL for advanced features (MVCC, arrays, JSONB, full-text search). (3) Cloud migration: On-premise Oracle → AWS Aurora PostgreSQL for cost savings ($100K/year → $10K/year). (4) Vendor escape: Oracle → PostgreSQL to avoid licensing costs. Real examples: GitLab migrated from MySQL to PostgreSQL (2017) - 6 months, 100K+ installations, 2-3x performance improvement. Uber migrated ride matching from MySQL to PostgreSQL - MVCC enabled non-blocking reads (50K writes/sec). Stack Overflow upgraded SQL Server with zero downtime - blue-green deployment. Stripe migrated from MongoDB to PostgreSQL for ACID guarantees. Trade-offs: Risk (data loss, downtime), cost (engineering time, tools), complexity (schema conversion, testing).
GitLab: MySQL to PostgreSQL Migration - 6 Months, 100K+ Installations
GitLab migrated from MySQL to PostgreSQL (2017) to leverage advanced features (MVCC, arrays, JSONB, full-text search). Challenge: 100K+ self-hosted installations, can't force immediate upgrade, must support both databases during transition. Strategy: Parallel run with dual-write for 3 months, gradual cutover, extensive testing. Result: 2-3x performance improvement, better concurrency, advanced features enabled.
GitLab migration strategy: (1) Assessment (3 months): Analyze schema compatibility, identify 100+ incompatibilities (AUTO_INCREMENT, ENUM, stored procedures), estimate effort. (2) Schema conversion (2 months): Use pgloader for automated conversion, manual fixes for edge cases, create migration scripts. (3) Dual-write implementation (2 months): Application writes to both MySQL and PostgreSQL, reads from MySQL, validate PostgreSQL data matches. (4) Testing (2 months): Test on production-like data (100GB dump), performance testing, load testing, data validation. (5) Gradual cutover (3 months): Switch reads gradually (1% → 10% → 50% → 100%), monitor for issues, rollback if needed. (6) Deprecation (ongoing): Recommend PostgreSQL for new installations, support MySQL for legacy customers, plan to drop MySQL support eventually. Total time: 6 months planning + 3 months execution = 9 months.
-- GitLab migration example (simplified)
-- Phase 1: Schema conversion
-- MySQL schema
CREATE TABLE issues (
id INT AUTO_INCREMENT PRIMARY KEY,
project_id INT NOT NULL,
title VARCHAR(255) NOT NULL,
description TEXT,
state ENUM('opened', 'closed') DEFAULT 'opened',
created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
updated_at DATETIME DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
INDEX idx_project (project_id),
INDEX idx_state (state)
) ENGINE=InnoDB;
-- PostgreSQL schema (converted)
CREATE TABLE issues (
id SERIAL PRIMARY KEY,
project_id INTEGER NOT NULL,
title VARCHAR(255) NOT NULL,
description TEXT,
state VARCHAR(20) DEFAULT 'opened',
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT chk_state CHECK (state IN ('opened', 'closed'))
);
CREATE INDEX idx_project ON issues(project_id);
CREATE INDEX idx_state ON issues(state);
-- Trigger for updated_at (replaces ON UPDATE CURRENT_TIMESTAMP)
CREATE TRIGGER update_issues_updated_at
BEFORE UPDATE ON issues
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
-- Phase 2: Dual-write implementation (Ruby on Rails)
# Application code (simplified)
class Issue < ApplicationRecord
# Write to both databases
after_create :write_to_postgresql