Database Migrations and Schema Management - Production Strategies for Zero-Downtime Deployments
Master database migrations with zero-downtime deployment patterns, schema versioning, rollback strategies, data migrations, and production best practices for PostgreSQL, MySQL, and MongoDB.
Database migrations are essential for evolving application schemas safely in production without downtime or data loss. This comprehensive guide covers migration strategies, schema versioning, rollback patterns, data transformations, and production deployment techniques used by companies managing databases with billions of records.
Why Database Migrations Matter
Schema Evolution: Applications evolve, requiring database schema changes - new tables, altered columns, added indexes.
Zero Downtime: Production deployments must maintain availability during schema changes, avoiding service interruptions.
Data Integrity: Migrations must preserve data accuracy and relationships while transforming schemas.
Rollback Safety: Failed deployments need safe rollback mechanisms without data loss.
Stripe processes 100M+ daily transactions while deploying schema changes multiple times per day using careful migration strategies that ensure zero data loss and continuous availability.
Migration Fundamentals
Migration File Structure
// migrations/20260222120000_add_user_preferences.ts
import { Knex } from 'knex';

export async function up(knex: Knex): Promise<void> {
  await knex.schema.createTable('user_preferences', (table) => {
    table.uuid('id').primary().defaultTo(knex.raw('gen_random_uuid()'));
    table.uuid('user_id').notNullable().references('id').inTable('users').onDelete('CASCADE');
    table.string('theme').defaultTo('light');
    table.boolean('email_notifications').defaultTo(true);
    table.jsonb('settings').defaultTo('{}');
    table.timestamps(true, true);
    table.index('user_id');
  });
}

export async function down(knex: Knex): Promise<void> {
  await knex.schema.dropTableIfExists('user_preferences');
}
Prisma Migrations
// prisma/schema.prisma
model User {
  id          String           @id @default(uuid())
  email       String           @unique
  username    String           @unique
  preferences UserPreferences?
  createdAt   DateTime         @default(now())
  updatedAt   DateTime         @updatedAt
}

model UserPreferences {
  id                 String   @id @default(uuid())
  userId             String   @unique
  user               User     @relation(fields: [userId], references: [id], onDelete: Cascade)
  theme              String   @default("light")
  emailNotifications Boolean  @default(true)
  settings           Json     @default("{}")
  createdAt          DateTime @default(now())
  updatedAt          DateTime @updatedAt
}

// Generate migration
// prisma migrate dev --name add_user_preferences
// Apply to production
// prisma migrate deploy
Zero-Downtime Migration Patterns
Expand-Contract Pattern
The safest approach for schema changes - expand schema first, migrate data, then contract old schema.
Phase 1: Expand - Add New Column
-- Migration 1: Add new column (nullable initially)
ALTER TABLE users ADD COLUMN full_name VARCHAR(255);
-- Create index concurrently (doesn't lock table)
CREATE INDEX CONCURRENTLY idx_users_full_name ON users(full_name);
Phase 2: Dual-Write - Update Application
// Application code writes to both old and new columns
async function updateUser(userId: string, firstName: string, lastName: string) {
  await prisma.user.update({
    where: { id: userId },
    data: {
      firstName,                           // Old column
      lastName,                            // Old column
      fullName: `${firstName} ${lastName}` // New column
    }
  });
}
Phase 3: Backfill - Migrate Existing Data
-- Batch update existing records (small batches avoid long-held locks)
DO $$
DECLARE
  batch_size INTEGER := 1000;
BEGIN
  LOOP
    UPDATE users
    SET full_name = CONCAT(first_name, ' ', last_name)
    WHERE id IN (
      SELECT id FROM users
      WHERE full_name IS NULL
      LIMIT batch_size
    );
    EXIT WHEN NOT FOUND;
    -- Brief pause between batches to avoid overwhelming the database
    PERFORM pg_sleep(0.1);
  END LOOP;
END $$;
Phase 4: Contract - Remove Old Columns
-- Migration 2: Make new column NOT NULL (after all data migrated)
ALTER TABLE users ALTER COLUMN full_name SET NOT NULL;
-- Migration 3: Drop old columns (in separate deployment)
ALTER TABLE users DROP COLUMN first_name;
ALTER TABLE users DROP COLUMN last_name;
Adding Columns Safely
-- ❌ Bad: Adding a NOT NULL column without a default fails on a non-empty table
-- (and on older PostgreSQL versions, adding a column with a default rewrote the whole table)
ALTER TABLE users ADD COLUMN status VARCHAR(20) NOT NULL;
-- ✅ Good: Add nullable column first
ALTER TABLE users ADD COLUMN status VARCHAR(20);
-- ✅ Good: Set default for new rows
ALTER TABLE users ALTER COLUMN status SET DEFAULT 'active';
-- ✅ Good: Backfill existing rows (batch this on large tables)
UPDATE users SET status = 'active' WHERE status IS NULL;
-- ✅ Good: Add NOT NULL constraint after backfill
ALTER TABLE users ALTER COLUMN status SET NOT NULL;
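This add-then-backfill sequence repeats in nearly every migration in this guide, so the batching loop is worth factoring out. A minimal sketch of a reusable helper (the function and its parameters are illustrative, not from any particular migration library):

```typescript
// Generic batched backfill: fetch a batch of unmigrated rows, apply an
// update, pause briefly, and repeat until no rows remain.
type BatchFn<T> = () => Promise<T[]>;
type ApplyFn<T> = (batch: T[]) => Promise<void>;

async function batchBackfill<T>(
  fetchBatch: BatchFn<T>,
  applyBatch: ApplyFn<T>,
  pauseMs = 100
): Promise<number> {
  let total = 0;
  while (true) {
    const batch = await fetchBatch();
    if (batch.length === 0) break;
    await applyBatch(batch);
    total += batch.length;
    // Brief pause so the backfill does not saturate the database
    await new Promise(resolve => setTimeout(resolve, pauseMs));
  }
  return total;
}
```

A Knex-based backfill would pass a `fetchBatch` that selects rows with `status IS NULL` and an `applyBatch` that issues the corresponding `UPDATE`.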
Renaming Columns Without Downtime
-- Phase 1: Add new column
ALTER TABLE products ADD COLUMN price_cents INTEGER;
-- Phase 2: Deploy application code that writes to both columns
-- Phase 3: Backfill existing data (batch this on large tables)
UPDATE products SET price_cents = (price * 100)::INTEGER WHERE price_cents IS NULL;
-- Phase 4: Switch application reads to the new column (keep the old column for rollback safety)
-- Phase 5: Drop old column (after confirming the new column works)
ALTER TABLE products DROP COLUMN price;
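Between the backfill and the final drop, application code has to tolerate rows where only one of the two columns is populated. A dual-read sketch for the price example above (the helper name and row shape are assumptions for illustration):

```typescript
// Dual-read: prefer the new price_cents column, fall back to the legacy
// price column for rows the backfill has not reached yet.
interface ProductRow {
  price?: number | null;       // Old column (dollars)
  price_cents?: number | null; // New column (cents)
}

function readPriceCents(row: ProductRow): number {
  if (row.price_cents !== null && row.price_cents !== undefined) {
    return row.price_cents;
  }
  if (row.price !== null && row.price !== undefined) {
    return Math.round(row.price * 100);
  }
  throw new Error('Product row has no price in either column');
}
```

Once monitoring shows no reads hitting the fallback path, the old column is safe to drop.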
Index Management
Creating Indexes Without Locking
-- ❌ Bad: Regular index creation blocks writes to the table
CREATE INDEX idx_users_email ON users(email);
-- ✅ Good: Concurrent index creation (PostgreSQL)
-- Note: CREATE INDEX CONCURRENTLY cannot run inside a transaction block,
-- so disable transaction wrapping for this migration
CREATE INDEX CONCURRENTLY idx_users_email ON users(email);
-- Monitor index creation progress
SELECT
  pid,
  now() - pg_stat_activity.query_start AS duration,
  query
FROM pg_stat_activity
WHERE query LIKE 'CREATE INDEX%';
Dropping Indexes Safely
-- ✅ Safe: Concurrent drop (PostgreSQL; also cannot run inside a transaction)
DROP INDEX CONCURRENTLY idx_old_index;
-- MySQL: Drop index (acquires a metadata lock but is typically quick)
DROP INDEX idx_old_index ON users;
Partial Indexes for Performance
-- Partial indexes cover only the rows matching a predicate, keeping the index
-- small and fast for hot queries. Prisma's schema language does not support
-- partial index predicates, so define them with raw SQL in a migration:
CREATE INDEX CONCURRENTLY idx_pending_orders
ON orders(user_id, created_at)
WHERE status = 'pending';
Data Migrations
Complex Data Transformations
// Migration to split name field
import { Knex } from 'knex';

export async function up(knex: Knex): Promise<void> {
  // Add new columns
  await knex.schema.table('users', (table) => {
    table.string('first_name');
    table.string('last_name');
  });

  // Migrate data in batches
  const batchSize = 1000;
  let migrated = 0;

  while (true) {
    const users = await knex('users')
      .select('id', 'name')
      .whereNotNull('name')
      .whereNull('first_name')
      .limit(batchSize);

    if (users.length === 0) break;

    // Transform and update
    const updates = users.map(user => {
      const [firstName, ...lastNameParts] = user.name.split(' ');
      const lastName = lastNameParts.join(' ');
      return knex('users')
        .where({ id: user.id })
        .update({
          first_name: firstName,
          last_name: lastName || firstName
        });
    });
    await Promise.all(updates);

    migrated += users.length;
    console.log(`Migrated ${migrated} users so far`);

    // Brief pause to avoid overloading the database
    await new Promise(resolve => setTimeout(resolve, 100));
  }

  // Make columns NOT NULL after backfill
  await knex.schema.alterTable('users', (table) => {
    table.string('first_name').notNullable().alter();
    table.string('last_name').notNullable().alter();
  });
}

export async function down(knex: Knex): Promise<void> {
  // Restore name field
  await knex.raw(`UPDATE users SET name = CONCAT(first_name, ' ', last_name) WHERE name IS NULL`);
  await knex.schema.table('users', (table) => {
    table.dropColumn('first_name');
    table.dropColumn('last_name');
  });
}
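Because migrations are hard to re-run safely, it helps to isolate the per-row transform as a pure function and unit-test it before touching production data. A sketch mirroring the split logic in the migration above:

```typescript
// Pure name-splitting transform: first token becomes first_name, the rest
// becomes last_name, and single-word names are duplicated into both columns.
function splitName(name: string): { firstName: string; lastName: string } {
  const [firstName, ...lastNameParts] = name.split(' ');
  const lastName = lastNameParts.join(' ');
  return { firstName, lastName: lastName || firstName };
}
```

Testing this function against edge cases (single-word names, multi-part surnames) is far cheaper than discovering them mid-backfill.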
Handling JSON Transformations
// Migration to restructure JSON field
export async function up(knex: Knex): Promise<void> {
  const batchSize = 500;
  let processed = 0;

  while (true) {
    const records = await knex('products')
      .select('id', 'metadata')
      .whereRaw("metadata->>'version' IS NULL")
      .limit(batchSize);

    if (records.length === 0) break;

    const updates = records.map(record => {
      const oldMetadata = record.metadata;
      const newMetadata = {
        version: '2.0',
        attributes: {
          color: oldMetadata.color,
          size: oldMetadata.size,
          material: oldMetadata.material
        },
        legacy: oldMetadata
      };
      return knex('products')
        .where({ id: record.id })
        .update({ metadata: newMetadata });
    });
    await Promise.all(updates);

    processed += records.length;
    console.log(`Transformed ${processed} product metadata records`);

    await new Promise(resolve => setTimeout(resolve, 50));
  }
}
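As with the name split, the JSON reshaping is a pure transform that can be tested in isolation before the batched migration runs. A sketch of the v1-to-v2 conversion used above (the type names are illustrative):

```typescript
interface LegacyMetadata {
  color?: string;
  size?: string;
  material?: string;
  [key: string]: unknown;
}

// Pure v1 -> v2 metadata transform: lift known attributes into a nested
// object, stamp a version, and preserve the original payload under `legacy`.
function upgradeMetadata(oldMetadata: LegacyMetadata) {
  return {
    version: '2.0',
    attributes: {
      color: oldMetadata.color,
      size: oldMetadata.size,
      material: oldMetadata.material
    },
    legacy: oldMetadata
  };
}
```

Keeping the original payload under `legacy` means a rollback can reconstruct the v1 shape without a backup table.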
Migration Rollback Strategies
Safe Rollback Pattern
// migrations/20260222_add_user_settings.ts
export async function up(knex: Knex): Promise<void> {
  // Forward migration
  await knex.schema.createTable('user_settings', (table) => {
    table.uuid('id').primary();
    table.uuid('user_id').references('users.id');
    table.jsonb('preferences');
    table.timestamps(true, true);
  });
}

export async function down(knex: Knex): Promise<void> {
  // Rollback - but preserve data for recovery
  await knex.schema.renameTable('user_settings', 'user_settings_backup');
  // Log rollback for auditing
  console.log('Rolled back user_settings - table renamed to user_settings_backup');
  // The actual drop can happen later, after confirming the rollback succeeded
  // await knex.schema.dropTable('user_settings_backup');
}
Testing Migrations Locally
# Test migration up
npm run migrate:up

# Verify data
psql -d myapp_dev -c "SELECT COUNT(*) FROM user_preferences;"

# Test rollback
npm run migrate:down

# Verify rollback
psql -d myapp_dev -c "SELECT COUNT(*) FROM user_preferences;"  # Should fail

# Re-apply migration
npm run migrate:up
Production Deployment Workflow
Pre-Deployment Checklist
#!/bin/bash
# pre-deploy-check.sh
echo "🔍 Pre-deployment migration check..."

# 1. Validate migration files
echo "Validating migration syntax..."
npm run migrate:validate

# 2. Check migration status
echo "Checking pending migrations..."
npm run migrate:status

# 3. Estimate migration time (run on a production snapshot)
echo "Testing migration on production snapshot..."
pg_dump production | psql staging
npm run migrate:up --env=staging  # Monitor duration

# 4. Check for breaking changes
echo "Checking for backwards compatibility..."
npm run migrate:check-compatibility

# 5. Verify rollback works
echo "Testing rollback..."
npm run migrate:down --env=staging
npm run migrate:up --env=staging

echo "✅ Pre-deployment checks passed!"
Deployment Sequence
#!/bin/bash
# deploy-with-migrations.sh
set -e  # Exit on error

echo "🚀 Starting deployment..."

# 1. Run migrations
echo "Running database migrations..."
npm run migrate:deploy

# 2. Wait for migration completion
echo "Waiting for migrations to complete..."
npm run migrate:status

# 3. Deploy application code
echo "Deploying application..."
kubectl set image deployment/app app=myapp:v2.0.0

# 4. Wait for rollout
echo "Waiting for rollout..."
kubectl rollout status deployment/app

# 5. Run post-deployment validation
echo "Running validation..."
npm run validate:deployment

echo "✅ Deployment complete!"
Schema Versioning
Tracking Schema Versions
-- schema_migrations table (typically created and managed by the migration tool)
CREATE TABLE schema_migrations (
  version VARCHAR(255) PRIMARY KEY,
  applied_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
  execution_time_ms INTEGER,
  success BOOLEAN DEFAULT TRUE,
  error_message TEXT
);

-- Check current schema version
SELECT version, applied_at
FROM schema_migrations
ORDER BY applied_at DESC
LIMIT 1;
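The bookkeeping a migration runner performs against this table is straightforward: diff the versions on disk against the versions already recorded, and run what is missing in timestamp order. A simplified sketch (the function name is illustrative):

```typescript
// Compute which migration files still need to run, in order, given the
// versions already recorded in schema_migrations. Assumes filenames follow
// the YYYYMMDDHHMMSS_description convention, so the version is the prefix
// before the first underscore.
function pendingMigrations(files: string[], appliedVersions: string[]): string[] {
  const applied = new Set(appliedVersions);
  return files
    .filter(f => !applied.has(f.split('_')[0]))
    .sort();
}
```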
Migration Naming Convention
YYYYMMDDHHMMSS_descriptive_name.{ts,sql}
Examples:
20260222120000_create_users_table.ts
20260222120100_add_email_index_users.ts
20260222120200_migrate_user_preferences.ts
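Generating the timestamp prefix by hand invites typos, so teams usually script it. A small sketch that produces names in this convention (the helper is hypothetical):

```typescript
// Generate a migration filename following YYYYMMDDHHMMSS_descriptive_name.ts,
// using UTC so timestamps sort consistently across developer machines.
function migrationFilename(name: string, date: Date = new Date()): string {
  const pad = (n: number) => String(n).padStart(2, '0');
  const stamp =
    date.getUTCFullYear().toString() +
    pad(date.getUTCMonth() + 1) +
    pad(date.getUTCDate()) +
    pad(date.getUTCHours()) +
    pad(date.getUTCMinutes()) +
    pad(date.getUTCSeconds());
  return `${stamp}_${name}.ts`;
}
```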
MongoDB Schema Migrations
Mongoose Schema Evolution
// models/User.ts
import mongoose from 'mongoose';

const userSchema = new mongoose.Schema({
  email: { type: String, required: true, unique: true },
  profile: {
    firstName: String,
    lastName: String,
    bio: String,
    // New field (optional for backwards compatibility)
    avatar: { type: String, default: null }
  },
  // Schema version for tracking
  schemaVersion: { type: Number, default: 2 }
}, {
  timestamps: true
});

export const User = mongoose.model('User', userSchema);
MongoDB Data Migration Script
// migrations/mongo/20260222_add_avatar_field.ts
import { MongoClient } from 'mongodb';

export async function up(client: MongoClient) {
  const db = client.db();
  const users = db.collection('users');
  let processed = 0;
  const batchSize = 1000;

  // Match the nested field path, not a top-level `avatar` field
  const cursor = users.find({ 'profile.avatar': { $exists: false } });

  while (await cursor.hasNext()) {
    const batch = [];
    for (let i = 0; i < batchSize && await cursor.hasNext(); i++) {
      const user = await cursor.next();
      if (!user) break;
      batch.push({
        updateOne: {
          filter: { _id: user._id },
          update: {
            $set: {
              'profile.avatar': null,
              schemaVersion: 2
            }
          }
        }
      });
    }

    if (batch.length > 0) {
      await users.bulkWrite(batch);
      processed += batch.length;
      console.log(`Processed ${processed} users`);
    }
  }

  console.log(`Migration complete. Updated ${processed} users.`);
}

export async function down(client: MongoClient) {
  const db = client.db();
  const users = db.collection('users');
  await users.updateMany(
    { schemaVersion: 2 },
    {
      $unset: { 'profile.avatar': '' },
      $set: { schemaVersion: 1 }
    }
  );
}
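The batch-assembly step inside up() can also be pulled out as a pure function so the update shape can be verified without a database. A sketch matching the $set used above (the document type is simplified for illustration):

```typescript
interface UserDoc {
  _id: string;
}

// Build the bulkWrite operations for one batch of user documents,
// matching the $set used in the migration above.
function buildAvatarOps(docs: UserDoc[]) {
  return docs.map(doc => ({
    updateOne: {
      filter: { _id: doc._id },
      update: {
        $set: {
          'profile.avatar': null,
          schemaVersion: 2
        }
      }
    }
  }));
}
```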
Real-World Examples
Stripe's Migration Strategy
Stripe deploys schema changes multiple times daily:
- Dual-write pattern: Write to both old and new schemas during transition
- Shadow mode: Test new schema with production data before cutover
- Gradual rollout: Apply migrations to subset of shards first
- Automated rollback: Automatic rollback if error rate exceeds threshold
Their approach enables zero-downtime migrations across billions of payment records.
GitHub's Large Table Migrations
GitHub migrates tables with billions of rows:
- Online schema change tools: gh-ost and pt-online-schema-change
- Chunk-based migrations: Process 1000 rows at a time
- Replica lag monitoring: Pause if replication falls behind
- Percentage-based rollout: Migrate 1%, 10%, 50%, 100%
Average migration time for billion-row table: 8-12 hours with zero downtime.
Netflix's Multi-Region Migrations
Netflix deploys schema changes across regions:
- Region-by-region rollout: Deploy to one AWS region, validate, then next
- Canary analysis: Compare error rates between regions
- Automated testing: Run full test suite against new schema
- Instant rollback: Prepared rollback scripts for each migration
Their strategy ensures global availability during schema evolution.
Conclusion
Successful database migrations require careful planning, backwards-compatible changes, and gradual rollout strategies. Use expand-contract pattern for complex schema changes, create indexes concurrently to avoid locks, and batch data migrations to prevent overwhelming production databases.
Key patterns - dual-write during transitions, backfill data in small batches, test rollbacks before deploying, and monitor migration progress closely - enable safe schema evolution in production. Always maintain backwards compatibility during migrations, allowing applications to run against both old and new schemas simultaneously.
Deploy migrations separately from application code when possible, test thoroughly on production-like environments, and prepare rollback procedures before every deployment. Database migrations are high-risk operations - methodical execution prevents data loss and downtime.
Related Articles
GraphQL API Design - Production Architecture and Best Practices for Scalable Systems
Master GraphQL API design covering schema design principles, resolver optimization, N+1 query prevention with DataLoader, authentication and authorization patterns, caching strategies, error handling, and production deployment for high-performance GraphQL systems.
Testing Strategies - Unit, Integration, and E2E Testing Best Practices for Production Quality
Comprehensive guide to testing strategies covering unit tests, integration tests, end-to-end testing, test-driven development, mocking patterns, testing pyramid, and production testing practices for reliable software delivery.
Monitoring and Observability - Production Systems Performance and Debugging at Scale
Master monitoring and observability covering metrics collection with Prometheus, distributed tracing with OpenTelemetry, log aggregation, alerting strategies, SLOs/SLIs, and production debugging techniques for reliable systems.
Written by StaticBlock Editorial
StaticBlock Editorial is a technical writer and software engineer specializing in web development, performance optimization, and developer tooling.