Monday, May 4, 2026

I Was 4 Hours Late for a Production Release Because of This One NestJS Migration Error on a VPS – How I Fixed It in 30 Minutes

It was 2 AM, my coffee was gone, and the production server on my DigitalOcean VPS started screaming “Migration failed”. I stared at the log for what felt like eternity, watched the clock tick, and realized I’d miss the release window by four hours. The culprit? A single, easily‑overlooked NestJS migration error.

In this post I’ll walk you through exactly what went wrong, how I diagnosed it, and, most importantly, how I rolled back the failed migration and re‑ran it cleanly in under 30 minutes once I knew where to look, so the release could finally ship. If you run NestJS on a VPS, this story could save you hours of panic.

Why This Matters

Production migrations are a make‑or‑break moment for any SaaS or API service. A single typo can bring down a microservice, cause data loss, or—like in my case—delay a release that’s tied to marketing spend, partner SLAs, and customer expectations.

When you’re on a VPS with limited monitoring tools, that error can stay hidden until it’s too late. Knowing how to pinpoint and remediate a NestJS migration error quickly is a must‑have skill for modern Node.js devs.

Step‑by‑Step Tutorial: Fix the Migration in 30 Minutes

  1. Reproduce the error locally

    First, get the exact code that’s running on the VPS: clone the release tag, install dependencies, and run the migration script with the same NODE_ENV=production environment variable.

    git clone -b v1.3.2 https://github.com/yourorg/api.git
    cd api
    npm ci
    NODE_ENV=production npm run migration:run

    Tip: Use npm ci instead of npm install to guarantee identical package versions.

  2. Read the stack trace

    The log showed:

    QueryFailedError: column "user_id" of relation "orders" does not exist
        at QueryRunner. (/node_modules/typeorm/...)

    The error points to a missing column that should have been added in the previous migration.

  3. Check migration order

    In src/migrations I had two files:

    • 1630459200000-CreateUserTable.ts
    • 1630462800000-AddUserIdToOrders.ts

    The timestamps looked correct, but the compiled JavaScript files in dist/migrations were out of sync because the last npm run build failed midway.

  4. Re‑build the migration files

    Run a clean build and verify the output order.

    npm run clean
    npm run build
    ls dist/migrations

    The output now shows the correct chronological order.

  5. Rollback the broken migration on the VPS

    Connect to the VPS over SSH and run the TypeORM CLI’s migration:revert command:

    ssh root@vps.example.com
    cd /var/www/api
    npm ci
    NODE_ENV=production npx typeorm migration:revert

    Warning: migration:revert undoes only the most recently applied migration. Run it once for the migration that failed; don't roll back further unless you're sure.

  6. Run the migration again

    Now that the compiled files are in the right order, re‑run the migration:

    NODE_ENV=production npx typeorm migration:run

    Success! The CLI reports every pending migration as executed successfully.

  7. Restart the NestJS app

    If you’re using PM2, simply:

    pm2 restart api

    The API is back online, and the release pipeline can continue.
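Steps 4 through 7 can be chained so that a failed build can never be followed by a migration run against stale dist/ files. The following is a minimal sketch, not the exact script I used: it assumes the clean and build npm scripts and the PM2 process name api from this post, and the function name deploy is made up for illustration. Newer TypeORM versions may also need a -d argument pointing at your data source file.

```shell
#!/usr/bin/env bash
# Sketch of a fail-fast deploy covering steps 4-7.
# Assumptions: npm scripts "clean" and "build" exist, and the app runs
# under PM2 with the process name "api" (as in this post).

deploy() {
  # Each command runs only if the previous one succeeded, so a build that
  # dies midway stops the chain before any migration touches the database.
  npm ci \
    && npm run clean \
    && npm run build \
    && NODE_ENV=production npx typeorm migration:run \
    && pm2 restart api
}

# Usage (from the project root on the VPS):
#   deploy || echo "deploy aborted, database and running app left untouched" >&2
```

Because the steps are joined with &&, the exit status of deploy is the status of the first step that failed, which makes it easy to wire into a release pipeline.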

Real‑World Use Case: Deploying a Payment Service

Our team was rolling out a new Stripe integration. The migration that added stripe_customer_id to the users table was supposed to run after the users table was created. Because the compiled migration order was wrong, the app attempted to add the column to a non‑existent table, and the VPS threw the same kind of QueryFailedError shown in step 2.

Fixing the order restored the DB schema, and the payment flow worked for all live customers within the same hour.
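A mismatch between src/migrations and dist/migrations like this one can be caught before anything touches the database. Here is a rough sketch of such a check; the function name check_compiled_migrations is hypothetical, and it assumes every .ts migration compiles to a .js file with the same base name in dist/migrations.

```shell
# Returns 0 only when every src migration has an up-to-date compiled twin.
# Assumption: src/migrations/NAME.ts compiles to dist/migrations/NAME.js.
check_compiled_migrations() {
  local src_dir="${1:-src/migrations}"
  local dist_dir="${2:-dist/migrations}"
  local problems=0 ts name

  for ts in "$src_dir"/*.ts; do
    [ -e "$ts" ] || continue            # glob matched nothing: no migrations
    name="$(basename "$ts" .ts)"
    if [ ! -f "$dist_dir/$name.js" ]; then
      echo "missing compiled migration: $name.js" >&2
      problems=1
    elif [ "$dist_dir/$name.js" -ot "$ts" ]; then
      # Compiled file is older than its source, e.g. left over from an
      # earlier build that the latest (failed) build never overwrote.
      echo "stale compiled migration: $name.js" >&2
      problems=1
    fi
  done

  return "$problems"
}

# Usage: check_compiled_migrations && NODE_ENV=production npx typeorm migration:run
```

The check compares file names and modification times only, so it will not notice a compiled file whose contents are wrong for some other reason.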

Results / Outcome

  • Production downtime reduced from 4 hours to 0 after the fix.
  • Release shipped on schedule, preserving a $12,500 marketing spend.
  • Team gained a documented checklist for future migrations (see Bonus Tips).

Bonus Tips

  • Automate migration verification. Add a CI job that runs npm run migration:run against a fresh Postgres container after each PR.
  • Version your migrations. Prefix files with a zero‑padded incremental number (001‑CreateUser, 002‑AddOrder) in addition to timestamps.
  • Never commit compiled dist/ files. Let the server build them on deploy to avoid mismatched timestamps.
  • Use a health‑check endpoint. Return DB_CONNECTED && MIGRATIONS_UP_TO_DATE so alerts fire before users notice.
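
For the MIGRATIONS_UP_TO_DATE half of that health check, one rough heuristic is to compare the number of rows in TypeORM's migrations table with the number of compiled migration files. The sketch below is an assumption-heavy illustration: it relies on TypeORM's default table name (migrations), on psql being able to connect through the standard PG* environment variables, and on a made-up function name.

```shell
# Exit status 0: applied count matches compiled count.
# Exit status 1: counts differ (migrations are probably pending).
# Exit status 2: the database could not be queried at all.
# Assumptions: default TypeORM table name "migrations"; psql connects via
# PGHOST/PGUSER/PGDATABASE/PGPASSWORD or equivalent configuration.
migrations_up_to_date() {
  local dist_dir="${1:-dist/migrations}"
  local applied compiled

  # -t prints tuples only, -A disables alignment, so we get a bare number.
  applied="$(psql -tAc 'SELECT count(*) FROM migrations')" || return 2

  compiled="$(find "$dist_dir" -maxdepth 1 -name '*.js' | wc -l | tr -d '[:space:]')"

  [ "$applied" -eq "$compiled" ]
}

# Usage: migrations_up_to_date /var/www/api/dist/migrations || echo "check migrations" >&2
```

Counting is a coarse signal: it flags pending migrations but cannot tell whether the applied ones are the right ones.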
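
Pro Tip: If you use Docker, add RUN npm run build to your Dockerfile so the image build fails fast on compile errors. RUN executes at image build time, so adding npm run migration:run there only works if a database is reachable during the build; otherwise run the migration from the container's start command.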
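
The CI check from the first bonus tip could look roughly like this. It is a sketch under several assumptions: Docker is available on the CI runner, the app's TypeORM config points at localhost:5432 with user postgres and password postgres when run in CI, and the container name, image tag, and function name are placeholders.

```shell
# Runs the project's migrations against a throwaway Postgres container.
# Assumptions: the app connects to localhost:5432 as postgres/postgres in CI;
# "ci-postgres" and postgres:16 are placeholder choices.
verify_migrations_on_fresh_db() {
  local name="ci-postgres" status=0 tries=0

  docker run -d --rm --name "$name" \
    -e POSTGRES_PASSWORD=postgres -p 5432:5432 postgres:16 >/dev/null || return 1

  # Wait up to about 30 seconds. Checking over TCP (-h 127.0.0.1) avoids a
  # false "ready" from the image's temporary init-time server.
  until docker exec "$name" pg_isready -h 127.0.0.1 -U postgres >/dev/null 2>&1; do
    tries=$((tries + 1))
    if [ "$tries" -ge 30 ]; then
      status=1
      break
    fi
    sleep 1
  done

  if [ "$status" -eq 0 ]; then
    npm run migration:run || status=1
  fi

  # --rm removes the container as soon as it stops.
  docker stop "$name" >/dev/null
  return "$status"
}

# Usage in a CI step: verify_migrations_on_fresh_db
```

Starting from an empty database on every run means the whole migration chain is exercised, so an ordering problem shows up in the PR rather than on the VPS.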

Monetization (Optional)

If you run a SaaS and want to avoid costly release delays, consider subscribing to our 24/7 NestJS monitoring service. We’ll automatically catch migration errors, alert you via Slack, and even roll back the offending script for you.

Written by Jane Doe, Senior Backend Engineer & Cloud Automation Consultant. Follow me on Twitter for more real‑world dev‑ops stories.
