🚀 Executive Summary

TL;DR: Migrating a 300GB PostgreSQL database from Heroku to AWS with minimal downtime is challenging due to data volume, downtime constraints, and platform limitations. This guide outlines three strategies: a full dump/restore (high downtime), logical replication (minimal downtime, the preferred approach), and a ‘clever middle ground’ using external tools like Bucardo when native replication isn’t feasible.

🎯 Key Takeaways

  • Migrating large Heroku PostgreSQL databases to AWS is complex due to data volume, acceptable downtime, and Heroku’s platform limitations.
  • The ‘Grit Your Teeth’ Full Dump & Restore method is simple but causes massive downtime, potentially 6-12+ hours for a 300GB database.
  • Logical Replication is the ‘Gold Standard’ for minimal downtime (minutes) but requires a Heroku Postgres plan with sufficient privileges to create publications.
  • External tools like Bucardo provide a ‘Clever Middle Ground’ for low downtime when native logical replication is blocked, though they introduce management complexity and performance overhead from triggers.
  • Regardless of the chosen method, always perform a dry run in a staging environment to validate the migration process and estimate downtime.

Migrating a 300GB PostgreSQL database from Heroku to AWS with minimal downtime

Struggling to migrate a large Heroku PostgreSQL database to AWS with minimal downtime? Learn three battle-tested strategies, from the quick-and-dirty dump/restore to the robust logical replication method, directly from a senior DevOps lead.

Heroku to AWS: A Senior Engineer’s Guide to Migrating a 300GB PostgreSQL DB Without Losing Your Mind

I still get a cold sweat thinking about it. It was 2 AM, the migration window was supposed to be two hours, and the `pg_restore` command was staring back at me, its progress bar barely inching forward. We were migrating a ‘mere’ 100GB database, and what was benchmarked as a 90-minute job was now on its fourth hour. The VP of Engineering was pinging me on Slack every five minutes. That night, I learned a hard lesson: for a database of any significant size, “just dump and restore” is a recipe for a career-limiting event. Migrating a large, active database isn’t just a technical problem; it’s a business continuity problem.

The “Why”: What Makes This So Hard?

Let’s be real. The core issue isn’t getting the data from point A to point B. The challenge is doing it while the application is still running, without losing data, and without making your users (or your boss) angry. With a 300GB database, the sheer volume of data makes the simplest approach—a full backup and restore—untenable due to the hours of downtime it would cause. Heroku’s platform, while fantastic for getting started, also abstracts away some of the low-level controls you need for more advanced replication strategies, which adds another layer of complexity. You’re fighting three things: data volume, downtime, and platform limitations.

So, when a junior engineer on my team recently came to me with this exact Reddit thread, I walked him through the options we’ve used in the trenches. Here are the three main paths you can take, from the easiest (and scariest) to the most robust.

Solution 1: The “Grit Your Teeth” Full Dump & Restore

This is the classic approach. It’s simple, requires no fancy tools, and is conceptually easy to understand. It is also the one most likely to get you in trouble.

The Plan:

  1. Put your application into maintenance mode. (Downtime begins).
  2. Use Heroku’s CLI to trigger a fresh, consistent backup.
  3. Download that backup file.
  4. Use `pg_restore` to load the data into your shiny new AWS RDS instance.
  5. Update your application’s `DATABASE_URL` to point to the new RDS instance.
  6. Take the application out of maintenance mode. (Downtime ends).

The commands look something like this:

# 1. Put app in maintenance mode
heroku maintenance:on --app your-heroku-app

# 2. Capture a fresh backup on Heroku
heroku pg:backups:capture --app your-heroku-app

# 3. Download the backup file
heroku pg:backups:download --app your-heroku-app

# 4. Restore to your AWS RDS instance (this is the long part)
#    Tip: with a custom-format dump, adding --jobs=4 restores tables in parallel
pg_restore --verbose --clean --no-acl --no-owner -h your-rds-endpoint.amazonaws.com -U your-db-user -d your-db-name latest.dump

# 5. Point your app to the new DB
heroku config:set DATABASE_URL="postgres://user:pass@your-rds-endpoint:5432/your-db-name"

# 6. Turn off maintenance mode
heroku maintenance:off --app your-heroku-app
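Before flipping `DATABASE_URL`, it’s worth a quick sanity check that the restore actually completed. A minimal sketch, with placeholder connection strings and table names you’d swap for your own:

```shell
# Compare row counts for a few critical tables on both databases.
# HEROKU_URL, RDS_URL, and the table names below are placeholders.
HEROKU_URL="postgres://user:pass@heroku-db-host:5432/dbname"
RDS_URL="postgres://user:pass@your-rds-endpoint.amazonaws.com:5432/your-db-name"

for table in users orders payments; do
  src=$(psql "$HEROKU_URL" -t -A -c "SELECT count(*) FROM ${table};")
  dst=$(psql "$RDS_URL"    -t -A -c "SELECT count(*) FROM ${table};")
  echo "${table}: heroku=${src} rds=${dst}"
done
```

Matching counts aren’t proof of a perfect restore, but a mismatch is a cheap, fast way to catch a silently truncated table before you go live.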
Pros:
  • Conceptually simple.
  • Requires minimal setup.
  • Guarantees data consistency.

Cons:
  • MASSIVE DOWNTIME. For 300GB, this could be 6-12+ hours.
  • Any error during the restore process extends the downtime significantly.
  • Not a viable option for most production applications.
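That 6-12+ hour figure isn’t arbitrary. Here’s a back-of-envelope sketch, assuming a sustained restore throughput of roughly 12 MB/s — that number is an assumption, not a benchmark; index builds and constraint validation slow the tail end considerably, so measure your own with a trial restore:

```shell
# Rough estimate of the pg_restore step alone.
# 12 MB/s sustained throughput is an ASSUMED figure for illustration.
DB_SIZE_GB=300
THROUGHPUT_MB_S=12

TOTAL_MB=$(( DB_SIZE_GB * 1024 ))
EST_SECONDS=$(( TOTAL_MB / THROUGHPUT_MB_S ))
echo "Estimated restore time: ~$(( EST_SECONDS / 3600 )) hours"
# → Estimated restore time: ~7 hours
```

And that’s before the backup capture and download time on the Heroku side, which also count against your maintenance window.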

Darian’s Take: I only recommend this for non-critical internal apps, staging environments, or if you can genuinely afford a full day of downtime. For anything customer-facing, avoid this method like the plague.

Solution 2: The “Gold Standard” Logical Replication

This is how you do it like a pro. The idea is to turn your new AWS RDS database into a live “follower” of your Heroku database. Data streams from the source to the destination in near real-time. Once they are in sync, you perform a quick cutover.

The Plan:

  1. Provision your AWS RDS for PostgreSQL instance. Make sure to enable logical replication by setting the `rds.logical_replication` parameter to `1` in its parameter group.
  2. On your Heroku database, create a `PUBLICATION`. This tells Postgres which tables you want to replicate. You’ll typically want to replicate all of them.
  3. On your AWS RDS instance, create a `SUBSCRIPTION` that points to the Heroku database credentials. This starts the initial data copy and then begins streaming live changes.
  4. Wait for the initial data sync to complete and for the replication lag to drop to near-zero.
  5. Perform the final cutover: put the app in maintenance mode, wait for any final transactions to replicate, stop the old database writes, update the `DATABASE_URL`, and turn maintenance mode off. The actual downtime is just minutes.
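Step 1 above can be scripted with the AWS CLI. A sketch assuming a parameter group named `my-pg-params` and an instance named `my-rds-instance` (both placeholders); note that `rds.logical_replication` is a static parameter, so it only takes effect after a reboot:

```shell
# Enable logical replication on the RDS parameter group (static parameter).
aws rds modify-db-parameter-group \
  --db-parameter-group-name my-pg-params \
  --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"

# Reboot so the static parameter takes effect, then verify from psql:
aws rds reboot-db-instance --db-instance-identifier my-rds-instance
# psql> SHOW wal_level;   -- should report 'logical' after the reboot
```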

The key SQL commands look like this:

-- ON HEROKU (Source DB)
-- You may need to create a dedicated replication user first
CREATE PUBLICATION my_app_publication FOR ALL TABLES;

-- ON AWS RDS (Target DB)
-- This assumes you have already loaded the initial schema
CREATE SUBSCRIPTION my_app_subscription
    CONNECTION 'postgres://user:pass@heroku-db-host:5432/dbname?sslmode=require'
    PUBLICATION my_app_publication;
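Two practical details around the snippet above: the subscription needs the schema to already exist on RDS, and during step 4 you’ll want a way to watch replication lag. A sketch with placeholder connection strings:

```shell
# Load the schema only (no data) onto RDS before creating the subscription.
pg_dump --schema-only --no-acl --no-owner "$HEROKU_URL" | psql "$RDS_URL"

# Watch lag from the source side: bytes of WAL the subscriber hasn't replayed.
psql "$HEROKU_URL" -c \
  "SELECT application_name,
          pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn) AS lag_bytes
   FROM pg_stat_replication;"
```

When `lag_bytes` sits near zero under normal write load, you’re ready to schedule the cutover window.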
Pros:
  • Minimal downtime (minutes for the final cutover).
  • Safe and reliable, as it’s a native PostgreSQL feature.
  • Allows for thorough testing of the new DB before the cutover.

Cons:
  • Requires a Heroku Postgres plan that grants you sufficient privileges to create a publication (typically Premium tier or higher). This is the biggest blocker.
  • More complex to configure than a simple dump/restore.
  • You must ensure network connectivity between AWS and Heroku.

Warning: Before you even think about this path, check your Heroku plan. If you don’t have the permissions to create a replication user and a publication, this method is a non-starter. Don’t promise your boss “zero downtime” until you’ve confirmed you can run `CREATE PUBLICATION`.
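A quick way to find out before promising anything: try creating and immediately dropping a throwaway publication against your Heroku database. If the `CREATE` fails with a permission error, this method is off the table on your current plan. (The connection string is a placeholder.)

```shell
# Harmless probe: create and drop a throwaway publication.
# CREATE PUBLICATION ... FOR ALL TABLES requires elevated privileges,
# which is exactly what this probe tests.
psql "$HEROKU_DATABASE_URL" <<'SQL'
CREATE PUBLICATION _migration_probe FOR ALL TABLES;
DROP PUBLICATION _migration_probe;
SQL
```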

Solution 3: The “Clever Middle Ground” with an External Tool (like Bucardo)

What if you can’t use logical replication but still can’t afford the downtime of a full dump? This is where you get creative. Tools like Bucardo or AWS DMS (Database Migration Service) can sometimes bridge the gap. Bucardo, for example, is a trigger-based replication system.

The Plan:

  1. Set up an EC2 instance that can connect to both your Heroku DB and your new AWS RDS instance. This instance will run the Bucardo daemon.
  2. Install and configure Bucardo on the EC2 instance. This involves defining your source (“herd”) and target databases.
  3. Bucardo adds triggers to your source tables on Heroku to track changes (`INSERT`, `UPDATE`, `DELETE`).
  4. You perform an initial data copy.
  5. Bucardo then replays the changes captured by the triggers onto the target AWS RDS database.
  6. Once the databases are in sync, you perform a quick cutover similar to the logical replication method.
# Example of a bucardo command on the EC2 instance
# This is a conceptual example; the real setup involves more steps

bucardo add sync my_heroku_sync \
    dbs=heroku_source,rds_target \
    tables=all \
    type=pushdelta

bucardo start
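Before cutting over, confirm Bucardo has actually caught up. `bucardo status` reports per-sync state, and pairing it with a row-count spot check (as in Solution 1) is a cheap extra safeguard:

```shell
# Overall status, then detail for the specific sync.
# A recent "Last good" timestamp indicates the sync is keeping up.
bucardo status
bucardo status my_heroku_sync
```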
Pros:
  • Achieves low downtime without needing superuser privileges on the source.
  • A good fallback when native replication isn’t an option.
  • More flexible than built-in tools.

Cons:
  • Adds complexity: you now have another server (the EC2 instance) to manage.
  • Trigger-based replication adds performance overhead to your source Heroku database. You MUST test this.
  • Can be “fiddly” to set up and monitor. It’s not a fire-and-forget solution.

Darian’s Take: This is my “break glass in case of emergency” option. It’s a bit hacky, and the performance overhead from the triggers is real. But I’ve used it successfully to get a client out of a tight spot when their Heroku plan was too restrictive for logical replication. It saved the day, but we decommissioned the Bucardo server the moment the migration was complete.

Final Thoughts

There’s no single “best” way; there’s only the best way for your specific situation. Your choice depends on your tolerance for downtime, your budget for Heroku plans, and your team’s comfort level with the complexity. My advice? Always push for Solution 2 (Logical Replication) if you can. If not, carefully weigh the business cost of a long downtime (Solution 1) against the engineering cost of a more complex setup (Solution 3). And for the love of all that is holy, do a dry run in a staging environment first.

Darian Vance

Lead Cloud Architect & DevOps Strategist

With over 12 years in system architecture and automation, Darian specializes in simplifying complex cloud infrastructures. An advocate for open-source solutions, he founded TechResolve to provide engineers with actionable, battle-tested troubleshooting guides and robust software alternatives.


🤖 Frequently Asked Questions

❓ What are the primary strategies for migrating a large PostgreSQL database from Heroku to AWS with minimal downtime?

The three main strategies are: a full dump and restore (high downtime), logical replication (minimal downtime, preferred), and using external trigger-based tools like Bucardo (low downtime, higher complexity).

❓ How do the Heroku to AWS PostgreSQL migration strategies compare regarding downtime and complexity?

Full dump/restore is conceptually simple but causes massive downtime. Logical replication offers minimal downtime but is more complex to configure and requires specific Heroku plan permissions. External tools like Bucardo provide low downtime but add server management complexity and potential performance overhead on the source database.

❓ What is a critical prerequisite to consider before attempting logical replication for a Heroku to AWS PostgreSQL migration?

Before attempting logical replication, you must verify that your Heroku Postgres plan grants sufficient privileges to create a replication user and a PUBLICATION. Without these permissions, logical replication is not possible.
