Today, we’re thrilled to announce that backups of Heroku Postgres are now 40x faster, thanks to Snapshots replacing base backups. We’ve been hard at work improving performance, speed, and capacity for the Heroku Data services you rely on. In the past, forks and follows of a Premium-8 test database with 992 GB of data took 22 hours; with Snapshots, the same process takes 10 minutes. This makes creating forks and followers, and restoring a database, faster than ever, at no additional charge.
The New Way: Snapshots for Heroku Data
In November 2020, we introduced a performance improvement to the physical backup and restore functionality for our Heroku Postgres customers. We now take snapshots in place of base backups. When we restore an instance, we start from the last snapshot taken. WAL replay from that point works as before (using WAL-E).
- Creating forks and followers, which sometimes took up to 24 hours, can now take under 30 minutes
- The gap in HA during failovers or follower promotion, which used to last up to 12 hours, now lasts under 15 minutes
Snapshots are faster than base backups, occur at the storage level, and are incremental so we can take them more frequently.
Overall, this means restoring a database is much faster. With Snapshots, the rate at which we capture is dynamic. For databases with an average or low rate of change, we try to capture a snapshot at least every 24 hours; databases that change more frequently are captured more often. Restoring from the snapshot closest to the transaction we want to restore to means less WAL replay and a lower mean time to restore.
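To illustrate why more frequent snapshots shorten restores, here is a minimal Python sketch. It is not Heroku's implementation: the function names, the snapshot schedules, and the timestamps are all hypothetical, and it assumes the restore starts from the latest snapshot taken at or before the restore target.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def pick_snapshot(snapshot_times, target):
    """Return the latest snapshot taken at or before `target`.

    `snapshot_times` must be sorted in ascending order.
    """
    idx = bisect_right(snapshot_times, target)
    if idx == 0:
        raise ValueError("no snapshot exists at or before the restore target")
    return snapshot_times[idx - 1]


def wal_replay_window(snapshot_times, target):
    """Span of WAL that must be replayed after restoring the chosen snapshot."""
    return target - pick_snapshot(snapshot_times, target)


target = datetime(2020, 11, 2, 23, 30)

# Hypothetical low-change database: one snapshot per day.
daily = [datetime(2020, 11, 1), datetime(2020, 11, 2)]

# Hypothetical high-change database: one snapshot every 4 hours.
frequent = [datetime(2020, 11, 1) + timedelta(hours=4 * i) for i in range(12)]

print(wal_replay_window(daily, target))     # 23:30:00
print(wal_replay_window(frequent, target))  # 3:30:00
```

In this toy example, the database with more frequent snapshots has 3.5 hours of WAL to replay instead of 23.5 hours for the same restore target.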
The Old Way: Backups Were Not Built for Speed
In the past, backups of Heroku Postgres relied on WAL-E for primary backup and restoration. WAL-E is a convenience wrapper for the two conceptual parts required for disaster recovery in the PostgreSQL world:
- Base backups are required when a database is first created. A base backup is a copy of the full existing state of the database at the time it was taken.
- WAL (write-ahead log) records describe changes to the database and can be archived elsewhere. These are smaller pieces of data that reflect changes to a database at a low level.
To restore a PostgreSQL service, we used to restore from a base backup first, then replay the previously archived WAL until we reached the closest possible restore point. Combining base backups with WAL records meant that new backups could take a long time to upload, and restores could take a long time as well, since they involved downloading the base backup from servers and replaying the WAL record changes between base backups. You can read more about this backup and restore methodology in our Dev Center article on Heroku Postgres Data Safety and Continuous Protection. But clearly, this process was not built for speed, so we made it better!
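The base-backup-plus-WAL model described above can be sketched with a toy example in Python. This is a deliberate simplification, not how WAL-E or PostgreSQL work internally: the database is modeled as a dictionary, each WAL record as a hypothetical `(lsn, key, value)` tuple, and `restore` is an illustrative helper name.

```python
def restore(base_backup, wal_records, target_lsn):
    """Rebuild state from a full copy plus ordered change records.

    base_backup: dict holding the full state at the time of the backup.
    wal_records: list of (lsn, key, value) tuples sorted by lsn;
                 a value of None stands for a deletion.
    target_lsn:  replay stops after the last record with lsn <= target_lsn.
    """
    state = dict(base_backup)  # start from a copy of the base backup
    for lsn, key, value in wal_records:
        if lsn > target_lsn:
            break  # reached the requested restore point
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


base = {"alice": 10, "bob": 20}
wal = [
    (1, "alice", 15),
    (2, "carol", 5),
    (3, "bob", None),
    (4, "alice", 0),
]

print(restore(base, wal, target_lsn=3))  # {'alice': 15, 'carol': 5}
```

The further the restore point is from the base backup, the more records the loop has to apply, which is why infrequent base backups lead to long restores.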
Feedback Welcome
Snapshots are one of many enhancements we’re making to improve your experience with Heroku Data Services. We’d love to hear how this change improves your workflow.