Imagine a world without Flyway, the open-source database migration tool: migration scripts run in the wrong order, changes overwritten, hours of work thrown away because you started from an out-of-date version. Flyway is so popular today because it solves the problem of database migrations in a simple and elegant way. No one is going back to how things used to be done.
Getting your database migration process under control is a big win, but schema migrations are only half the story. The data within the database can have huge implications for the behavior of your SQL scripts. If things go wrong, then in the best case, the migration fails to execute. In the worst case, you lose data, your service goes down, and you make your customers unhappy.
Just as schema migrations were clunky and slow in the days before Flyway, bottlenecks in bringing data into the dev/test cycle are holding your migrations back today. Solve this too and the reward is higher-quality code, faster and more reliable releases, less rework, fewer evening fixes, and extra time with friends and family.
Data bottlenecks are holding your migrations back
Bringing real data that mirrors production into your dev/test cycles is difficult. The size of the data alone presents significant challenges when provisioning copies to work and test with. Not only do you need to work with infrastructure teams to find the disk space to restore a backup of the database, you also need to find the time for the restore to complete.
And each time you want to refresh the data, you need to put in a request to the DBA to repeat the process. Having to wait hours or days for database copies is hardly conducive to agile development. It’s why most teams share environments, or test with empty databases or small amounts of data, which in turn impacts the database development process.
Impact 1: Sharing development and testing databases
Shared development environments risk team members overwriting each other’s changes or coding against something written by another developer that’s only partially finished. Worse, they limit creativity and prevent developers from experimenting with new ways to solve customer problems.
For a team working with Flyway, a shared database presents another potential problem: when developers are writing and applying migration scripts against it, it can be very disruptive to other developers who have to work with a database that’s changing under their feet. Good communication can’t always be relied on to prevent things like this happening, especially in today’s increasingly distributed and remote working environment.
Even when using dedicated development databases, other members of the team will be authoring migration scripts and sharing them via a version control system. Sharing migration scripts “blind” is risky, however: just because a script works on one developer’s machine doesn’t mean it will work in combination with other new migration scripts. So it’s important to test any new migration script along with all the others, and the only way to do this is to run them in sequence on a test database. Yet it’s hard enough for developers to get a single development database that’s uniquely theirs, let alone a second copy for testing.
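To make “run them in sequence on a test database” concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not how Flyway itself is implemented: it relies only on Flyway’s default naming convention for versioned migrations (V<version>__<description>.sql), uses an in-memory SQLite database as a stand-in for a real test database, and the script names and SQL are hypothetical.

```python
import re
import sqlite3

# Flyway's default naming for versioned migrations: V<version>__<description>.sql
MIGRATION_NAME = re.compile(r"^V(\d+(?:[._]\d+)*)__(.+)\.sql$")


def version_key(filename):
    """Turn 'V1_2__add_index.sql' into (1, 2) so versions sort numerically."""
    match = MIGRATION_NAME.match(filename)
    if match is None:
        raise ValueError(f"not a versioned migration: {filename}")
    return tuple(int(part) for part in re.split(r"[._]", match.group(1)))


def apply_in_order(connection, migrations):
    """Apply every script in version order and stop at the first failure.

    `migrations` maps filename -> SQL text. It is a hypothetical in-memory
    stand-in for a migrations folder checked out from version control.
    """
    applied = []
    for name in sorted(migrations, key=version_key):
        try:
            connection.executescript(migrations[name])
        except sqlite3.Error as error:
            raise RuntimeError(f"{name} failed after {applied}: {error}") from error
        applied.append(name)
    return applied


# Hypothetical scripts from different developers, listed out of order on purpose.
migrations = {
    "V10__add_email_index.sql": "CREATE INDEX idx_email ON customer(email);",
    "V1__create_customer.sql": "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);",
    "V2__add_email.sql": "ALTER TABLE customer ADD COLUMN email TEXT;",
}

connection = sqlite3.connect(":memory:")  # throwaway database, stand-in only
applied = apply_in_order(connection, migrations)
print(applied)
```

Note that the versions are compared numerically, so V10 runs after V2 rather than before it, as a plain alphabetical sort would have it. The sketch also runs against an empty database, which is exactly the limitation the next section describes: a script that passes here can still fail against production-sized data.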
Impact 2: CI testing with empty databases or small amounts of data
Running tests against databases in CI pipelines is an essential part of testing your application, and ideally databases are spun up and torn down instantly for each pipeline. But for any database approaching a gigabyte in size, it takes too long to restore a backup, so empty databases are used, or ones that contain a small amount of seed data. This is problematic because testing is then done on databases that don’t resemble production, which doesn’t provide the required confidence that a subsequent deployment won’t cause issues.
Sharing a test database containing the full data is an option but this effectively serializes the pipelines as they contend for the shared database. Scaling may be possible by adding multiple database servers, but ultimately the parallelism of your CI pipelines will be limited by the number of database servers you have available to them.
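A rough back-of-the-envelope sketch of that limit, in Python. It assumes every pipeline holds a database server for the same length of time and that pipelines queue until a server is free; the function name and the figures are hypothetical and only illustrate the arithmetic.

```python
import math


def wall_clock_minutes(pipelines, database_servers, minutes_per_pipeline):
    """Estimate total time when pipelines queue for a fixed pool of database servers."""
    if database_servers < 1:
        raise ValueError("at least one database server is required")
    # Pipelines run in "waves": each wave uses every available server at once.
    waves = math.ceil(pipelines / database_servers)
    return waves * minutes_per_pipeline


# Hypothetical workload: 12 pipelines, each needing a database for 10 minutes.
one_shared = wall_clock_minutes(12, 1, 10)   # fully serialized
three_shared = wall_clock_minutes(12, 3, 10)
one_each = wall_clock_minutes(12, 12, 10)    # a dedicated database per pipeline
print(one_shared, three_shared, one_each)
```

Under these assumptions, total time falls only as fast as servers are added, and it bottoms out at the duration of a single pipeline once every pipeline has a database of its own.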
Spawn is the key to freeing your migrations from data bottlenecks
Spawn is a new solution that does for data what Flyway did for schema migrations – simplify it and reduce errors, freeing up you and your team to concentrate on building great software.
Spawn enables you to spin up database copies in seconds without the hassle of managing any hardware or disks. This is made possible by a revolutionary ‘database-cloning-as-a-service’ approach. Unlike other database cloning solutions that require infrastructure setup and staging servers, Spawn is ready to go as soon as you are. Simply upload your backup into Spawn’s secure cloud and start creating copies for development, testing and CI.
Spawn enables you to work with as many copies of your databases as you need. Each copy is isolated from other developers’ copies and is an exact replica of its source, data included. Your team is able to make changes without the risk of disrupting each other, and to spin up as many test databases as required to validate migrations and assess the performance impact of their work.
Databases in Spawn can be stood up and torn down on every commit, giving you and your team the confidence that your migrations will successfully deploy to production. And in the case of an error, it takes seconds to provision an instance for troubleshooting, enabling you to get back on track fast. Add to that the ability to invoke the Spawn command line via GitHub Actions and you get smooth, frictionless database CI in your workflows.
Spawn is free to use
Spawn is available as a free service and supports MySQL, SQL Server, PostgreSQL, MongoDB and Redis. For Flyway Teams customers, there is a Teams version of Spawn designed for collaborative use cases. Both versions enable you to create, save and tear down your databases as often as you like, spin up multiple copies for different branches and automate database provisioning in your CI pipelines. All without any hardware or infrastructure requirements for data on your side.
If you’d like to catch data-related issues early and release changes faster with Spawn, find out more.