2020-05-29 by Julia Hayward
When you’re on your own developing a simple application, managing your migration scripts is not a demanding task; each
time you add a script, it gets the next version number in whatever sequence you’ve decided on using. When your
application takes off and you find you have a team working on migrations simultaneously, sharing code via a
distributed source control system such as
git, then there is an increased risk of running into problems as the
developers work on the same database migration code at the same time.
Let’s say you’ve been asked to add a new feature to your application, and this needs a new column to be added to the
database. The database schema is currently at v4; so the natural thing to do is create a script
and check it into source control. So far, all is good.
Meanwhile, your colleagues are also working on a new feature and have added a script
V5__MyNewTable.sql. As far
as they’re concerned, all appears good too. When the two sets of changes make it into the trunk branch ready
for testing and deployment, source control will see them as two unrelated files and therefore will not detect any
problem. However, Flyway will see two files with the same schema version number and will fail validation.
So you decide to sidestep the above problem by agreeing in advance that you will write script
your colleagues will write
V6__MyNewTable.sql. You add your new column to the database in your script, and once again
all is good. However, a colleague working on his local branch can’t see your change and as a result adds some SQL into a script
that depends on the old definition of the table you’ve altered - for example an
INSERT INTO ... which fails to specify
a value for your new column. Once again, all appears good too. But when the scripts are merged
and run, the second script will fail.
So you go one step further and warn your colleagues about what you plan to do. They write their scripts defensively, and get their changes through code review more quickly than yours. By the time you come to merge, the other changes are in production. Again, everything seems to be good locally but when your changes are merged, Flyway sees a new script that is numbered lower than the current schema version and errors.
git allows you to customise its lifecycle by adding callbacks known as hooks - scripts that are executed at particular
points in its processing. These scripts can interrupt
git if they detect conditions that are problematic but which
git is not aware of by itself. This means that we can get Flyway to pick up on potential errors in our scripts
either when we commit them locally, or when we push our scripts to the remote repository.
Writing a git hook is very simple - we just add a shell script to the
.git\hooks folder inside our source repository
with the appropriate name, and make sure it is executable by the git process. As an example, here’s a
hook that will verify that I don’t have any duplicate migration version numbers:
#!/bin/sh if flyway validate -url=jdbc:h2:mem:dummydb \ -locations=filesystem:/mnt/c/src/sandbox/gitHookDemo/ \ -ignorePendingMigrations=true; then echo Flyway validation successful exit 0 else echo Flyway validation failed exit 1 fi
For this task we’re not interested in a real database connection, so we use a new H2 in-memory database on each run.
We’re also not interested in checking whether all migrations have been run - as we know they haven’t - so we
ignorePendingMigrations option. Now, when I have finished working on a script and come to commit it,
git commit will invoke
flyway validate, and if there’s a problem with my script naming then
the commit will fail. The error won’t propagate to the remote repository at all, preventing problems for other users
down the line.
Validation is usually a quick process; if we’re prepared to spend a bit more time up front in order to catch more problems then we can write a hook that runs a full clean/migrate cycle on an expendable test database:
#!/bin/sh flyway clean -url=jdbc:<local-test-db-url> \ -locations=filesystem:/mnt/c/src/sandbox/gitHookDemo/ ; if flyway migrate -url=jdbc:<local-test-db-url> \ -locations=filesystem:/mnt/c/src/sandbox/gitHookDemo/ ; then echo Flyway migration successful exit 0 else echo Flyway migration failed exit 1 fi
These checks are local ones, and we can invoke them at times when errors are likely to occur;
to committing a new script;
post-merge just after pulling new files from the remote repository or merging a
branch which might have conflicting scripts in; and
pre-push prior to submitting back to the remote repo. We
can also set up our remote repository as a gatekeeper with a
pre-receive hook that rejects attempts to push
a script that wouldn’t update a copy of the production database correctly. By using these judiciously, we can
trap most common problems early, and fix them while it’s still easy to do so.