Sunday, December 14, 2008

Running builds for previous revisions

Can you run the build for a previous revision of your code? Can you do that fully automatically or do you need manual steps? Should you care?

It's good to have an automated build

As Martin Fowler says, "... you should be able to walk up to the project with a virgin machine, do a checkout, and be able to fully build the system."

There are lots of benefits to having a fully automated build. It means all developers are building in the same way, reducing "it works on my machine" problems. It helps with productivity - you don't make a mistake doing the build and then have to do it again to get it right. There are other benefits too, but there are lots of other articles about this topic so I won't repeat them here.

Turning the dials up

What Martin Fowler doesn't say is "... do a checkout for any revision ..." (rather than just the most recent), but I think it is what you should aim to do.

Why "any revision"?

One reason you might need to run the build for any revision is to work out which commit broke the build when there have been multiple commits between builds. There are other reasons too - e.g. finding out which change introduced a bug that has been hiding undetected for a while (i.e. it didn't cause the build to break, but now that you've found it you want to know when and how it got introduced so that you can fix it).
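To make the "which commit broke the build" case concrete, here is a minimal sketch of a binary search over revisions. It assumes that the first revision in the list builds, the last one doesn't, and that once the build is broken it stays broken. The function `build_passes` is a hypothetical stand-in for "check out that revision and run the fully automated build"; the revision names are made up.

```python
from typing import Callable, Sequence


def first_broken_revision(
    revisions: Sequence[str],
    build_passes: Callable[[str], bool],
) -> str:
    """Return the earliest revision whose build fails.

    Assumes revisions[0] builds, revisions[-1] does not, and that the
    build stays broken after the breaking commit.
    """
    good, bad = 0, len(revisions) - 1  # indexes of a known-good and a known-bad revision
    while bad - good > 1:
        mid = (good + bad) // 2
        if build_passes(revisions[mid]):
            good = mid
        else:
            bad = mid
    return revisions[bad]


# Illustrative stand-in for a real build: everything from r104 onwards fails.
revisions = ["r101", "r102", "r103", "r104", "r105", "r106"]


def fake_build_passes(revision: str) -> bool:
    return revisions.index(revision) < revisions.index("r104")


culprit = first_broken_revision(revisions, fake_build_passes)
print(culprit)
```

This only works if the build for each of those revisions can be run without manual steps, which is the point of the rest of this article.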

(As an aside - some CI servers can help reduce the occurrence of multiple commits between builds. But even if your CI server builds multiple revisions in parallel on multiple build agents, you will still end up with multiple commits between builds if the build farm isn't large enough for the number of commits and the speed of the build. And if you take the approach of having the CI server run the build before committing, you end up with a bottleneck delaying commits if you don't have enough agents.)

Can you run the build for a previous revision of your code?

If your system includes a database - how do you manage the database schema? In some teams, this is one of the dirty secrets - it's done by running some scripts manually as necessary - hence you can't run the build for a previous revision (or in some cases even the most recent revision) just by checking out the code (for that revision) and running the build.

Many teams use something like dbdeploy, so changes to the database are made as delta scripts that change the database in sync with code changes. Although this allows database migration forwards in time - how many teams implement the rollback scripts that would allow the database to be reverted so that it is in sync with previous revisions of the code? Often you just have to rebuild the database from scratch in that case - and the build script might not do that automatically.
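As a rough illustration of deltas that can go both forwards and backwards, here is a minimal sketch using an in-memory SQLite database. It is not dbdeploy's actual implementation or API; the `DELTAS` list, the `changelog` table layout and the function names are all assumptions made for the example.

```python
import sqlite3

# Each delta: (change number, SQL to apply it, SQL to undo it). Hypothetical schema.
DELTAS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE customer"),
    (2, "CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER)",
        "DROP TABLE invoice"),
]


def current_version(conn: sqlite3.Connection) -> int:
    """Highest change number recorded as applied (0 if none)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS changelog (change_number INTEGER PRIMARY KEY)"
    )
    row = conn.execute("SELECT MAX(change_number) FROM changelog").fetchone()
    return row[0] or 0


def migrate_to(conn: sqlite3.Connection, target: int) -> None:
    """Apply or undo deltas until the schema matches change number `target`."""
    version = current_version(conn)
    if target > version:
        for number, apply_sql, _ in DELTAS:
            if version < number <= target:
                conn.execute(apply_sql)
                conn.execute("INSERT INTO changelog VALUES (?)", (number,))
    else:
        # Undo in reverse order so later changes are removed first.
        for number, _, undo_sql in reversed(DELTAS):
            if target < number <= version:
                conn.execute(undo_sql)
                conn.execute(
                    "DELETE FROM changelog WHERE change_number = ?", (number,)
                )
    conn.commit()


def table_names(conn: sqlite3.Connection) -> list:
    """Names of the application tables, excluding the changelog itself."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name != 'changelog' ORDER BY name"
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
migrate_to(conn, 2)   # forwards, as for the latest revision of the code
migrate_to(conn, 1)   # backwards, as for an earlier revision of the code
print(table_names(conn))
```

If the build calls something like `migrate_to` with the change number that the checked-out revision expects, then checking out an older revision and running the build is enough to get a matching schema. Undo scripts that lose data are a separate problem that this sketch ignores.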

What if you use Maven and have snapshot dependencies? A snapshot version can resolve to a different artifact each time it is built, so how do you build the code the same as it was built an hour ago?
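One modest step is to have the build at least report which dependencies are snapshots. The following is a minimal sketch that scans a POM for dependency versions ending in `-SNAPSHOT`. The example POM and its artifact names are made up, and the check is deliberately simple: it doesn't resolve versions that come from properties or from a parent POM.

```python
import xml.etree.ElementTree as ET
from typing import List

# Namespace used by Maven 4.0.0 POM files.
POM_NS = {"m": "http://maven.apache.org/POM/4.0.0"}


def snapshot_dependencies(pom_xml: str) -> List[str]:
    """Return 'artifactId:version' for each dependency with a -SNAPSHOT version."""
    root = ET.fromstring(pom_xml)
    found = []
    for dep in root.findall(".//m:dependency", POM_NS):
        version = dep.findtext("m:version", default="", namespaces=POM_NS)
        if version.endswith("-SNAPSHOT"):
            artifact = dep.findtext("m:artifactId", default="?", namespaces=POM_NS)
            found.append(f"{artifact}:{version}")
    return found


# Hypothetical POM, for illustration only.
EXAMPLE_POM = """
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>shop</artifactId>
  <version>1.0</version>
  <dependencies>
    <dependency>
      <groupId>com.example</groupId>
      <artifactId>billing-client</artifactId>
      <version>1.3-SNAPSHOT</version>
    </dependency>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.5</version>
    </dependency>
  </dependencies>
</project>
"""

print(snapshot_dependencies(EXAMPLE_POM))
```

Knowing that a snapshot is involved doesn't make the old build reproducible, but it does tell you which revisions you can't expect to rebuild exactly.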

Why fully automate?

Any manual steps in something that you don't do frequently make it likely that you'll find it difficult to do when you need to. Furthermore, manual steps to build previous revisions prevent you from taking advantage of all of the features of some CI servers. TeamCity, Pulse and build-o-matic all allow manually triggered building of a previous revision of the code. build-o-matic will also automatically build previous revisions of your code in some circumstances, so if your build can't be run automatically for a previous revision then that won't always work as expected.

One of the logical conclusions - your development environment should be checked in

Being able to run the build for any revision of your code might require you to have more things in your source control system than you are used to. This can be controversial (even though it's mentioned in Martin Fowler's well-read CI article). I like to have absolutely everything that the build depends on in the source control system - even when it isn't source.

For example, on a Java project, I like to have Java itself, Ant etc. all checked in. If you change which version of Java you use (which you are more likely to do than you might think) then you want to ensure that everyone is using the same version and that you can build the project as it was in the past (if Java isn't checked in, how are you going to be sure what version it is or was?). Martin Fowler mentions "Java development environment" as something that you might not check in, and I disagree with him on this specific case. If you take the ability to check out and build the code at any point in time to its logical conclusion then you have to include everything you need to build it - machine set up and all. One of the advantages of taking things to this level is that it is trivial to set up a new development machine - you just check out everything and you are ready. This isn't always possible - particularly on Windows - but it is a worthy aim and more achievable than you might think.
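As a small illustration of building with checked-in tools rather than whatever is installed on the machine, here is a minimal sketch of a helper that points the build's environment at tools inside the checkout. The `tools/jdk` and `tools/ant` directory layout is an assumption for the example, not a convention of Ant or Java; `JAVA_HOME` and `ANT_HOME` are the usual environment variables.

```python
import os
from pathlib import Path
from typing import Dict


def build_environment(checkout_root: Path, base_env: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of base_env that prefers the tools checked in under checkout_root."""
    java_home = checkout_root / "tools" / "jdk"   # hypothetical location of the checked-in JDK
    ant_home = checkout_root / "tools" / "ant"    # hypothetical location of the checked-in Ant
    env = dict(base_env)
    env["JAVA_HOME"] = str(java_home)
    env["ANT_HOME"] = str(ant_home)
    # Checked-in tools go first on PATH so they win over anything installed on the machine.
    env["PATH"] = os.pathsep.join(
        [str(java_home / "bin"), str(ant_home / "bin"), base_env.get("PATH", "")]
    )
    return env


env = build_environment(Path("checkout"), {"PATH": "/usr/bin"})
print(env["JAVA_HOME"])
```

A build launcher could pass this environment to the build process (for example via the `env` argument of `subprocess.run`), so that an old revision is built with the Java and Ant versions that were checked in alongside it.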

Copyright © 2008 Ivan Moore
