Sunday, December 14, 2008

Running builds for previous revisions

Can you run the build for a previous revision of your code? Can you do that fully automatically or do you need manual steps? Should you care?

It's good to have an automated build

As Martin Fowler says "... you should be able to walk up to the project with a virgin machine, do a checkout, and be able to fully build the system."

There are lots of benefits of having a fully automated build. It means all developers are building in the same way, reducing "it works on my machine" problems. It helps with productivity - you don't make mistakes doing the build and then have to do it again to get it right. There are other benefits too - but there are lots of other articles about this topic so I don't want to repeat them here.

Turning the dials up

What Martin Fowler doesn't say is "... do a checkout for any revision ..." (rather than just the most recent), but I think it is what you should aim to do.

Why "any revision"?

One reason you might need to run the build for any revision is to work out which commit broke the build, in the case where you have multiple commits between builds and the build is broken. There are other reasons too - e.g. finding out which change introduced a bug that has been hiding undetected for a while (i.e. it didn't cause the build to break, but now you've found it and want to find out when and how it got introduced so that you can fix it).

(As an aside - some CI servers can help reduce the occurrence of multiple commits between builds, but even if you are running a CI server which builds multiple revisions in parallel on multiple build agents, if the build farm isn't large enough for the number of commits and the speed of the build, then you will end up with multiple commits between builds. Or if you take the approach of having the CI server run the build before committing then you end up with a bottleneck delaying commits if you don't have enough agents.)

Can you run the build for a previous revision of your code?

If your system includes a database - how do you manage the database schema? In some teams, this is one of the dirty secrets - it's done by running some scripts manually as necessary - hence you can't run the build for a previous revision (or in some cases even the most recent revision) just by checking out the code (for that revision) and running the build.

Many teams use something like dbdeploy so changes to the database are made as delta scripts which make the necessary changes to the database in synch with code changes. Although this allows database migration forwards in time - how many teams implement the rollback scripts to allow the database to be reverted so that it is in synch with previous revisions of the code? Often you have to just rebuild the database from scratch in that case - and the build script might not do that automatically.
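To make the forwards/backwards point concrete, here is a minimal sketch of working out which scripts would be needed to move a database between two schema versions. It assumes numbered deltas where delta N takes the schema from version N-1 to version N, and that every delta has a matching undo script. The class name and the script file names are hypothetical - this is not dbdeploy's API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: works out which delta scripts to run to move a
// database from one schema version to another. Not part of dbdeploy.
final class SchemaMigrationPlan {

    // Returns the script names to run, in order, to get from 'current'
    // to 'target'. Assumes delta N upgrades version N-1 to version N and
    // that every delta has a matching undo script.
    static List<String> scriptsFor(int current, int target) {
        if (current < 0 || target < 0) {
            throw new IllegalArgumentException("versions must be non-negative");
        }
        List<String> scripts = new ArrayList<>();
        if (target >= current) {
            // Forwards in time: apply deltas current+1 .. target.
            for (int delta = current + 1; delta <= target; delta++) {
                scripts.add(String.format("%03d_apply.sql", delta));
            }
        } else {
            // Backwards in time: undo deltas current .. target+1, newest first.
            for (int delta = current; delta > target; delta--) {
                scripts.add(String.format("%03d_undo.sql", delta));
            }
        }
        return scripts;
    }
}
```

If the undo scripts don't exist, the "backwards" branch has nothing to run, which is exactly the case where you end up rebuilding the database from scratch.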

What if you use maven and have snapshot dependencies? How do you build the code the same as it was built an hour ago?

Why fully automate?

Any manual step in something that you don't do frequently makes it likely that you'll find it difficult to do when you need to. Furthermore, manual steps to build previous revisions prevent you from taking advantage of all of the features of some CI servers. TeamCity, Pulse and build-o-matic all allow manually triggered building of a previous revision of the code. build-o-matic will also automatically build previous revisions of your code in some circumstances, so if your build doesn't support running automatically for a previous revision then that feature won't always work as expected.

One of the logical conclusions - your development environment should be checked in

Being able to run the build for any revision of your code might require you to have more things in your source control system than you are used to. This can be controversial (even though it's mentioned in Martin Fowler's well read CI article). I like to have absolutely everything that the build depends on in the source control system - even when it isn't source.

For example, on a Java project, I like to have Java itself, Ant etc. all checked in. If you change which version of Java you use (which you are more likely to do than you might think) then you want to ensure that everyone is using the same version and that you can build the project as it was in the past (if Java isn't checked in then how are you going to be sure what version it is or was?). This can be controversial (Martin Fowler mentions "Java development environment" as something that you might not check in but I disagree with him on this specific case) - but if you take the ability to check out and build the code at any point in time to its logical conclusion then you have to include everything you need to build it - machine set up and all. One of the advantages of taking things to this level is that it is trivial to set up a new development machine - you just check out everything and you are ready. This isn't always possible - particularly on Windows - but is a worthy aim and more achievable than you might think.

Copyright © 2008 Ivan Moore

Saturday, December 13, 2008

Software Craftsmanship conference 2009

The Software Craftsmanship conference 2009 is open for session proposals. Please submit a session proposal. I think the conference is now fully booked for attendees but you can still get in to the conference if you get your proposal accepted.

It's a great idea for a conference. I think a lot of progress has been made in our industry to improve processes. As teams become better in their processes, programming practices and code quality become relatively more important.

The timing of the conference is great - with the release of more books now that concentrate on code quality (like "Clean Code").

Thursday, December 11, 2008

Refactoring and TDD Training Course

I'm offering a completely new style of training course on refactoring, TDD and just programming better. It is pair programming with me (or others depending on demand).

It'll be similar cost per person per day to other courses, but instead of a classroom, it's one-to-one instruction at your site. It's also working on your code base, improving it during the course (possibly adding features too). Attendees will retain more learning that is directly applicable compared to more regular classroom courses. They will learn things from this course in addition to refactoring and TDD, like better use of the IDE, frequent commits, retro-fitting unit tests to their actual code base rather than just in the abstract.

See my "programming in the small" and "programming in the large" articles to get an idea of the sort of programming things that the course will include. (My PhD was in refactoring (completed 1996), I started using JUnit in 1998 and mock objects in 1999). Email me to book a course (and discuss course length - anything from one day upwards) - ivan at teamoptimization.com

Wednesday, December 10, 2008

People over Process

The Agile Manifesto values "Individuals and interactions over processes and tools".

I have heard a lot of people talk about "People over Process" somewhat differently than my interpretation.

In this article, I'll refer to a fictional developer called Chuck - nothing is implied by the name.

Chuck doesn't want to write tests

A frequent interpretation seems to be, if Chuck doesn't want to write tests (for example) then that's OK because people are more important than processes or tools. A lot of the time, people talk as if the question of whether Chuck is worth keeping in the team or not isn't a consideration - you just have to do the best you can with the team you've got; after all, people are more important than processes so that's OK (a non sequitur, but I've heard it said).

This can be a difficult one to argue against. Whether someone is any good at software development or not, they are "more important" than process, tools, software - but that is a different "important".

The worst sort of "Agile"

The worst examples of "Agile" software development are where people take the approach that as long as you do the currently accepted "Agile things" (or at least some easy subset like iterations/sprints, stand up meetings etc etc) and keep your current team (because "people are more important than processes or tools") then you are "Agile". This is the opposite of my interpretation of the agile manifesto - but quite common!

My interpretation of "People over Process"


The single most important thing is to get the right people in the team. Just making your existing team follow a set of processes (like iterations/sprints, stand up meetings etc etc) is much less important than having the right people in the team. I think that is what the Agile Manifesto is trying to say.

As Simon Baker says "Put the right people in the right environment and trust them to get things done."

The consequence of this is that you should hire the best developers you can (this article describes the best way to interview them). The less palatable consequence is that you should remove developers from a team if they are not good for the team. That doesn't necessarily mean removing them from the company - it may be that there is another role or team in which they would contribute more. You should also consider whether a developer can improve - my experience has been that pair programming really helps to develop developers.

So should Chuck be made to write tests (or whatever)?

I'm not sure this is the right question. Really the question should be "is Chuck any good for the team?". If Chuck doesn't want to write tests then I think it's unlikely (but not impossible) that Chuck is (what I consider to be) a good developer. It would probably be counter productive to "make" Chuck write tests, but I'd be happy to pair program with Chuck to show him how it works. If Chuck still doesn't want to write tests then I might prefer not to be on the same team as Chuck (one way or another).

Sunday, November 23, 2008

The problem with conventional continuous integration servers

I recently presented a talk on continuous integration at Agile North.

There was one topic in the talk that I think is so important that I decided to write an article about it. Surprisingly, it's a topic that many users of continuous integration (CI) servers (and even the developers of them in some cases!) don't seem to have thought much about.

Continuous integration

This article assumes you already know what continuous integration is.

The problem with conventional continuous integration servers

Imagine there are three developers, Tom, Dick and Harriet, and a continuous integration server. There is some code in a source code repository and these three developers start with the code cleanly checked out.

  1. Tom makes some changes and commits his changes.
  2. The CI server starts running a build.
  3. Harriet makes some changes and commits her changes.
  4. Dick makes some changes and commits his changes.
  5. The CI server reports that the build is OK.
  6. The CI server starts running a build (because there are new changes for it to run the build on).
  7. Tom makes some changes and commits his changes.
  8. The CI server reports that the build is broken.

Here's a diagram representing that:

[Diagram: timeline of the commits and builds listed above]

The question is - who broke the build? (Left as an exercise for the reader). The problem with conventional continuous integration servers is that they can't tell you. A situation like this is inevitable with the vast majority of continuous integration server installations (that actually exist - I know you could install your favourite CI server differently if you had enough money - read on ...).

BTW - Step 7 is a bit of a red herring. It is not necessary for the purpose of the main point of this article, but a common enough situation. Tom thinks he's committed on a green build, but really the build is already broken in terms of the committed code, just not in terms of what the CI server is saying.

Why it is a problem

The problem with not knowing which commit broke the build is that it takes longer to work out who should look at the problem and how they can fix it. If you know which commit broke the build you know who should look at it and they can review their changes to work out why it broke. Furthermore, if you know which commit broke the build (even if you can't work out why), you can revert that change set while the problem is fixed "off line" from the rest of the team.

I am convinced (from years of using CI servers on many teams) that not knowing which commit broke the build is a major contributor to sloppy CI practice - builds staying red for ages, nobody taking responsibility, reduction in commit frequency etc etc. Just knowing which test failed, or "why" the build broke isn't enough. The symptoms don't always tell you the cause. Knowing the cause - i.e. which commit - is what you need to know.

When do you get this problem?

You suffer this problem more as the team gets larger, commits become more frequent, and the length of the build goes up. (Oh, and if developers get sloppier).

You often can't do anything about the size of the team; you'd like to encourage people to commit more frequently; and making the build faster can be really hard. (And you might be able to get rid of sloppy developers, but that's quite a different blog post). Ideally the build should be fast, but that is often easier said than done, and even if the build is fast, it doesn't completely eliminate the problem described in this article.

Solutions

Most continuous integration servers don't solve this problem, but some do. These are the solutions that I know about. The CI server can:

a) run the build for the commits (revisions) between the last known good and the first known bad (provided that there is enough capacity in the build farm, e.g. people stop committing while the build is broken).
b) check that the build passes (on a build agent) before committing changes.
c) run multiple (preferably all) commits in parallel on different build agents in a build farm.

build-o-matic does (a) automatically - using a binary search. TeamCity (version 4 EAP) and Pulse (and maybe others) allow running of a previous revision so you can do the equivalent manually (maybe someone will write a plug-in for TeamCity to do what build-o-matic does? There is a precedent).
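For illustration, here is a minimal sketch of the binary search idea behind (a). It is not build-o-matic's actual code: the class name, the use of plain integer revision numbers and the `buildPasses` callback (standing in for "check out this revision and run the build") are all assumptions, as is the simplification that once the build breaks it stays broken up to the first known bad revision.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch of finding the commit that broke the build by
// binary search; this is not build-o-matic's actual code.
final class BuildBisect {

    // 'lastGood' is a revision whose build passed, 'firstBad' a later
    // revision whose build failed. 'buildPasses' checks out and builds a
    // given revision. Assumes that once the build breaks it stays broken
    // up to 'firstBad'. Returns the earliest revision that fails.
    static int findBreakingRevision(int lastGood, int firstBad, IntPredicate buildPasses) {
        if (lastGood >= firstBad) {
            throw new IllegalArgumentException("lastGood must be before firstBad");
        }
        int good = lastGood;
        int bad = firstBad;
        while (bad - good > 1) {
            int middle = good + (bad - good) / 2;
            if (buildPasses.test(middle)) {
                good = middle;   // the breakage is later than 'middle'
            } else {
                bad = middle;    // 'middle' is already broken
            }
        }
        return bad;
    }
}
```

With seven untested revisions between the last known good and the first known bad, this needs three extra builds rather than seven - but every one of those builds depends on the build running automatically for a previous revision.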

TeamCity and Pulse do (b) (build-o-matic doesn't - it's a cool feature; I've used it in TeamCity and it works well but you do need a lot of build agents).

build-o-matic does (c). I think TeamCity and possibly some others will too if you have enough build agents. The problem with this approach is that you really might need a lot of build agents, particularly if the team is large, commits frequently and has a long build.

My preferred solution is to buy enough build agents to do (c) - computers are very cheap. Note however that just because a CI server supports a build farm, even if the build farm is infinitely large it doesn't necessarily mean it'll run the build for all the commits - check your CI server documentation for details. I believe Bamboo has something up its sleeve on this topic - but I'm not sure if it's public yet (I'll find out and add a comment as appropriate).

There is another solution which is not to use a CI server at all, but instead have a "build token" or an "integration machine" - i.e. serialize all commits, that way you never commit while the build is running. That only works well for small co-located teams with a fast build. But when conditions are suitable, it really works well!
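In real life the build token is usually a physical object on someone's desk, but the rule it enforces can be sketched in a few lines. The class below is purely hypothetical (the name and methods are mine, not from any tool): whoever holds the token is the only person allowed to integrate, so commits are serialized.

```java
import java.util.Objects;

// Hypothetical model of a "build token": whoever holds it is the only
// person allowed to integrate (build and commit) at that moment.
final class BuildToken {
    private String holder; // null when nobody is integrating

    // Returns true if 'developer' now holds the token, false if someone
    // else already has it.
    synchronized boolean tryTake(String developer) {
        Objects.requireNonNull(developer, "developer");
        if (holder != null) {
            return false;
        }
        holder = developer;
        return true;
    }

    // Hands the token back; only the current holder can do so.
    synchronized void release(String developer) {
        if (!Objects.requireNonNull(developer, "developer").equals(holder)) {
            throw new IllegalStateException(developer + " does not hold the token");
        }
        holder = null;
    }
}
```

Because only one person can hold the token, a broken build can only have been caused by the last person who had it.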

Conclusion

There are lots of CI servers to choose from. I consider working out which commit broke the build to be a basic minimum feature of a CI server installation, but surprisingly not all CI servers are capable of telling you. You really need to understand this before you choose a CI server, or you may be left with a broken build and not knowing which commit caused it.

Sunday, November 9, 2008

In my previous article I mentioned the book "Clean Code" - this article is a brief critique of it (in some cases you'll just have to read the book to see what I'm talking about because I'm not going to rewrite the book here!).

What I liked

The first chapter is excellent - particularly pages 4-6 (buy the book to find out what they contain). I got the same feeling of wanting to get everyone to read these pages as I had from the section about comments in Kent Beck's Smalltalk Best Practice Patterns. I wanted to shout (while shaking people by their lapels) "read this - it's what I've been trying to tell you all this time".

What I didn't like - examples (particularly Chapter 3)

There are some examples that aren't great. The one that stuck out particularly badly was in chapter 3 - HtmlUtil. The problem with HtmlUtil is that it is classic procedural style - which is OK as an example of bad style - but the refactored version does not address that aspect of its badness.

This example is a method on HtmlUtil with signature:

public static String testableHtml(PageData pageData, boolean includeSuiteSetup) throws Exception

with lots of methods called on pageData. It gets nicely refactored into a short method of a different name but with the same signature.

My main problem with this example is that it looks (to someone who doesn't know the codebase that it is taken from) as if this should be a method on PageData. Maybe there is a good reason it is a static method on HtmlUtil but the author doesn't explain whether there is or not. To me it seems like an obvious question and by not mentioning anything about Utils with static methods that do things to objects that should be quite capable of doing stuff for themselves, it shows an example of arguably bad code, early on, with no explanation or apology for using such an example. I assume that they wanted to focus only on procedural refactoring for the example and deliberately wanted to avoid anything object-oriented - but if so, they should have said that very clearly.

The second red flag that this example waved at me was declaring that the method throws Exception. It isn't clear from reading the code that it needs to. Maybe it does, but again I think if it does then the author should have explained why and apologised for that aspect of the example.

What I didn't like - Structured Programming (Chapter 3)

In the (very short) section on Structured Programming there is a claim about the rules of structured programming: "It is only in larger functions that such rules provide significant benefit." I almost screamed at my fellow passengers on the train into work. My blood pressure is rising as I write this. (Imagine me screaming at the top of my voice - "NO NO NO").

I don't believe there is ANY benefit, let alone significant benefit, of applying the rules of structured programming to Java code. If there is, then the author should justify their comments rather than invoke the "proof by repeated assertion" that the cargo cult followers will no doubt spout in comments on this article.

What I liked - formatting (Chapter 5)

Lots of good stuff - I didn't agree with it all but some of it has definitely made me rethink (or in some cases just think) about what I'm doing with formatting more than before.

What I didn't like - Chapter 11 (Systems)


Any chapter which starts with an analogy for building a software system as building a city, has already got off to a bad start as far as I'm concerned. Whenever I hear of such analogies I wonder what Swiss Toni's analogy would be. "Building a system is very much like making love to a beautiful woman ..."

Apart from the analogy - the author then goes on to talk about Spring somewhat implying that it's a Good Thing. Just don't get me started. That's a whole other blog post.

Nevertheless, despite these things, there is also some good stuff in this chapter.

Overall

Great. Buy it. It's full of good stuff and only very few things that make my blood boil.

Saturday, November 8, 2008

Programming in the small - Exceptions

It's a long time since I wrote my previous "programming in the small" article. This article is about Exceptions in Java. I've kept it very short just to cover the absolute minimum.

Handle or throw

What is likely to be wrong with this code?
    public void makeTea() {
        teaPot.prepare();
        try {
            kettle.boil();
        } catch (ElectricityCutOffException e) {
            LOGGER.log("Didn't pay bill; no tea today.");
        }
        kettle.pourContentsInto(teaPot);
    }

Logging that some exception has been thrown is not handling it. In this case, "pourContentsInto" will still be called even if "boil" threw an exception. The exception indicates a problem, and to keep executing will probably mean that the system is in an unknown, inconsistent or bad state.

In this case, I'll end up with cold water in the teaPot, ruining the tea in an unrecoverable way.

In many cases, catching an exception and not handling it causes bugs which are tediously difficult to track down because the system ends up in an inconsistent state and the exception that eventually causes the system to fail, or the error in its functionality, ends up being somewhere that looks completely unrelated to the code that caused the problem by hiding the exception.
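To see the difference in a form you can actually run, here is a small sketch. The Kettle, TeaPot and exception classes are hypothetical stand-ins that I've made up for this purpose (the article's snippets don't define them), with just enough behaviour to show what ends up in the teapot.

```java
// Hypothetical stand-ins for the kettle and teapot used in this article,
// just enough to make the consequences of swallowing an exception visible.
class ElectricityCutOffException extends Exception {}

class Kettle {
    private final boolean electricityOn;
    private boolean boiled = false;

    Kettle(boolean electricityOn) { this.electricityOn = electricityOn; }

    void boil() throws ElectricityCutOffException {
        if (!electricityOn) throw new ElectricityCutOffException();
        boiled = true;
    }

    void pourContentsInto(TeaPot teaPot) {
        teaPot.fillWith(boiled ? "boiling water" : "cold water");
    }
}

class TeaPot {
    private String contents = "nothing";
    void fillWith(String water) { contents = water; }
    String contents() { return contents; }
}

class TeaMaker {
    // Logs and carries on: the pour still happens after a failed boil.
    static String swallowing(Kettle kettle, TeaPot teaPot) {
        try {
            kettle.boil();
        } catch (ElectricityCutOffException e) {
            System.err.println("Didn't pay bill; no tea today.");
        }
        kettle.pourContentsInto(teaPot);
        return teaPot.contents();
    }

    // Lets the exception percolate: nothing is poured after a failed boil.
    static String percolating(Kettle kettle, TeaPot teaPot) throws ElectricityCutOffException {
        kettle.boil();
        kettle.pourContentsInto(teaPot);
        return teaPot.contents();
    }
}
```

With the electricity off, the swallowing version leaves cold water in the teapot, whereas the percolating version throws before anything is poured, so the teapot is left untouched.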

If this method is a sensible place to fully handle the exception, then it should do that. For example:
    public void makeTea() {
        teaPot.prepare();
        try {
            kettle.boil();
            kettle.pourContentsInto(teaPot);
        } catch (ElectricityCutOffException e) {
            butler.sendToTeaShop();
        }
    }

If it can't handle the exception (I don't have a butler), then the best thing is to just let the exception percolate up to the calling code to either handle or throw, that is:
    public void makeTea() throws ElectricityCutOffException {
        teaPot.prepare();
        kettle.boil();
        kettle.pourContentsInto(teaPot);
    }

Now calling code has to either handle or throw the exception.

Do NOT declare an exception that a method does not throw


If you declare that a method throws a checked exception that it cannot actually throw, then you are condemning callers to having to handle or throw an exception that cannot happen - percolating unnecessary try/catch code around the system. Don't do it.

For a method which implements or overrides a method of an interface or superclass which declares that it throws an exception, do not make it also declare that it throws that exception unless it actually does.
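Java allows an overriding or implementing method to declare fewer checked exceptions than the method it overrides. As a small sketch (the Report interface and the classes around it are hypothetical names made up for this example):

```java
import java.io.IOException;

// Hypothetical interface whose method is allowed to throw a checked exception.
interface Report {
    String render() throws IOException;
}

// This implementation builds its text in memory and cannot fail with an
// IOException, so it leaves the throws clause off.
final class InMemoryReport implements Report {
    private final String text;

    InMemoryReport(String text) { this.text = text; }

    @Override
    public String render() {
        return text;
    }
}

final class ReportPrinter {
    // Callers that know they have an InMemoryReport need no try/catch.
    static String bracketed(InMemoryReport report) {
        return "[" + report.render() + "]";
    }
}
```

Code that only knows it has a Report still has to handle or throw IOException, but code that works with InMemoryReport directly is spared the pointless try/catch.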

Unchecked exceptions and more

I have recently read "Clean Code" - I really like the first chapter - worth buying just for that. There is a chapter on exceptions, in which Michael Feathers writes about the use of unchecked exceptions and other things. I'm not going to write more myself - buy "Clean Code" instead.

If you are too cheap to buy "Clean Code" or have read it and want a different opinion, then read my "programming in the small" articles which cover some of the same ground (in some cases better, in some cases worse, and not a complete intersection of topics - in particular there is more stuff covered in "Clean Code").

Sunday, November 2, 2008

Why are NOJOs so popular?

Following on from my previous articles on NOJOs and their frequent complements, Utils classes, I have talked to colleagues about why NOJOs are so popular in enterprise Java development.

Here I will try to write up some of the ideas we discussed (thanks to Mike Hill, Nat Pryce, Pippa Newbold, Rob Dupuis and Tung Mac).

Education by frameworks

A lot of the early examples in the Spring documentation are NOJOs. For example, on the page that introduces The IoC container there are several (here's one):

    package examples;

    public class ExampleBean {

        // No. of years to the calculate the Ultimate Answer
        private int years;

        // The Answer to Life, the Universe, and Everything
        private String ultimateAnswer;

        public ExampleBean(int years, String ultimateAnswer) {
            this.years = years;
            this.ultimateAnswer = ultimateAnswer;
        }
    }

Similarly, have a look at the JavaBeans on wikipedia - PersonBean.java is a NOJO. I think that there are a lot of developers who think that JavaBeans are NOJOs - as indicated by one of the comments on my NOJO article.

Fear of using the framework incorrectly (the framework won't like it)

It is possible that developers think that the early examples are the "correct" way to use such frameworks - and are worried that if they add methods to their NOJO then the framework will do something peculiar. This isn't entirely unreasonable as sometimes such frameworks do unexpected things due to the amount of behind-the-scenes-magic going on.

Fear of using the framework incorrectly (my colleagues won't like it)

Another possible fear is that the early examples are the "correct" way to use such frameworks - and doing anything else is not how the framework is intended to be used (even though the framework seems to still work).

Separation of concerns

Another fear is that of putting the behaviour in the wrong place - in particular in enterprise Java projects, I think there is a perception that there is no worse "crime" than putting behaviour in the wrong "layer" (or the wrong sort of class). Therefore, rather than risk putting the behaviour on the right object in the wrong layer, enterprise Java developers will choose to put the behaviour on the wrong class in the right layer. Related to this is a desire to stick to some prescribed pattern - e.g. DTOs. I have been in a situation when pair programming with someone where I was told, "you can't add a method to that" followed by some lame reasoning justified by some pattern that they wanted to stick to religiously.

Afraid of "new"

Yet another fear is that you might have to create a new object (shock, horror!). To avoid that, maybe some developers prefer to write static methods on some Utils class so they don't need to create any objects?

Object-oriented programming education and thinking

Perhaps the popularity of NOJOs is just the manifestation of how developers (don't) learn object-oriented programming? Perhaps object-oriented programming simply doesn't suit how all developers think? Certainly, I've come across very good developers who prefer functional programming and don't really "get" object-oriented programming, so it's not meant to be a criticism (or at least, not in all cases). Many developers have learnt their programming from non object-oriented programming languages, like C. Perhaps it's not surprising that to a C programmer an object looks like a struct?

Intra-team APIs

Another source of NOJO programming is that the APIs that people design for communicating between a team's subsystems often involve setting up, sending and receiving NOJOs.

Automated Testing

Another possibility is that developers who are not used to TDD find writing tests that use NOJOs easier than, for example, using Mock Objects.

UML

Using UML to "design" a system up front encourages thinking about the fields of objects rather than their behaviour - because that's what is easiest in notations like UML.

IDEs

Java IDEs can do lots of good things. One of the arguably less good things they do is generate getters and setters if you want. It is possible that the easy generation of getters and setters encourages their use, leading to NOJOs rather than objects that do things with their own fields. Using setter injection (probably the most common way people use Spring) also tempts developers into generating the getters too - after all, it's probably an extra click to not generate getters and what harm can some getters do? (Rhetorical).

Should I care that NOJOs are popular? Should I do anything about it?


Left as an exercise for the reader.

Friday, October 24, 2008

Utils classes and NOJOs

A style of class that is a frequent complement to NOJOs (introduced in my previous article) is the class with only static methods. Such classes might be called SomethingUtil or SomethingHelper.

People will often know what you mean if you say "a Utils class", so I won't propose anything different here. I had wanted to propose a catchier name - SOJOs (Static Only Java Objects) but the acronym (with a completely different meaning) is already taken by the SOJO project. In a comment on my NOJO article, John Q Public said he uses the term "function buckets" and I'm sure he's talking about the same thing.

Why do people write Utils classes?

If you have NOJOs then your methods have got to go somewhere. People who prefer the NOJO style often like to create static methods on Utils classes for those methods.
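As a sketch of what that pairing looks like, here is the same calculation written both ways. All of the class names are hypothetical, made up for this example; it is only meant to show where the method ends up in each style.

```java
// Hypothetical example: the same calculation written twice.

// NOJO style: the data lives in one class ...
class LineItemData {
    private int quantity;
    private int unitPriceInPence;

    int getQuantity() { return quantity; }
    void setQuantity(int quantity) { this.quantity = quantity; }
    int getUnitPriceInPence() { return unitPriceInPence; }
    void setUnitPriceInPence(int unitPriceInPence) { this.unitPriceInPence = unitPriceInPence; }
}

// ... and the behaviour lives in a static method on a Utils class.
final class LineItemUtils {
    private LineItemUtils() {}

    static int totalInPence(LineItemData item) {
        return item.getQuantity() * item.getUnitPriceInPence();
    }
}

// Object style: the class that owns the fields also does the work.
final class LineItem {
    private final int quantity;
    private final int unitPriceInPence;

    LineItem(int quantity, int unitPriceInPence) {
        this.quantity = quantity;
        this.unitPriceInPence = unitPriceInPence;
    }

    int totalInPence() {
        return quantity * unitPriceInPence;
    }
}
```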

Are Utils classes a Good Thing?

Again, left as an exercise for the reader. An article from my old blog might help to hint about my views (although it is not specifically about this subject) without me having to stay up all night writing more. It seems like people aren't shy about expressing their opinions in comments - some of which I completely agree with, so I'll leave it as an exercise for comment writers as well as the reader (cheeky, I know).

In defence of my previous article

I didn't make any judgements in my previous article - mostly through lack of time but also because I want the term NOJO to be understood without being confused with whether it's any good or not. I am merely making an observation and suggesting a name for a thing I see frequently, that is poorly served by current names. I found it interesting how many people suggested that there was already a term for NOJO but none of their suggestions were entirely accurate.

Thursday, October 23, 2008

Is that a POJO or a NOJO?

The term "POJO" is widely used in the Java programming world, but is sometimes used to mean something more specific than what was originally intended.

Furthermore, such code (whether called POJO or not by its authors) is a style which I think would be useful to give a specific name to.

What's a POJO?

The original meaning of POJO was a plain old Java object - that is, an object that isn't tied to a framework (for example, having to implement specific interfaces). A concrete example of a non-POJO would be entity beans (from EJB).

Introducing NOJOs

NOJOs are (instances of) classes which define fields, setters and getters but no other methods. They are popular in enterprise Java code and I think deserve a specific term, to distinguish them from the rather general term POJO.

NOJO stands for "Non Object-oriented Java Object". Although it might sound "negative" it is merely intended to be accurate and have a catchy acronym.

What defines an object?

An "object" (in object-oriented programming) has identity, state and behaviour. A NOJO has identity and state. A function has behaviour but not state. An object has identity, state and behaviour.
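Here is a minimal sketch of that three-way distinction in Java. The class names and the interest calculation are hypothetical, chosen only to have something small to illustrate with.

```java
import java.util.function.IntUnaryOperator;

// Hypothetical NOJO: a field, a getter and a setter, and nothing else.
// Instances have identity and state, but no behaviour of their own.
class AccountData {
    private int balance;
    int getBalance() { return balance; }
    void setBalance(int balance) { this.balance = balance; }
}

// A function: behaviour but no state of its own.
final class Interest {
    private Interest() {}

    static final IntUnaryOperator ADD_TEN_PERCENT = balance -> balance + balance / 10;
}

// An object: identity, state and behaviour together.
final class Account {
    private int balance;

    Account(int openingBalance) { this.balance = openingBalance; }

    void addInterest() { balance += balance / 10; }

    int balance() { return balance; }
}
```

Two AccountData instances with the same balance are still different instances (identity), and anything that needs to be done with the balance has to be done by code that lives somewhere else.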

Why do you need a term for NOJO?

In many programming languages, there is a language construct for NOJOs (or something very similar) - e.g. "struct" in C, or "record" in Pascal. In Java there isn't an equivalent language construct - so the term NOJO is intended to mean the use of a Java class to implement such a thing, to make it easier to talk about such code.

Are NOJOs a Good Thing?

Left as an exercise for the reader.

Sunday, October 12, 2008

Blog moved here

Joe Walnes, the best developer I've ever worked with, got me started blogging (many thanks Joe) by hosting my blog on his server. Now the time has come for him to retire his server, and so my blog has moved here. I'll be writing more articles soon - I've got a backlog of programming in the small ideas to write up - but have been very busy with work, being (co) programme chair of my favourite conference (SPA), build-o-matic and just trying to keep up with the world.