Thursday, July 9, 2009

Cleaner code

I recommend the book "Clean Code" - I completely agree with its philosophy and have written a few articles of my own about "programming in the small".

Mike Hill and I presented a session on "programming in the small" at the "Software Craftsmanship" conference 2009 in London and at QCON London 2009 - and one of or examples was taken from "Clean Code".

Clean Code

One of the examples in "Clean Code" is some code for parsing command line arguments, written in Java by Bob Martin. It is shown as an example of code which has already been made clean. There is an article (separate from the book) by Bob Martin about the code we use for this example. The source code is available from github:

git clone git://github.com/unclebob/javaargs.git

Mike and I chose this code because we wanted to take code which was already quite clean and show how even good code can sometimes be cleaned up even more - we wanted to push the cleanliness as far as possible in the session.

Just looking at the "Args" class only, this method seemed the one most wanting some further work (sorry, formatting is ugly - reformatted to fit blog page width):

private void parseSchemaElement(String element)
throws ArgsException {
char elementId = element.charAt(0);
String elementTail = element.substring(1);
validateSchemaElementId(elementId);
if (elementTail.length() == 0)
marshalers.put(elementId,
new BooleanArgumentMarshaler());
else if (elementTail.equals("*"))
marshalers.put(elementId,
new StringArgumentMarshaler());
else if (elementTail.equals("#"))
marshalers.put(elementId,
new IntegerArgumentMarshaler());
else if (elementTail.equals("##"))
marshalers.put(elementId,
new DoubleArgumentMarshaler());
else if (elementTail.equals("[*]"))
marshalers.put(elementId,
new StringArrayArgumentMarshaler());
else
throw new ArgsException(INVALID_ARGUMENT_FORMAT,
elementId, elementTail);
}

So - what could be better? What struck Mike and I was the duplication of "marshalers.put(elementId, new XXX());"

To remove this duplication, first extract a variable called argumentMarshaler of type ArgumentMarshaler in each of the branches, then move the expression "marshalers.put(elementId, argumentMarshaler);" outside of the if statement, and the declaration of the variable "argumentMarshaler" before the if statement. You end up with:

private void parseSchemaElement(String element)
throws ArgsException {
char elementId = element.charAt(0);
String elementTail = element.substring(1);
validateSchemaElementId(elementId);
ArgumentMarshaler argumentMarshaler;
if (elementTail.length() == 0) {
argumentMarshaler =
new BooleanArgumentMarshaler();
} else if (elementTail.equals("*")) {
argumentMarshaler =
new StringArgumentMarshaler();
} else if (elementTail.equals("#")) {
argumentMarshaler =
new IntegerArgumentMarshaler();
} else if (elementTail.equals("##")) {
argumentMarshaler =
new DoubleArgumentMarshaler();
} else if (elementTail.equals("[*]")) {
argumentMarshaler =
new StringArrayArgumentMarshaler();
} else
throw new ArgsException(INVALID_ARGUMENT_FORMAT,
elementId, elementTail);
marshalers.put(elementId, argumentMarshaler);
}

Now it's clearer that this method is doing too many things - one of those things being to find the relevant marshaler. We can extract a method for finding the marshaler and, having been extracted can improve it further by removing unnecessary structured programmingness, ending up with:

private void parseSchemaElement(String element)
throws ArgsException {
char elementId = element.charAt(0);
String elementTail = element.substring(1);
validateSchemaElementId(elementId);
marshalers.put(elementId,
findAppropriateArgumentMarshaler(elementId,
elementTail));
}

private ArgumentMarshaler findAppropriateArgumentMarshaler(
char elementId, String elementTail)
throws ArgsException {
if (elementTail.length() == 0) {
return new BooleanArgumentMarshaler();
} else if (elementTail.equals("*")) {
return new StringArgumentMarshaler();
} else if (elementTail.equals("#")) {
return new IntegerArgumentMarshaler();
} else if (elementTail.equals("##")) {
return new DoubleArgumentMarshaler();
} else if (elementTail.equals("[*]")) {
return new StringArrayArgumentMarshaler();
} else
throw new ArgsException(INVALID_ARGUMENT_FORMAT,
elementId, elementTail);
}

Notice that "elementId" is only used for the exception to throw if an appropriate ArgumentMarshaler cannot be found. Further investigation shows that actually this parameter isn't used in ArgsException for an "INVALID_ARGUMENT_FORMAT" and we can just delete this argument (and hence the "elementId" parameter of "findAppropriateArgumentMarshaler") with no change to the behaviour of the code!

There is more that can be done with the Args class - but that is beyond the scope of this article.

Conclusion

I hope this has shown that even clean code can be made cleaner - without having to do anything too clever or drastic. It's often easier to see such opportunities in other people's code or code you haven't seen for a while - even the best programmers can miss opportunities for making code simpler.

Copyright © 2009 Ivan Moore

5 comments:

J. B. Rainsberger said...

I would clean something else up: replace all string.equals("literal") with "literal".equals(string), because the former can throw on null, but the latter can only return false.

Caleb Cornman said...

Nice job! I actually had the issue today that J. B. Rainsberger spoke of. It was the exact same situation with comparing a value that could be null to a string literal. Better to switch the comparison around so that you can't get a null exception.

Nilanjan Raychaudhuri said...

I would replace all the else if with if (like guard clause). Then if element trail could be enum then you would take out all the if conditions(http://www.antiifcampaign.com/) and make enum return corresponding ArgumentMarshaller

Unknown said...

I am not a Java developer, but as an old Smalltalker, the multi-if section of findAppropriateArgumentMarshaler looks a lot like something we'd use a Dictionary(HashTable?) for. It sounds similar to what nilanjan is proposing.

Dave Cleal said...

I just came across this string.equals("literal") versus "literal".equals(string) thing at work, so I was interested to see it here too. I think the general case of this is overstated.

It seems there are two cases. First, where the string might be null in correct program execution, in which case obviously "literal".equals(string) is the only non-buggy version. Second, where the string should never be null, in which case string.equals("literal") is better, because bugs are exposed sooner and closer to their source.