Saturday, January 24, 2009

Making fields and methods as private as possible - a postmodern Thatcher.

Fields and methods should be declared as private as possible. As I mentioned in an article on the subject "I thought of writing a tool to do this that I was going to call "Thatcher" - it would privatize as much as possible".

I only realized a few days ago that I've already written it (I really am quite dense), and in fact had written it several years before writing that article.

How do you tell if fields and methods can be made private?

I think the most popular ways for people to tell are:
  • look for all references to the field or method and see if any are outside the class it is declared in
  • change "public" to "private" and see if it fails to compile (e.g. red Xs in eclipse)
Unfortunately, both of these approaches have limitations for most enterprise java applications (particularly web applications). These approaches don't work if the field or method in question is used reflectively, e.g. by a template (e.g. FreeMarker, JSP etc). That's the limitation of a static analysis implementation of "Thatcher".

If you are paranoid, how do you tell if fields and methods can be made private?

The only safe way to tell, for a field or method that you want to see if it can be private, is to make it private and then see if the application still works.

How to automate that?

If you have good test coverage, then "the application still works" is automated as "the build (including tests) passes" (e.g. the build that you use on your CI server).

Then you are left with the problem "how do I tell if "public" can be changed to "private" and the build (including tests) still passes?".

Now that sounds familiar to me - it's rather close to what Jester does.

Jester - a mutation testing tool

I wrote Jester as a mutation testing tool. As it says on the web site: "Jester makes some change to your code, runs your tests, and if the tests pass Jester displays a message saying what it changed." The idea is that if you can change your code and the tests still pass then the tests aren't covering the code (other possibilities are that it's a behaviour preserving change, or the code is redundant).

Jester as a postmodern Thatcher

Jester can be used to implement Thatcher by configuring it to mutate "public" to "private" and "protected" to "private" (package visibility is harder!). I think this could be described as a postmodern programming approach - (re)using some code to do something it wasn't designed for.

I hadn't thought of doing this until it occurred to me a few days ago (and now it seems so obvious) because I had imagined that Thatcher would use static analysis and hence be very different (that's my modernist thinking, it would have to use static analysis to be a fast and "proper" implementation and the fact that it wouldn't always be correct would be a fault of the imperfect world).

It might be that someone has proposed this before (in which case, sorry, I don't remember).

Limitations of Thatcher implemented using Jester

It should be noted that using Jester for implementing Thatcher only works if your build (including tests) has sufficient coverage that you are happy that if it passes then that means the application works.

Also note that using Jester for this could take a really long time. If your build takes an hour then it would take many machine days to privatize even a very small amount of code.

An example run of Thatcher implemented using Jester

Here's an example of using the latest version of Jester ("simple jester" also described here) as Thatcher. For this example, I'm using Checkstyle as the code base for which I want to make fields and methods as private as possible. I chose Checkstyle for this example because the build is very fast!

(To try this out for yourself, you will need Java installed with java, javac and ant on the path. These instructions are for Windows. Replace <wherever> as appropriate).
  • download checkstyle
  • unzip
  • cd to <wherever>checkstyle-src-5.0-beta01
  • in order to check that everything is OK so far, execute "ant run.tests"
should get:

BUILD SUCCESSFUL
Total time: 33 seconds

(or whatever time)
  • download Jester (the simple-jester variety)
  • unzip
  • cd to <wherever>simple-jester-1.1
  • for the step below to work, execute "setcp.bat"
  • in order to check that everything is OK so far, execute "test.bat"
should get:

17 mutations survived out of 19 changes. Score = 11
took 0 minutes

(or whatever time)

In order to check that the checkstyle build can be run from the jester directory (i.e. still in <wherever>simple-jester-1.1) (which makes it simpler for running Jester in this example)
  • ant -f <wherever>checkstyle-src-5.0-beta01\build.xml run.tests
should get:

BUILD SUCCESSFUL
Total time: 21 seconds

(or whatever time)

Now edit "mutations.cfg" (in <wherever>simple-jester-1.1) and replace contents with:

%public%private
%protected%private

That configures Jester to try mutating "public" to "private" and "protected" to "private". Unfortunately, Jester cannot currently be configured to not mutate literal numbers - so in order to avoid Jester making mutations to literal numbers you have to be a bit cunning.

Create a text file in folder "<wherever>simple-jester-1.1\\jester" called SimpleIntCodeMangler.java with contents:

package jester;

public class SimpleIntCodeMangler implements CodeMangler {
public SimpleIntCodeMangler(ClassSourceCodeChanger sourceCodeSystem) {
}

public boolean makeChangeToClass() throws SourceChangeException {
return false;
}
}

Then (in <wherever>simple-jester-1.1) execute "javac jester\SimpleIntCodeMangler.java".

If "." is on the classpath before simple-jester.jar then the new version of SimpleIntCodeMangler will replace the version in simple-jester.jar, so Jester won't do the literal number mutations.

Now to run Jester/Thatcher (just on the one class "Checker"), execute:

java jester.TestTester "ant.bat -f <wherever>checkstyle-src-5.0-beta01\build.xml run.tests" <wherever>checkstyle-src-5.0-beta01\src\checkstyle\com\puppycrawl\tools\checkstyle\Checker.java

(note that on windows it has to be "ant.bat" and not just "ant")

Now Jester will run for a while, and eventually you get:

4 mutations survived out of 24 changes. Score = 84
took 5 minutes

If you want a simple way to visualize the results, then run "python makeWebView.py" and have a look at "jester.html".

To get the code changed to make methods and fields private which can be made private, then run "python makeAllChangesFiles.py jesterReport.xml" and you get a version of Checker.java called Checker.jester (in the same folder, i.e. "<wherever>checkstyle-src-5.0-beta01\src\checkstyle\com\puppycrawl\tools\checkstyle\") which has all the privatizations where the tests still pass.

The results for Checker.java show that addFileSetCheck, setModuleFactory, setSeverity and setClassloader can be made private and the "ant run.tests" build still passes. Having had a look at the code - it might be that these methods need to be public for something not covered by the tests but are needed for the application to work (because the only tests are unit tests and not end-to-end tests; this is a limitation mentioned earlier).

And finally

If you like that (or even if you don't) then please donate to my 5 countries (300 miles) in 3 days bike ride to raise money for the National Autistic Society.

Copyright © 2009 Ivan Moore

1 comment:

StuartE said...

I think you are dense too.