Saturday, February 1, 2014

Making life easier for future software archaeologists

Yesterday I went to "The First International Conference on Software Archaeology" run by Robert Chatley and Duncan McGregor - it was excellent. There were "lightning talks" run by Tim Mackinnon - here is a blog version of my talk.


If you have worked on a piece of software that is running in production, but hasn't been changed for a while, you may have had to do some software archaeology to work out how to make changes to it.

In this article, I list some problems that I've encountered when doing software archaeology, and some suggestions for making life easier for future software archaeologists.

My suggestions are not always applicable - but please consider them carefully. It is valuable to your client to make it easier for future software archaeologists to work with your systems. If your systems are any good they will probably be used for much longer than you think.

Where's the source?

Sometimes source code is lost (e.g. when a VCS migration leaves some repositories unconverted because nobody thinks they are needed any more). For Java projects there is a simple way to avoid losing the source code - include the source in your binary jars.
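As a rough illustration, a build could check that a jar really does carry its own source. The sketch below is only one way to do it: the function name classes_missing_source is made up for this example, and it assumes each .java file sits next to its .class file inside the jar, which is just one possible layout.

```python
import zipfile
from pathlib import PurePosixPath


def classes_missing_source(jar):
    """Return .class entries in a jar that have no matching .java entry.

    `jar` is a path or a file-like object; a jar is an ordinary zip archive.
    Assumption (hypothetical layout): sources are stored beside their classes,
    e.g. com/example/Foo.java next to com/example/Foo.class.
    """
    with zipfile.ZipFile(jar) as archive:
        names = set(archive.namelist())

    missing = []
    for name in sorted(names):
        path = PurePosixPath(name)
        if path.suffix != ".class":
            continue
        # Nested and anonymous classes (Foo$Bar.class) are compiled from Foo.java.
        if "$" in path.name:
            continue
        # Limitation: extra top-level classes declared in another class's
        # .java file will be reported as missing.
        if str(path.with_suffix(".java")) not in names:
            missing.append(name)
    return missing
```

A check like this could fail the build when the returned list is not empty, so a jar without source never reaches production unnoticed.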

Where is the documentation?

Although it is possible that the source code will be lost, more commonly, source code repositories do survive. However, documentation systems (for example wikis) are likely to be decommissioned sooner.

Even if a documentation system isn't decommissioned, the information related to old projects can get deleted, become out of date or inconsistent with the version actually running.

To keep documentation consistent with the system, please commit it to the same VCS repository as the code. Depending on the VCS used, you might be able to serve documentation to users directly from your source control system.

Where are the dependencies?

Sometimes artifact repositories are decommissioned. For Java projects, instead of using an artifact repository and ivy/maven/gradle etc, commit your dependencies into a "lib" folder and refer to them there. I know this is a very controversial approach - it goes against current trends, but it is actually very practical. It is likely that the source code repository will outlive the artifact repository.
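One way to "refer to them there" is to build the classpath from whatever jars are committed. This is a minimal sketch, not a recommendation of any particular build tool: the function name classpath_from_lib and the default folder name are assumptions for illustration.

```python
import os
from pathlib import Path


def classpath_from_lib(lib_dir="lib"):
    """Build a Java classpath string from every jar committed under lib_dir.

    Jars are sorted so the result is the same on every machine; os.pathsep
    is ':' on Unix-like systems and ';' on Windows, matching what java expects.
    """
    jars = sorted(str(p) for p in Path(lib_dir).glob("*.jar"))
    return os.pathsep.join(jars)
```

The resulting string can be passed to javac or java with the -cp option, so the build depends only on what is in the repository.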

How do I build the software?

Sometimes build tools go out of fashion and it is difficult to set up a working build for archaeological code. Therefore, at the very least, include instructions about how to build the code in the VCS repository. Even better for the future archaeologist, commit the build tools and any setup scripts.

How do I work on the software?

In addition to being able to build the software, there may be development tools needed to work on it. For example, the software may be partially generated by some (usually hideous) tool, in which case some of the source code isn't really what the developer works on (e.g. GUI builder generated code).

Include (at the very least) instructions for how to set up a suitable development environment. Even better, commit the development tools and any setup scripts.

How do I run the code in the production environment?

For a large system, it can be difficult to work out how the production servers are meant to be set up. Therefore, include instructions, or even better, scripts (e.g. for Puppet or Chef), for setting up any servers etc.

How did it get to be like it is?

When looking at an old system, it can be useful to see the history of the decisions that made it the way it is. A changelog checked into the source code repository helps with this. During my lightning talk, Nat Pryce said that for a home project, he committed the complete bug tracker system; that could be very useful for a future archaeologist.

In conclusion

Fashions change (e.g. tools become obsolete), reorganizations happen and systems get migrated (and sometimes things get lost in the process). If you want to do the best for your client, remember that successful software can last a really long time, so you should leave better clues for future software archaeologists.

Copyright © 2014 Ivan Moore


Nat Pryce said...

The issue tracker I wrote that stores its state in the version control repository is Deft:
