I have been working on a >100k lines legacy project for a while now. We have to juggle customer requests, bug fixes and refactoring so it is hard to improve the quality and employ new techniques or tools while keeping the software running and the clients happy. Initially there were no unit tests and most of the code had a gigantic cyclomatic complexity. Over the course of time we managed to put the system under continuous integration, employed quite some unit tests and analyzed code “hotspots” and our progress with crap4j.
Normally we get bug reports from our userbase or have to test manually to find bugs. A few weeks ago I tried a new approach to bughunting in legacy projects using FindBugs. Many of you surely know this useful tool, so I just want to describe my experiences in that project using FindBugs. Many of the bugs may be in parts of the application which are seldom used or only appear in hard to reproduce circumstances. First a short list of what I encountered and how I dealt with it.
Interesting found bugs in the project
- There was a calculation using an integer division but returning a double. So the actual computation result was wrong but yet the error would have been hard to catch because people rarely recalculate results of a computer. When writing the test associated to the bugfix I found a StackOverFlowError too!
- There were quite some null dereferences found, often in contructs like
if (s == null && s.length() == 0)
if (s == null || s.length() == 0)
which could be simplified or rewritten anyway. Sometimes there were possibilities for null dereferences on some paths despite of several null checks in the code.
- Many performance bugs which may or may not have an effect on overall performance of the system like: new String(), new Integer(12), string concatenation across loops, inefficient usage of java.util.Map.keySet() instead of java.util.Map.entrySet() etc.
- Some dead stores of local variables and statements without effect which could be thrown away or be corrected to do the intended things.
Things you may want to ignore
There are of course some bugs that you may ignore for now because you know that it is a common pattern in the team and abuse and thus errors are extremely unlikely. I, for example, opted to ignore some dozens of “may expose internal representation” found bugs regarding arrays in interfaces or accessibly via getters because it is a common pattern on the team not to tamper existing arrays as they are seen as immutable by the team members. It would have taken too much time to fix all those without that much of a benefit.
You may opt to ignore the performance bugs too but they are usually easy to fix.
- If you have many foundbugs, fix the easy ones to be able to see the important ones more easily.
- Ignore certain bug categories for now, fix them later, when you stumble upon them.
- Concentrate on the ones that lead to wrong behaviour and crashes of your application.
- Try to reproduce the problem with unit test and then fix the code whenever feasible! Tests are great to expose the bug and fix it without unwanted regressions!
- Many bugs appear in places which need refactoring anyway so here is your chance to catch several flies at once.
With FindBugs you can find common programming errors sprinkled across the whole application in places where you probably would not have looked for years. It can help you to understand some common patterns of your team members and help you all to improve your code quality. Sometimes it even finds some hard to spot errors like the integer computation or null dereferences on certain paths. This is even more true in entangled legacy projects without proper test coverage.