You may have heard of crap4j when it was still actively developed. Crap4j is a software metric that points you to “crappy” methods in your projects by combining cyclomatic complexity numbers with test coverage data. The rationale is that overly complex code can only be tamed by rigorous testing or it will quickly reduce to an unmaintainable mess – the feared “rotten code” or “crappy code”, as Alberto Savoia and Bob Evans, the creators of crap4j would put it. The crap4j metric soon became our most important number for every project. It’s highly significant, yet easy to grasp and mandates a healthy coding style.
Some enhancements to crap4j
Crap4j got even better when we developed our own custom enhancements to it, like the CrapMap or the crap4j hudson plugin. We have a tool that formats the crap4j data like cobertura’s report, too.
A minor imperfection
The only thing that always bugged me when using crap4j inside our continuous integration build cycle was that at least half the data was already gathered. Cobertura calculates the code coverage of our tests right before crap4j does the same again. Wouldn’t it be great if the result of the first analysis could be re-used for the crap metric to save effort and time?
Different types of coverage
Soon, I learnt that crap4j uses the “path coverage” to combine it with the complexity of a method. This is perfectly reasonable given that the complexity determines the number of different pathes through the method. Cobertura only determines the “line coverage” and “branch coverage”. As it stands, you can’t use the cobertura data for crap4j because they represent different approaches to measure coverage. That’s still true and probably will be for a long time. But the allurement of the shortcut approach was too high for me to resist. I just tried it out one day to see the real difference.
A different metric
So, here it is, our new metric, heavily inspired by crap4j. I just took the line and branch coverage for every method and multiplied them. If you happen to have a perfect coverage (1.0 on both numbers), it stays perfect. If you only have 75% coverage on both numbers, it will result in a “crapertura coverage” of 56,25%. Then I fed this new coverage data into crap4j and compared the result with the original data. Well, it works on my project.
Encouraged by this result, I wrote a complete ant task that acts similar to the original crap4j ant task. You can nearly use it as a drop-in replacement, given that the cobertura XML report file is already present. Here is an example ant call:
<crapertura coberturaReportFile="/path/to/cobertura/coverage.xml" targetDirectory="/where/to/place/the/crap4j/report" classesDirectory="/your/unarchived/project/class/files" />
It will output the usual crap4j report files to the given target directory. Please note that even if it looks like crap4j data, it’s a different metric and should be treated as such. Therefore, online comparison of numbers is disabled.
The whole project is published on github. Feel free to browse the code and compile it for yourself. If you want a binary release, you might grab the latest jar from our download server.
The complete usage guide can be found on the github page or inside the project. If you have questions or issues, please use the comment section here.
If crapertura is able to give you nearly the numbers that crap4j gave you is up to your project, really. Our test project contained over 20k methods, but very little crap. The difference between crap4j and crapertura was negligible. Both metrics basically identified the same methods as being crappy. Your mileage may vary, though. If that’s the case, let us know. If your experience is like ours, you’ve just saved some time in your build cycle without sacrificing quality.