A mindset for inherited source code

This article outlines a mindset for developers to deal with existing, probably inherited code bases. You’ll have to be an archeologist, a forensicist and a minefield clearer all at once.

One field of expertise our company provides is the continuation of existing software projects. While this sounds very easy to accomplish, in reality, there are a few prerequisites that a software project has to provide to be continuable. The most important one is the source code of the system, obviously. If the source code is accessible (this is a problem more often than you might think!), the biggest hurdle is now the mindset and initial approach of the developers that inherit it.

The mindset

Most developers have a healthy “greenfield” project mindset. There is a list of requirements, so start coding and fulfill them. If the code obstructs the way to your goal, you reshape it in a meaningful manner. The more experience you have with developing software, the better the resulting design and architecture of the code will be. Whether you apply automatic tests to your software (and when) is entirely your decision. In short: You are the master of the code and forge it after your vision. This is a great mindset for projects in the early phases of development. But it will actively hinder you in later phases of your project or in case you inherit foreign code.

For your own late-phase projects and source code written by another team, another mindset provides more value. The “brownfield” metaphor doesn’t describe the mindset exactly. I have three metaphors that describe parts of it for me: You’ll need to be an archeologist, a forensicist (as in “securer of criminal evidence”) and a minefield clearer. If you hear the word archeologist, don’t think of Indiana Jones, but of somebody sitting in the scorching desert, clearing a whole football field of sand with only a shaving brush and his breath. If you think about being a forensicist, don’t think of your typical hero criminalist who rearranges the photos of the crime scene to reveal a hidden hint, but of the guy in white overalls who has to take all the photos without disturbing the surroundings (and without being disturbed by them). If you think about the minefield clearer: Yes, you are spot on. He has to rely on his work and shouldn’t move too fast in any direction.

The initial approach

This sets the scene for your initial journey inside foreign source code: Don’t touch anything or at least be extra careful, only dust it off in the slightest possible manner. Watch where you step and don’t get lost. Take a snapshot, mental or written, of anything suspicious you encounter. There will be plenty of temptation to lose focus and instantly improve the code. Don’t fall for it. Remember the forensicist: what would the detective in charge of this case say if you “improved the scenery a bit” to get better photos? This process reminds me so much of a common approach to the game “Minesweeper” that I included the minefield clearer in the analogy. You start somewhere on the field and mark every mine you indirectly identify without ever really revealing them.

Most likely, you won’t find any tests or an issue tracker where you can learn about the development history. With some luck, you’ll have a commit history with meaningful comments. Use the blame view as often as you can. These are your archeological skills at work: separating layers and layers of code all mingled in one place. A good SCM system can clear up a total mess for you and reveal the author’s intent for it. Without tests, issues and versioning, you cannot distinguish between a problem and a solution, accidental and deliberate complexity, or a bug and a feature. Everything could mean something and even be crucial for the whole system, or just be useless excess code (so-called “live weight” because the code will be executed, but with no effect in terms of features). To name an example, if you encounter a strange sleep() call (or multiple calls in a row), don’t eliminate or change them! The original author probably “fixed” a nasty bug with it that will come back sooner than you think.

Walking on broken glass

And this is what you should do: Leave everything in its place, broken, awkward and clumsy, and try to separate your code from “their” code as much as possible. The rationale is to be able to differentiate between “their” mess and “your” mess and make progress on your part without breaking the already existing features. If you cannot wait any longer to clean up some of the existing code, make sure to release into production often and in a timely manner, so you still know what changed if something goes wrong. If possible, try to release two different kinds of new versions:

  • One kind of new version only incorporates refactorings to the existing code. If anything goes wrong or seems suspicious, you can easily bail out and revert to the previous version without losing functionality.
  • The other kind only contains new features, added with as little change to existing code as possible. Hopefully, this release will not break existing behaviour. If it does, you should double-check your assumptions about the code. If reasonably achievable, do not assume anything or at least write an automatic test to validate your assumption.

Personally, I call this approach the “tick-tock” release cycle, modelled after the release cycle of Intel for its CPUs.
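The advice above to back an assumption with an automatic test can be sketched as a so-called characterization test: it records what the inherited code does today, not what it “should” do. The following is only a minimal sketch in plain Java. LegacyFormatter and its formatting rule are hypothetical stand-ins for a piece of inherited code and are not taken from any real project.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for inherited code. It only gives the test
// something to observe.
final class LegacyFormatter {
    static String formatCustomerId(int id) {
        // Inherited behaviour we do not want to change by accident:
        // prefix "C-" and zero-padding to at least six digits.
        return "C-" + String.format(Locale.ROOT, "%06d", id);
    }
}

// Records the behaviour observed today and reports every deviation.
final class CharacterizationSketch {
    // Returns the number of inputs whose output no longer matches the record.
    static int countMismatches() {
        Map<Integer, String> recorded = new LinkedHashMap<>();
        recorded.put(7, "C-000007");
        recorded.put(123456, "C-123456");
        recorded.put(1234567, "C-1234567"); // longer ids are not truncated

        int mismatches = 0;
        for (Map.Entry<Integer, String> entry : recorded.entrySet()) {
            String actual = LegacyFormatter.formatCustomerId(entry.getKey());
            if (!actual.equals(entry.getValue())) {
                mismatches++;
                System.out.println("Behaviour changed for " + entry.getKey() + ": " + actual);
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatches: " + countMismatches());
    }
}
```

Run before and after a refactoring-only release, such a test tells you whether the behaviour you assumed to be stable actually stayed the same.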

Changing gears

A very important aspect of software development is to know when to “change gears” and switch from greenfield to brownfield or from development to maintenance mode. The text above describes the approach with inherited code, where the gear change is externally triggered by transferring the source code to a new team. But in reality, you need to apply most of the practices on your own source code, too. As soon as your system is in “production”, used in the wild and being filled with precious user data, it changes into maintenance mode. You cannot change existing aspects as easily as before.
In his book “Implementation Patterns” (2008), Kent Beck describes the development of frameworks among other topics. One statement is:

While in conventional development reducing complexity to a minimum is a valuable strategy for making the code easy to understand, in framework development it is often more cost-effective to add complexity in order to enhance the framework developer’s ability to improve the framework without breaking client code.
(Chapter 10, page 118)

I not only agree with this statement but think that it partly applies to “conventional development” in maintenance mode, too. Sometimes, the code needs additional complexity to cope with existing structures and data. This is the moment when you’ve inherited your own code.

Class names with verbs enforce the Single Responsibility Principle (SRP)

When using fluent code and fluent interfaces, I noticed an increased flexibility in the code. On closer inspection, this is the effect of a well-known principle that is inherently enforced by the coding style.

I’ve been experimenting with fluent code for a while now. Fluent code is code that everybody can read out loud and understand immediately. I’ve blogged on this topic already and it’s not big news, but I’ve just recently had a revelation about why this particular style of programming works so well in terms of code design.

The basics

I don’t expect you to read all my old blog entries on fluent code or to know anything about fluent interfaces, so I’m giving you a little introduction.

Let’s assume that you want to find all invoice documents inside a given directory tree. A fluent line of code reads like this:


Iterable<Invoice> invoices = FindLetters.ofType(
    AllInvoices.ofYear("2012")).beneath(
        Directory.at("/data/documents"));

While this is very readable, it’s also a bit unusual for a programmer without prior exposure to this style. But if you are used to it, the style works wonders. Let’s see: the implementation of the FindLetters class looks like this (don’t mind all the generic stuff going on, concentrate on the methods!):

public final class FindLetters<L extends Letter> {
  private final LetterType<L> parser;

  private FindLetters(LetterType<L> type) {
    this.parser = type;
  }

  public static <L extends Letter> FindLetters<L> ofType(LetterType<L> type) {
    return new FindLetters<L>(type);
  }

  public Iterable<L> beneath(Directory directory) {
    ...
  }
}

Note: If you are familiar with fluent interfaces, then you will immediately notice that this isn’t even a full-fledged one. It’s more of a (class-level) factory method and a single instance method.

If you can get used to typing in what you want to do as the class name first (and forget about constructors for a while), the code completion functionality of your IDE will guide you through the rest: The only public static method available in the FindLetters class is ofType(), which happens to return an instance of FindLetters, where again the only method available is the beneath() method. One thing leads to another and you’ll end up with exactly the Iterable of Invoices you wanted to find.

To assemble all parts in the example, you’ll need to know that Invoice is a subtype of Letter and AllInvoices is a subtype of LetterType<Invoice>.
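To make the type relationships above concrete, here is a minimal, self-contained sketch of how the parts could fit together. Everything beyond the names already shown is an assumption made for illustration: the tryParse() method on LetterType, the file naming scheme, and the in-memory Directory with its containing() factory (used instead of Directory.at() so the sketch runs without a file system). The body of beneath() is likewise just one possible implementation, not the original one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

interface Letter { String fileName(); }

final class Invoice implements Letter {
    private final String fileName;
    Invoice(String fileName) { this.fileName = fileName; }
    public String fileName() { return fileName; }
}

// Assumed contract: a letter type decides whether a file is "its" kind of letter.
interface LetterType<L extends Letter> {
    Optional<L> tryParse(String fileName);
}

final class AllInvoices implements LetterType<Invoice> {
    private final String year;
    private AllInvoices(String year) { this.year = year; }
    static AllInvoices ofYear(String year) { return new AllInvoices(year); }
    public Optional<Invoice> tryParse(String fileName) {
        // Assumed naming scheme for this sketch: "invoice-<year>-<number>.pdf"
        return fileName.startsWith("invoice-" + year + "-")
                ? Optional.of(new Invoice(fileName))
                : Optional.empty();
    }
}

// Hypothetical in-memory directory, so no real file system is needed.
final class Directory {
    private final List<String> fileNames;
    private Directory(List<String> fileNames) { this.fileNames = fileNames; }
    static Directory containing(String... fileNames) {
        return new Directory(Arrays.asList(fileNames));
    }
    List<String> fileNames() { return fileNames; }
}

final class FindLetters<L extends Letter> {
    private final LetterType<L> parser;
    private FindLetters(LetterType<L> type) { this.parser = type; }
    static <L extends Letter> FindLetters<L> ofType(LetterType<L> type) {
        return new FindLetters<L>(type);
    }
    Iterable<L> beneath(Directory directory) {
        // Keep every file that the letter type recognizes.
        List<L> found = new ArrayList<>();
        for (String name : directory.fileNames()) {
            parser.tryParse(name).ifPresent(found::add);
        }
        return found;
    }
}

final class FluentSketch {
    static List<String> namesOfInvoices2012() {
        Iterable<Invoice> invoices = FindLetters.ofType(
            AllInvoices.ofYear("2012")).beneath(
                Directory.containing(
                    "invoice-2012-001.pdf", "offer-2012-004.pdf", "invoice-2011-017.pdf"));
        List<String> names = new ArrayList<>();
        for (Invoice invoice : invoices) {
            names.add(invoice.fileName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(namesOfInvoices2012());
    }
}
```

Note how the compiler infers Invoice as the element type purely from AllInvoices being a LetterType<Invoice>. The fluent line needs no explicit type arguments.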

The magical part

One thing that always surprised me when programming in this style is how everything seems to find its place in a natural manner. The different parts fit together really well, especially when the fluent line of code is written first. Of course they do, because you’ll design your classes to make everything fit. And that’s when I had the revelation. In hindsight, it seems rather obvious to me (a common occurrence with revelations) and you’ve probably already seen it yourself.

The revelation

It struck me that all the pieces that you assemble a fluent line of code with are small and single-purposed (other descriptions would be “focussed”, “opinionated” or “determined”). Well, if you obey the Single Responsibility Principle (SRP), every class should only have one responsibility and therefore only limited purposes. But now I know how these two things are related: You can only cram so much purpose (and responsibility) into a class named FindLetters. When the class name contains the action (verb) and the subject (noun), the purpose is very much set. The only thing that can be adjusted is the context of the action on the subject, a task at which fluent interfaces excel. The main reason to use a fluent interface is to change distinct aspects of the context of an object without losing track of the object itself.

The conclusion

If the action+subject class names enforce the Single Responsibility Principle, then it’s no wonder that the resulting code is very flexible in terms of changing requirements. The flexibility isn’t a result of the fluency or the style itself (as I initially thought), but an effect predicted and caused by the SRP. Realizing that doesn’t invalidate the other positive effects of fluent code for me, but makes it a bit less magical. Which isn’t a bad thing.

Triggering Jenkins from git with a common post-receive hook

The standard way of triggering Jenkins jobs from a git repository used to be issuing a GET request on the “build now” URL of the job in the post-receive hook, e.g.

curl http://my_ci_server:8080/job/my_job/build?delay=0sec

The biggest problem of this approach is that you have to hardcode the job name into the URL. This prevents sharing the hook between repositories and requires you to put an adjusted post-receive hook script into each new repository. Also, additional work has to be done to trigger jobs only for certain branches and the like.
Fortunately, Jenkins has offered a new way of triggering jobs from a git repository for quite a while now. Essentially, you have to notify Jenkins of the commit in your repository and configure the job for polling.

To trigger jobs for the repository git@my_repository_server:my_project.git you can use the following script:

GIT_REPO_URL=git@my_repository_server:`pwd | sed 's:.*\/::'`
curl "http://my_ci_server:8080/git/notifyCommit?url=$GIT_REPO_URL"

Notice the absence of anything specific to a repository or job in the post-receive hook. Such a hook can be placed in a central location and be shared between repositories using symbolic links.

RubyMotion: Ruby for iOS development

RubyMotion is a new (commercial) way to develop apps for iOS, this time with Ruby. So why do I think this is better than the traditional way using Objective-C or other alternatives?

Advantages over other alternatives

Other alternatives often use a wrapper or a different runtime. The problem is that you have to wait for the library/wrapper vendor to include new APIs when iOS gets a new update. RubyMotion instead has a static compiler which compiles to the same code as Objective-C. So you can use the myriad Objective-C libraries or even Interface Builder. You can even mix your RubyMotion code with existing Objective-C programs. The static compilation also gives you the performance advantages of real native code, so you don’t suffer the penalties of using another layer. But if you can write your programs as you would in Objective-C, with the same performance and the same libraries, why choose RubyMotion?

Advantages over the traditional way

First: Ruby. The Ruby language has a very nice foundation: everything is an expression. And everything can be evaluated with logic operators (only nil and false are treated as false).
In Objective-C you would write:

  cell = [tableView dequeueReusableCellWithIdentifier:reuseId];
  if (!cell) {
    cell = [[TableViewCell alloc] initWithStyle:cellStyle reuseIdentifier:reuseId];
  }

whereas in Ruby you can write

cell = tableView.dequeueReusableCellWithIdentifier(@reuse_id) ||
  TableViewCell.alloc.initWithStyle(@cell_style, reuseIdentifier:@reuse_id)

As you can see, you can use the Cocoa APIs right away. But what excites me even more is the community that is forming around RubyMotion. RubyMotion is only a few months old, but many libraries and even award-winning apps have already been written. Some libraries wrap so-called boilerplate code and make it more pleasant to use. Others introduce new metaphors which change the way apps are written entirely.
I see a bright future for RubyMotion. It won’t replace Objective-C for everyone, but it is a great alternative.

A minimal set of skills for software development contractors

You aren’t sure if your developer is professional enough? Here are seven topics you can ask him about to find out. It’s the minimal skill set a modern developer should use.

“Our company is specialized in providing professional software development for our customers”. That’s a nice statement to inspire your customers with. The only problem with it is: every contractor claims to be professional. You wouldn’t even get a project if you admitted to being “unprofessional”. But how can a customer, mostly unaware of the subtleties in the field of software development, decide if his contractor really works professionally? A lot of money currently spent on projects doomed from the beginning could be saved if the answer was that easy. But there’s a lower limit of skills that have to be present to pass the most minimal litmus test of developer professionalism. This blog article gives you an overview of the things you should ask of your next software development contractor.

First a disclaimer: I’ve compiled this list of skills with the best intentions. It is definitely possible to develop software without some or even any of these skills. The development can even be performed in a very professional manner. So the absence of a skill doesn’t reveal an unprofessional contractor without fail. And on the other side, the clear presence of all skills doesn’t lead to glorious projects. The list is a rule of thumb to distinguish the “better” contractor from the “worse”. It’s a starting ground for the inexperienced customer to ask the right questions and get hopefully insightful answers.

Let’s assume you are a customer on the lookout for a suitable software development contractor, maybe a freelancer or a company. You might take this list and just ask your potential developer about every item on it. Listen to their answers and let them show you their implementation of the skill. In my opinion, the last point is the most crucial one: Don’t just talk about it, let them demonstrate their abilities. You won’t be able to differentiate the best from the most trivial implementation at first, but that’s part of the learning process. The thing is: if the developer can readily demonstrate something, chances are he really knows what he is talking about.

The minimal skills

The skills in this list are sorted by their direct impact on the overall development quality. This includes the quality perceived by you (the customer), the end user and the next developer who inherits the source code once the original developer bails out. This doesn’t mean that the topics mentioned later are “optional” in the long run.

Source code management system

This tool has many different names: source code management (SCM), revision control system (RCS) and version control system (VCS) are just a few of them. It is used to track the changes in the code over time. With this tool, the developer is able to tell you exactly which change happened when, for what version and by whom. It is even possible to undo the change later on. If your developer mentions specific tool names like Git, Subversion, Perforce or Mercurial, you are mostly settled here. Let him show you a typical sync-edit-commit cycle and try to comprehend what he’s telling you. Most developers love to brag about their sophisticated use of version control abilities.

Issue tracking

An issue or bug tracker is a tool that stores all inquiries, bug reports, wishes and complaints you make. You can compare it to a helpdesk “trouble ticket” system. The issue tracker provides a todo list for the developer and acts as an impartial documentation of your communication with the developer. If you can’t get direct access to the issue tracker on their website, let them demonstrate the usage by playing through a typical scenario like a bug report. At least, the developer should provide you with a list of “resolved” issues for each new version of your software.

Continuous integration

This is a relatively new type of tool, but a very powerful one. It can also be named a “build server” or (less powerful) a “nightly build”. The baseline is that your project will be built by an automated process, as often as possible. In the case of continuous integration, the build happens after each commit to the source code management system (refer to the first entry of this list). Let your developer show you what happens automatically after a commit to the source code management system. Ask him about the “build time” of your project (or other projects). This is the time needed to produce a new version you can try out. If the build time is reasonably low (like a few minutes), ask for a small change to your project and wait for the resulting software.

There is a fair chance that your developer not only talks about “continuous integration”, but also “continuous delivery”. This includes words like “staging”, “build queue”, “test installation”, etc. Great! Let them explain and demonstrate their implementation of “continuous delivery”. You’ll probably be impressed and the developer gets another chance to brag.

Verification (a.k.a. Testing)

This is a delicate question: “Will the source code contain automated tests?”. Our industry’s expected value for any kind of automated tests in a project is still dangerously near absolute zero. If you get blank stares on that question, that’s not a good sign. It doesn’t really matter too much if the answer contains the words “unit test”, “integration test” or even “acceptance test”. Most important again: Let your developer show you their implementation of automated tests in your (or a similar) project. Make sure the continuous integration server (refer to entry number three) is aware of the tests and runs them on every build. This way, everything that’s secured by tests cannot break without being noticed immediately. You probably won’t have to deal with reappearing bugs in every other version, a symptom known as “regression”.

Your developer might be really enthusiastic about testing. While every developer hour costs your precious money, this is money well spent. Think of it as an insurance against unpredictable behaviour of your software in the future. Over the course of development, you won’t notice these tests directly, as they are used internally for development. Talk to your developer about some form of reporting on the tests. Perhaps a “test coverage” report that accompanies the issue list (refer to the second entry)? Just don’t go overboard here. A low test coverage percentage is still better than no tests.

If your developer states that he is “test driven”, that’s not a psychological condition, but a modern attempt to test really thoroughly. Let him demonstrate the advantages of this approach by playing through an implementation cycle of a small change to your project. It may foster your confidence in the insurance’s power.

Project documentation

Every software project above the trivial level contains so many details that no human brain is able to remember them all after some time. Your developer needs some place to store vital information about the project other than “in the code” and “in the issue tracker”. A popular choice to implement this requirement is providing a Wiki. You probably already know a Wiki from Wikipedia. Think about a web-based text editing tool with structuring possibilities. If you can’t access the documentation tool yourself, let your developer demonstrate it. Ask about an excerpt of your project documentation, perhaps as a PDF or HTML document. Don’t be too picky about the aesthetics, the main use case is quick and easy information retrieval. Even handwritten project documentation may pass your test, as long as it is stored in one central place.

Source code conventions

Nearly all source code is readable by a machine. But some source code is totally illegible to fellow developers or even the original author. Ask your developer about their code formatting rules. Hopefully, he can provide you with some written rules that are really applied to the code. For most programming languages, there are tools that can check the formatting against certain rules. These programs are called “code inspection tools” and fit hand in glove with the continuous integration server (refer to the third entry). Some aspects of source code readability cannot be checked by algorithms, like naming or clarity of concepts. Good developers perform regular code reviews where fellow developers discuss the code critically and suggest improvements. The best customers explicitly ask for code reviews, even if they won’t participate in them. You will feel the difference in the produced software in the long run.

Community awareness

Software development is a rapidly advancing profession, with game-changing discoveries every other year. One single developer cannot track all the new tools, concepts and possibilities in his field. He has to rely on a community of like-minded and well-meaning experts that share their knowledge. Ask your developer about his community. What (technical) books did he read recently? What books are known by the whole development team? As a customer, you probably can’t tell right away if the books are worth their paper, but that’s not the main point of the question. Just like with tests, the number of books read by the average programmer won’t make a very long list. If your development team is consistent enough to share a common literature ground, that’s already worth a lot.

But it’s not just books. Even books are too slow for the advancement! Ask about participation in local technical events, like user groups of the programming language of your project. What about sharing? Does the developer share his experiences and insights? The cheapest way to do that is a weblog (you’re reading one right now). Let him show you his blog. How many articles are published in a reasonable timespan, what’s the feedback? Perhaps he writes articles for a technical magazine or even a book? Now you can ask other developers for their opinion on the published work. You’ve probably found a really professional developer, congratulations.

There is more, much more

This list is in no way exhaustive in regard to what a capable developer uses in concepts, skills and tools. This is meant as the minimal set, with a lot of room for improvement. There are compilations of skills like the Clean Code Developer that go way beyond this list. Ask your developer about his personal field of interest. Hopefully, after he finished bragging and techno-babbling for some time, you’re convinced that your developer is a professional one. You shouldn’t settle for less.

A tale of anti-virus software killing local connectivity

We are developing and running a distributed system which is deployed on-site at our client. Everything was running smoothly for years; only some minor hiccups related to network infrastructure problems occurred over time. Then one day our client told us the scheduled database backups were not working anymore. We immediately checked the database, all installed firewall programs and the like on that Windows 7 server machine. The PostgreSQL database was running and our local and remote application components were able to connect. Strangely though, neither pgAdmin nor psql nor even telnet was able to make a local connection to the database! Adding to the oddity, we had not changed or updated any part of the system at the time things stopped working. Remote access to the database was working, though, leaving us even more confused. To sum up the situation:

  • Some applications can connect to the database locally, others cannot
  • Remote access to the database works without problems for all applications, even those that cannot connect locally on the server
  • We did not change any of these applications, neither client side nor server side
  • All firewalls were disabled and the problem persisted over reboots

The explanation

So we talked to our client again and presented our complete analysis, pinning the date of the breakage to a moment when we evidently had not changed anything. Suddenly it struck him like lightning: he remembered that there had been an automatic update of an anti-virus program. He removed the software from the machine and everything worked again as expected. Even reinstalling the anti-virus program did not break the system again. It was only this misbehaving automatic update somewhere in time that killed some part of our system in a most odd way…

Grails and the query cache

The principle of least astonishment can be violated in unusual places, such as when using the query cache on a Grails domain class.

Look at the following code:

class Node {
  Node parent
  String name
  Tree tree
}

Tree tree = new Tree()
Node root = new Node(name: 'Root', tree: tree)
root.save()
new Node(name: 'Child', parent: root, tree: tree).save()

What happens when I query all nodes by tree?

List allNodesOfTree = Node.findAllByTree(tree, [cache: true])

Of course you get 2 nodes, but what is the result of:

allNodesOfTree.contains(Node.get(rootId))

It should be true, but it isn’t all the time. If you didn’t implement equals and hashCode, you get the default equals, which only compares object identity (two references are equal only if they point to the same instance).
Hibernate guarantees that you get the same instance out of a session for the same domain object, so two calls to Node.get(rootId) return the identical object.

But the query cache plays a crucial role here: it saves the ids of the result and calls Node.load(id). There is an important difference between Node.get and Node.load. Node.get always returns an instance of Node which is a real node, not a proxy. For this, it queries the session context and hits the database when necessary. Node.load, on the other hand, never hits the database. It returns a proxy, and only when the session already contains the domain object does it return a real domain object.

So allNodesOfTree returns

  • two proxies when no element is in the session
  • a proxy and a real object when you call Node.get(childId) beforehand
  • two real objects when you call get on both elements first

Deactivating the query cache, either globally or for this query only, returns two real objects.
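The effect can be reproduced without Hibernate. The following plain-Java sketch only simulates the situation described above: NodeRecord and NodeProxy are hypothetical stand-ins for a domain class without equals/hashCode and for a proxy representing the same row. Comparing by id, as shown in containsById(), is one possible workaround under these assumptions, not something Grails does for you.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a domain class WITHOUT equals/hashCode.
class NodeRecord {
    final long id;
    NodeRecord(long id) { this.id = id; }
}

// Stand-in for a lazy proxy: a different object representing the same row.
final class NodeProxy extends NodeRecord {
    NodeProxy(long id) { super(id); }
}

final class IdentitySketch {
    // Simulates a cached query result that holds a proxy for row 1.
    static List<NodeRecord> cachedResult() {
        List<NodeRecord> result = new ArrayList<>();
        result.add(new NodeProxy(1L));
        return result;
    }

    static boolean containsByEquals(List<NodeRecord> nodes, NodeRecord wanted) {
        // List.contains relies on equals(), which defaults to object identity.
        return nodes.contains(wanted);
    }

    static boolean containsById(List<NodeRecord> nodes, NodeRecord wanted) {
        // Compares the database id instead of the object identity.
        for (NodeRecord node : nodes) {
            if (node.id == wanted.id) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        NodeRecord real = new NodeRecord(1L);
        System.out.println(containsByEquals(cachedResult(), real)); // false
        System.out.println(containsById(cachedResult(), real));     // true
    }
}
```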

A small story about outsourcing

A true story about why producing more cost-effectively isn’t always cheaper. And a story about a process that wasn’t tailored around human requirements.

Let me tell you a story about human labor and automation, cost efficiency and the result of local optimization. The story itself is true, but nearly all details are changed to protect the innocent.

An opportunity

Once, there was a company that produced sensor equipment with a large portion of electronic circuitry. The whole device was manufactured at the company’s main factory and admired for its outstanding rigidity. Then, one day, the opportunity offered itself to outsource the assembly of the electronics to a country in the Asian region. The company boss immediately recognized the business value in this change: The same parts would be produced by the company, shipped to the Asian contract manufacturer, assembled and promptly returned. Then, the company’s engineers would provide the firmware and software for the final product. By outsourcing the most generic step in the production line, the production costs could be lowered significantly.

A detail

There is one little detail that needs to be told: The sensors relied on some very specific and fragile parts only the company itself could produce. These parts were especially sensitive to the atmosphere they were assembled in. A very important aspect of the production process was a special purpose machine that could assemble the parts while sustaining the necessary gas mixture and pressure. Upon closer inspection, one could say that the product’s secret ingredient wasn’t the parts themselves, but the specifically tailored production process.

The special purpose machine had to be transferred to the contract manufacturer in Asia, otherwise the sensors could not be assembled. This was a minor inconvenience compared to the large profits that could be realized once the outsourcing was completed.

A success

The machine was transported to the contractor, installed and tested. A special crew of workers of the contractor’s staff was trained to operate the machine properly and within the necessary conditions. After a while, the production line began its work. The first sensors assembled offshore returned home. They all worked as intended. The local engineers couldn’t tell the difference except by looking at the serial number. The company management was pleased; profitability had increased.

A failure

Everything went well for a while. Then, the local engineers noticed a slightly higher number of faulty sensors. Not long after, quality assurance reported decreasing performance numbers of the devices. The rigidity of the device, the unique selling point, slowly deteriorated. The company management was worried and established a task force to identify the root cause of this change for the worse.

A mystery

The task force inspected the reported problems and couldn’t make much sense of the numbers. It wasn’t a problem of whole faulty batches (indicating incidents like transport damage), but also not of individual faulty pieces. Instead, they found that if a piece was faulty, the next few pieces from that series were also faulty. Then, there were long intervals with perfectly good pieces until another group of clearly faulty pieces occurred. Something had to go wrong during the assembly process at the contractor.

A revelation

When the task force arrived at the contractor’s factory and inspected the special-purpose machine, they found that the atmosphere regulator was damaged. This automatic part of the machine takes care of the mixture and pressure of the gas in the machine during operation and keeps them in the necessary range by adding or draining specific gas. The contractor hadn’t bothered to replace the rather expensive part while cheap human labor was readily available. Instead, they had hired a low-paid worker to perform the atmosphere regulation manually: he had to watch the pressure numbers and provide more or less gas, just as needed. This was nearly as good as the automatic regulation and still good enough to produce quality devices.

An explanation

But the contractor only hired one worker per shift. This worker had to go to the toilet sometimes during the work day. While he was away from the machine, it ran unregulated and soon drifted so far out of adjustment that it produced only junk. Once the worker returned, he would balance the numbers and bring the machine back into the OK state. This situation occurred periodically, but not often enough to taint whole batches. The series of faulty devices were produced only during his absence.

A conclusion

I don’t want to add much of a moral to this story. Perhaps one thing should be considered when recapitulating: Both the company and the contractor “optimized” their costs locally by making cost-efficient decisions that turned out to be expensive in the long run. The company chose between expensive but controlled local production and cheap outsourcing of the assembly, arguably the most delicate step in the whole production process. The contractor chose between a high one-time investment in an automatic regulator and the low ongoing cost of cheap human labor. Both decisions are comprehensible on their own, but they led to a situation that would never have occurred in the original setting.

Testing C programs using GLib

Writing programs in good old C can be quite refreshing if you use a modern utility library like GLib. It offers a comprehensive set of tools you expect from a modern programming environment, such as collections, logging, plugin support, thread abstractions, string and date utilities, different parsers, i18n and a lot more. One essential part, especially for agile teams, is on board too: the unit test framework gtest.

Because of the statically compiled nature of C, testing involves a bit more work than in Java or modern scripting environments. Usually you have to perform these steps:

  1. Write a main program for running the tests. Here you initialize the framework, register the test functions and execute the tests. You may want to build different test programs for larger projects.
  2. Add the test executable to your build system, so that you can compile, link and run it automatically.
  3. Execute the gtester test runner to generate the test results and, optionally, an XML file to use in your continuous integration (CI) infrastructure. You may need to convert the XML output if you are using Jenkins, for example.

A basic test looks quite simple, see the code below:

#include <glib.h>
#include "computations.h"

void computationTest(void)
{
    g_assert_cmpint(1234, ==, compute(1, 1));
}

int main(int argc, char** argv)
{
    g_test_init(&argc, &argv, NULL);
    g_test_add_func("/package_name/unit", computationTest);
    return g_test_run();
}

To run the test and produce the XML output, you simply execute the test runner gtester like so:

gtester build_dir/computation_tests --keep-going -o=testresults.xml

Unfortunately, gtester produces a result file that is incompatible with Jenkins’ test result reporting. Fortunately, R. Tyler Croy has put together an XSL script that you can use to convert the results:

xsltproc -o junit-testresults.xml tools/gtester.xsl testresults.xml

That way you get relatively easy-to-use unit tests working on your code and some nice CI integration for your modern C language projects.

Update:

Recent versions of gtester run the test binary multiple times if there are failing tests. To get a report of all (passing and failing) tests, you may want to use my modified gtester.xsl script.

Testing antipatterns

Some testing antipatterns found in everyday code.

Catch all

try {
  callFailingMethod();
  fail();
} catch (Exception e) {
  // every exception type ends up here
}

Problems:
When you look at the test code, you cannot see which type of exception is thrown. First, it is clearer to document which type is expected. Second, any bugs in the called code that throw unintended exceptions are swallowed here.

Better:

try {
  callFailingMethod();
  fail();
} catch (ParseException e) {
}

Problems:
If the test fails, you don’t see why, so always pass a message to fail.

Better:

try {
  callFailingMethod();
  fail("ParseException expected");
} catch (ParseException e) {
}

Problems:
If an exception is thrown, you don’t assert that it is the expected one: a ParseException raised for a different reason still passes. So test for the exception message as well.

Solution:

try {
  callFailingMethod();
  fail("ParseException expected");
} catch (ParseException e) {
  assertEquals("Invalid character at line 2", e.getMessage());
}
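
The try / fail / catch pattern above can be packaged once so that every test states the expected type in a single place. The following is only a sketch under assumptions: the class ExpectThrows, the interface ThrowingBlock and the body of callFailingMethod are hypothetical stand-ins, not code from a real project. JUnit 4.13 and JUnit 5 ship a comparable assertThrows method, which is usually preferable to a hand-written helper.

```java
import java.text.ParseException;

// Hypothetical helper: wraps the try / fail / catch pattern in one place.
final class ExpectThrows {

    /** A block of code that is allowed to throw a checked exception. */
    interface ThrowingBlock {
        void run() throws Exception;
    }

    /** Runs the block and returns the exception if it has the expected type. */
    static <T extends Throwable> T expectThrows(Class<T> expected, ThrowingBlock block) {
        try {
            block.run();
        } catch (Throwable actual) {
            if (expected.isInstance(actual)) {
                return expected.cast(actual);
            }
            // An unintended exception type is reported instead of swallowed.
            throw new AssertionError("Expected " + expected.getSimpleName()
                    + " but got " + actual.getClass().getSimpleName(), actual);
        }
        // Reached only if the block completed without throwing anything.
        throw new AssertionError(expected.getSimpleName() + " expected");
    }

    /** Stand-in for callFailingMethod(); the real body is unknown. */
    static void callFailingMethod() throws ParseException {
        throw new ParseException("Invalid character at line 2", 0);
    }

    public static void main(String[] args) {
        ParseException e = expectThrows(ParseException.class, ExpectThrows::callFailingMethod);
        if (!"Invalid character at line 2".equals(e.getMessage())) {
            throw new AssertionError("Unexpected message: " + e.getMessage());
        }
    }
}
```

Returning the caught exception keeps the message check in the test itself, so the helper stays generic.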

Using assert

assert isOdd(3);

Problems:
If you do not enable assertions on the JVM (by passing -ea), this line does nothing and the test passes every time.

Better:

assertTrue(isOdd(3));

Problems:
If assertTrue or assertFalse fails, you just get a generic error message. Better use a message that communicates the error.

Solution:

assertTrue("3 should be odd", isOdd(3));
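
To see why the assert keyword is unreliable in tests, it can help to check whether the JVM actually evaluates it. The sketch below is an illustration under assumptions: the class AssertionChecks and its methods are hypothetical names, and its assertTrue is a simplified stand-in for the one a test framework provides.

```java
// Hypothetical demo class: contrasts the assert keyword with an explicit check.
final class AssertionChecks {

    /** Returns true only if the JVM was started with assertions enabled (-ea). */
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert runs only when assertions are on.
        assert enabled = true;
        return enabled;
    }

    /** Simplified stand-in for a framework's assertTrue; always evaluated. */
    static void assertTrue(String message, boolean condition) {
        if (!condition) {
            throw new AssertionError(message);
        }
    }

    static boolean isOdd(int value) {
        return value % 2 != 0;
    }

    public static void main(String[] args) {
        System.out.println("assert keyword active: " + assertionsEnabled());
        // This check runs regardless of JVM flags.
        assertTrue("3 should be odd", isOdd(3));
    }
}
```

Running the class once with and once without -ea shows that only the explicit check is independent of how the JVM was started.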

AssertTrue instead of assertEquals

  assertTrue("Expected: 1+2 = 3", sum(1, 2) == 3);

Problems:
You don’t see the actual value when the test fails. You could include it in the message, but there is an assertion built for exactly that: assertEquals.

Solution:

  assertEquals(3, sum(1, 2));

Conditional logic in tests

if (isOdd(value)) {
  assertEquals(5, calculate(value));
} else {
  assertEquals(6, calculate(value));
}

Problems:
Can you look at the test source code and tell me which branch is used? If only one is used all the time, erase the other. If both are used, first make the test deterministic and use two tests, one for each branch.
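
Split into two deterministic tests, the example might look like the sketch below. It rests on assumptions: the bodies of isOdd and calculate are invented for illustration, the inputs 3 and 4 are arbitrary fixed values, and the small assertEquals stands in for the one from the test framework.

```java
// Hypothetical test class: one fixed input and one test per branch.
final class CalculateTests {

    // Assumed implementations; the real ones are not shown in the article.
    static boolean isOdd(int value) {
        return value % 2 != 0;
    }

    static int calculate(int value) {
        return isOdd(value) ? 5 : 6;
    }

    /** Simplified stand-in for a framework's assertEquals. */
    static void assertEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected " + expected + " but was " + actual);
        }
    }

    /** Odd branch: the input is a constant, so the path is obvious. */
    static void calculateReturnsFiveForOddInput() {
        assertEquals(5, calculate(3));
    }

    /** Even branch: a separate test with its own constant input. */
    static void calculateReturnsSixForEvenInput() {
        assertEquals(6, calculate(4));
    }

    public static void main(String[] args) {
        calculateReturnsFiveForOddInput();
        calculateReturnsSixForEvenInput();
    }
}
```

Each test now exercises exactly one branch, and a failure names the branch through the test name.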