Think of your code as a maintenance minefield

Most of the cost, effort and time of a software project is spent on the maintenance phase, the modification of a software product after delivery. If you think about all these resources as “negative investments” or debt settlement and try to associate your spending with specific code areas or even single lines of code, you’ll probably find that the maintenance cost per line is not equally distributed. There are lots of lines of code that stand the test of time without any maintenance work at all, a fair number of lines that require moderate attention and some lines that seem to require constant and excessive developer care.

If you transfer this image to another metaphor, your code presents itself like a minefield for maintenance effort: Most of the area is harmless and safe to travel. But there are some positions that will just blow up once touched. The difference is that as a software developer, you don’t tread on the minefield, but you catch the flak if something happens.

You should try to deliver your code free of maintenance mines.

Spotting a maintenance mine

Identifying a line of code as a maintenance mine after the fact is easy. You probably already recognize the familiar code as “troublesome” because you’ve spent hours trying to understand and fix it. The commit history of your version control system can show you the “hottest” lines in your code – the areas that were modified most often. If you add tests for each new bug, you’ll find that the code is probably tested really well, with tests motivated by different bug issues. In hindsight, you can clearly distinguish low-effort code from high maintenance code.
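As a rough illustration of the commit-history idea, here is a minimal sketch that ranks files by how often they were changed. It works at file level rather than line level, which is a simplification. It assumes the input lines come from `git log --name-only --pretty=format:` (one changed path per line, blank lines between commits); the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any existing tool.
final class ChangeHotspots {

  /**
   * Counts how often each path appears in the given lines.
   * Assumed input: output of `git log --name-only --pretty=format:`,
   * i.e. one changed file path per line, blank lines between commits.
   */
  static Map<String, Integer> countChanges(List<String> gitLogLines) {
    Map<String, Integer> changesPerFile = new HashMap<>();
    for (String line : gitLogLines) {
      String path = line.trim();
      if (!path.isEmpty()) {
        changesPerFile.merge(path, 1, Integer::sum);
      }
    }
    return changesPerFile;
  }

  /** Returns the paths ordered from most to least frequently changed. */
  static List<String> hottestFirst(Map<String, Integer> changesPerFile) {
    List<String> paths = new ArrayList<>(changesPerFile.keySet());
    paths.sort((a, b) -> {
      int byCount = Integer.compare(changesPerFile.get(b), changesPerFile.get(a));
      return byCount != 0 ? byCount : a.compareTo(b); // ties: alphabetical
    });
    return paths;
  }
}
```

The files at the top of such a list are only candidates: a file may also change often simply because it is under active feature development.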

But before delivery, all code looks the same. Or does it?

An example of a maintenance mine

Let’s look at an example. Our system monitors critical business data and sends out alerts if certain conditions are met. One implementation of the part sending the alerts is a simple e-mail sender. The code is given here:


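import java.io.IOException;

public class SendEmailService {

  // Sends an e-mail to the given person via the locally installed mail program.
  public void sendTo(
                Person person,
                String subject,
                String body) throws IOException {
    execCmd(
         buildCmd(
               person.email(), subject, body));
  }

  // Assembles the command line for the external mail program.
  private String buildCmd(String recipientMailAddress, String subject, String body) {
    return "/usr/bin/mutt -t " + recipientMailAddress + " -u " + subject + " -m " + body;
  }

  // Hands the command line to the operating system and returns the exit value.
  private int execCmd(String command) throws IOException {
    return Runtime.getRuntime()
                  .exec(command).exitValue();
  }
}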

This code has two interesting problems:

  • The first problem is that it is written in Java, a platform-agnostic programming language, but depends on being run on a Linux (or sufficiently similar unixoid) operating system. The system it runs on needs to supply the /usr/bin/mutt program and have the e-mail sending settings properly configured, or else every attempt to run the send command will result in an error. This implicit dependency on the configuration of the production system isn’t the best way to deal with the situation, but it’s probably a one-time pain. The problem clearly presents itself and once the system is set up in the right way, it is gone (until somebody tampers with the settings again). And my impression is that this code separates two concerns between development and operations rather nicely: Development provides software that can send specific e-mails if operations provides a system that is capable of sending e-mails. No need to configure the system for e-mail sending and then do it again for the software on said system.
  • The second problem looks like a maintenance mine. In the line where the code passes the command line to the operating system (by calling Runtime.getRuntime().exec()), a Process object is returned that is only asked for its exitValue(). The intent is clearly to wait for the system command to terminate and report its result, although strictly speaking exitValue() does not wait: it throws an IllegalThreadStateException if the process is still running, and waitFor() is the call that blocks. The line looks straight and to the point. No need to store and handle intermediate objects if you aren’t interested in them. But perhaps you should care:

By default, the created process does not have its own terminal or console. All its standard I/O (i.e. stdin, stdout, stderr) operations will be redirected to the parent process, where they can be accessed via the streams obtained using the methods getOutputStream(), getInputStream(), and getErrorStream(). The parent process uses these streams to feed input to and get output from the process. Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the process may cause the process to block, or even deadlock.

Emphasis mine; see also: https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/lang/Process.html

This means that the Process object’s stdout and stderr outputs are stored in buffers of unknown (and system-dependent) size. If one of these buffers fills up, the execution of the command just stops, as if somebody had paused it indefinitely. So, depending on your call’s talkativeness, your alert e-mail will not be sent, your system will appear to have failed to recognize the condition and you’ll never see a stack trace or error exit value. All other e-mails (with less chatter) will go through just fine. This is a guaranteed source of frantic telephone calls, headaches and lost trust in your system and your ability to resolve issues.

And all the problems originate from one line of code. This is a maintenance mine with a stdout fuse.

The fix for this line might lie in the use of the ProcessBuilder class or your own utility code to drain the buffers. But how would you discover the mine before you deliver it?
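One possible shape for the ProcessBuilder approach mentioned above is sketched here. It merges stderr into stdout and reads that single stream to the end before asking for the exit code, so the child can never stall on a full buffer. The class name and the Result holder are invented for this example, and readAllBytes() requires Java 9 or later.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical utility, a sketch rather than a drop-in replacement.
final class DrainingCommandRunner {

  /** Exit code plus everything the child wrote to stdout and stderr. */
  static final class Result {
    final int exitCode;
    final String output;

    Result(int exitCode, String output) {
      this.exitCode = exitCode;
      this.output = output;
    }
  }

  static Result run(List<String> command) throws IOException, InterruptedException {
    Process process = new ProcessBuilder(command)
        .redirectErrorStream(true) // merge stderr into stdout: only one stream to drain
        .start();

    // We send no input, so close the child's stdin right away.
    process.getOutputStream().close();

    // Read until the child closes its output; this keeps the buffer from filling up.
    String output;
    try (InputStream in = process.getInputStream()) {
      output = new String(in.readAllBytes(), StandardCharsets.UTF_8);
    }

    // Only now block for termination and fetch the exit code.
    int exitCode = process.waitFor();
    return new Result(exitCode, output);
  }
}
```

Passing the command as a list of arguments also avoids the whitespace splitting that Runtime.exec(String) performs on a single command string. The sketch keeps the whole output in memory, which is fine for a chatty mail client but not for commands that produce unbounded output.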

Mines often lie at borders

One thing that stands out in this line of code is that it passes control to the “outside”. It acts as a transit point to the underlying operating system and therefore has a lot of baggage to check. There are no safety checks implemented, so the transit must be regarded as unsafe. If you look out for transit points in your code (like passing control to the file system, the network, a database or another external system), make sure you’ve read the instructions and requirements thoroughly. The problems of a maintenance mine aren’t apparent in your code and only manifest themselves during the interaction with the external system. And this is a situation that happens disproportionately often in production and comparatively seldom during development.

So, think of your code as a maintenance minefield and be careful around its borders.

What is your minesweeper story? Drop us a comment.

Easy maintenance, not easy production

It is said that source code gets read a hundred times more often than it gets written. My experience bears this out, which leads to a principle of economical source code modification: The first modification is almost free; it’s the later ones that run up a bill. In order to achieve a low TCO (total cost of ownership), it’s sound advice to plan (and develop) for easy maintenance instead of easy production. This principle even has a fancy name: Keep It Simple, Stupid (KISS).

Origin of KISS

According to the English Wikipedia on the topic, the principle’s name was coined by an aircraft engineer named Kelly Johnson, who also designed the fastest jet plane of all time, the SR-71 “Blackbird”. The aircraft reached speeds of over Mach 3 and had an unmatched defensive armament: enough thrust to evade any confrontation. It would simply fly higher and/or faster than anything launched against it, like interceptor fighters or anti-air missiles. The Blackbird used construction material that was specifically invented for this plane and very expensive, so it definitely was no easy production. Sadly for this blog post, it wasn’t particularly easy to maintain, either. It usually leaked so much fuel that it had to be refueled directly after takeoff. The leaks were pragmatically designed that way and would seal themselves in flight. It’s quite an interesting plane.

Easy maintenance

Johnson once allegedly gave his engineers a bunch of tools and required that the aircraft under design be repairable with only these tools, by an average mechanic in the field under combat conditions (e.g. stressed, exhausted and short on time). This is a wonderful concept that I regularly apply to my projects: Imagine that you are on-site with your customer, the most important functionality of your project just broke, resulting in a disastrous standstill of the whole facility, and all you have is your source code and the vi editor (or vim, if you’re non-hardcore like me). No internet, no IDE, no extensive documentation. Could you make meaningful changes to your project under these conditions? What needs to be changed to make your life easier in such an extreme situation? What restrictions would these changes impose on your daily work? How much effort/damage/resources would these changes cost? Is it easier to anticipate maintenance or to trust to luck that it won’t be necessary?

Easy production

A while ago, I reviewed the code of a map tool. The user viewed a map and could click on it to mark a certain location. The geo coordinates of this location would be used as an input for further computations. The map was restricted to a fixed area, so the developer wrote the code in the easiest way possible: He chose well-known geographic landmarks on the map and determined their geo coordinates and pixel locations. Those were the reference points in the code that every click would be related to. Using simple math (the rule of three), each point on the map could be calculated. Clever trick and totally working! The code practically wrote itself and the reference points only needed to be determined once.

Until the map was changed. The code contained some obscure constants that described the original landmarks, but the whole concept of arbitrary reference points was alien to the next developer. He was used to the classic concept of two reference points: top left and bottom right, with their respective geo coordinates and pixel locations. What seemed like a quick task turned into a stressful reclaiming of the clever trick during production.

In this example, the production (initial development) was easy, straight-forward and natural, but not easily reproducible during the maintenance phase (subsequent modification). The algorithm used a clever approach, but this cleverness isn’t necessarily available “under combat conditions”.
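For comparison, the classic two-reference-point concept the second developer expected can be sketched in a few lines. This is an illustration, not the reviewed tool’s code: the class name is invented, and it assumes a north-up map of a small area where longitude and latitude change linearly with pixel position.

```java
// Hypothetical sketch of pixel-to-geo conversion from two reference points.
final class TwoPointMapCalibration {

  private final double x1, y1, lon1, lat1; // first reference point, e.g. top left
  private final double x2, y2, lon2, lat2; // second reference point, e.g. bottom right

  TwoPointMapCalibration(double x1, double y1, double lon1, double lat1,
                         double x2, double y2, double lon2, double lat2) {
    if (x1 == x2 || y1 == y2) {
      throw new IllegalArgumentException("reference points must differ in both x and y");
    }
    this.x1 = x1; this.y1 = y1; this.lon1 = lon1; this.lat1 = lat1;
    this.x2 = x2; this.y2 = y2; this.lon2 = lon2; this.lat2 = lat2;
  }

  /** Longitude for a horizontal pixel position (linear interpolation). */
  double longitudeAt(double x) {
    return lon1 + (x - x1) / (x2 - x1) * (lon2 - lon1);
  }

  /** Latitude for a vertical pixel position (linear interpolation). */
  double latitudeAt(double y) {
    return lat1 + (y - y1) / (y2 - y1) * (lat2 - lat1);
  }
}
```

When the map image changes, only the eight constructor arguments need to be determined again, and their meaning is evident from the parameter names.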

Go the extra mile

Most machines are designed so that wearing parts can be easily replaced. Think of batteries in electronic gadgets (well, at least before the gadget’s estimated lifetime was lower than the average battery life) or light bulbs in cars (well, at least before LED headlights were considered cool). Thing is, engineers usually know the wear effects their designs have to endure. There is no “usual wear effect” on software, due to the lack of natural forces like gravity. Everything could change, so it’s better to be prepared for all sorts of change. That’s the theory, but it’s not economically sound to develop software that is prepared for any circumstance. Pragmatic reasoning calls for compromise, like supporting changes that

  • are likely to happen (this needs to be grounded in domain knowledge and should be documented in the code)
  • are not expensive to arrange beforehand (the popular “low-hanging fruit”)
  • are expensive to implement afterwards

The last aspect might be a bit of a surprise, but it’s the exact aspect that came into play in the example above: Recreating the knowledge about the clever trick of landmark choices took more time than implementing the classic interpolation between the corner points.

So if you see a possible (and not totally unlikely) change that will be expensive to implement once the intimate knowledge of the code you have during the initial development is gone, go the extra mile and prepare for it right now. Leave a comment, implement some kind of extension mechanism or just mind the code seams. In our example, slightly complicating the initial development would have led to dramatically less clever and more accessible code.

Conclusion

You should consider writing your code for easy maintenance, even if that means additional effort during the initial implementation. Imagine that you yourself have to make the future change, stressed, overworked and under time pressure, without any recollection of your thoughts today, lacking your familiar tools while your customer waits impatiently by your side. With proper preparation, even this scenario is feasible.