Easy maintenance, not easy production

Imagine that you have to make critical changes to the source code while stressed, overworked and under time pressure, without any recollection of your thoughts during development, lacking your familiar tools while your customer waits impatiently by your side. That’s when you are glad you prepared for this moment.

It is said that source code gets read a hundred times more often than it gets written. My experience confirms this circumstance, which leads to a principle of economic source code modifications: The first modification is almost for free, it’s the later ones that run up a bill. In order to achieve a low TCO (total cost of ownership), it’s sound advice to plan (and develop) for easy maintenance instead of easy production. This principle even has a fancy name: Keep It Simple, Stupid (KISS).

Origin of KISS

According to the English Wikipedia on the topic, the principle’s name was coined by an aircraft engineer named Kelly Johnson, who also designed the fastest jet plane of all time, the SR-71 “Blackbird”. The aircraft reached speeds of over Mach 3 and had an unmatched defensive armament: enough thrust to evade any confrontation. It would simply fly higher and/or faster than anything launched against it, like interceptor fighters or anti-air missiles. The Blackbird used construction material that was specifically invented for this plane and very expensive, so it definitely was no easy production. Sadly for this blog post, it wasn’t particularly easy to maintain, either. It usually leaked so much fuel that it had to be refueled directly after takeoff. The leaks were pragmatically designed that way and would seal themselves in flight. It’s quite an interesting plane.

Easy maintenance

Johnson once allegedly gave his engineers a bunch of tools and required that the aircraft under design must be repairable with only these tools, by an average mechanic in the field under combat conditions (i.e. stressed, exhausted and within a narrow timeframe). This is a wonderful concept that I regularly apply to my projects: Imagine that you are on-site with your customer, the most important functionality of your project just broke, resulting in a disastrous standstill of the whole facility, and all you have is your source code and the vi editor (or vim, if you’re non-hardcore like me). No internet, no IDE, no extensive documentation. Could you make meaningful changes to your project under these conditions? What needs to be changed to make your life easier in such an extreme situation? What restrictions would these changes impose on your daily work? How much effort/damage/resources would these changes cost? Is it easier to anticipate maintenance or to trust to luck that it won’t be necessary?

Easy production

A while ago, I reviewed the code of a map tool. The user viewed a map and could click on it to mark a certain location. The geo coordinates of this location would be used as an input for further computations. The map was restricted to a fixed area, so the developer wrote the code in the easiest way possible: He chose well-known geographic landmarks on the map and determined their geo coordinates and pixel locations. Those were the reference points in the code that every click would be related to. Using simple mathematics (rule of three), each point on the map could be calculated. Clever trick and totally working! The code practically wrote itself and the reference points only needed to be determined once.

Until the map was changed. The code contained some obscure constants that described the original landmarks, but the whole concept of arbitrary reference points was alien to the next developer. He was used to the classic concept of two reference points: top left and bottom right, with their respective geo coordinates and pixel locations. What seemed like a quick task turned into a stressful reclaiming of the clever trick during production.

In this example, the production (initial development) was easy, straight-forward and natural, but not easily reproducible during the maintenance phase (subsequent modification). The algorithm used a clever approach, but this cleverness isn’t necessarily available “under combat conditions”.
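For comparison, here is a minimal sketch of the classic two-point concept the next developer expected: top left and bottom right, each with pixel location and geo coordinates. It is not the code from the reviewed tool. The names `ReferencePoint` and `pixel_to_geo` and the corner values are made up for illustration, and the sketch assumes a north-up map of an area small enough that longitude and latitude scale linearly with pixel position.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReferencePoint:
    """A map location known both in pixels and in geo coordinates."""
    pixel_x: float
    pixel_y: float
    longitude: float
    latitude: float


def pixel_to_geo(x, y, top_left, bottom_right):
    """Linearly interpolate the geo coordinates of a clicked pixel.

    Assumption: north-up map, no projection handling.
    """
    # Fraction of the way across the map: 0.0 at the top left corner,
    # 1.0 at the bottom right corner.
    fraction_x = (x - top_left.pixel_x) / (bottom_right.pixel_x - top_left.pixel_x)
    fraction_y = (y - top_left.pixel_y) / (bottom_right.pixel_y - top_left.pixel_y)
    longitude = top_left.longitude + fraction_x * (bottom_right.longitude - top_left.longitude)
    latitude = top_left.latitude + fraction_y * (bottom_right.latitude - top_left.latitude)
    return longitude, latitude


# Made-up corner values, for illustration only.
TOP_LEFT = ReferencePoint(pixel_x=0, pixel_y=0, longitude=8.0, latitude=49.5)
BOTTOM_RIGHT = ReferencePoint(pixel_x=800, pixel_y=600, longitude=9.0, latitude=49.0)

print(pixel_to_geo(400, 300, TOP_LEFT, BOTTOM_RIGHT))
```

When the map changes, only the two corner constants need to be replaced, and their meaning is obvious from their names.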

Go the extra mile

Most machines are designed so that wearing parts can be easily replaced. Think of batteries in electronic gadgets (well, at least before the gadget’s estimated lifetime was lower than the average battery life) or light bulbs in cars (well, at least before LED headlights were considered cool). Thing is, engineers usually know the wear effects their designs have to endure. There is no “usual wear effect” on software, due to the lack of natural forces like gravitation. Everything could change, so it’s better to be prepared for all sorts of change. That’s the theory, but it’s not economically sound to develop software that is prepared for any circumstance. Pragmatic reasoning calls for compromise, like supporting changes that

  • are likely to happen (this needs to be grounded in domain knowledge and should be documented in the code)
  • are not expensive to arrange beforehand (the popular “low-hanging fruit”)
  • are expensive to implement afterwards

The last aspect might be a bit of a surprise, but it’s the exact aspect that came into play in the example above: Recreating the knowledge about the clever trick of landmark choices took more time than implementing the classic interpolation between the corner points.

So if you see a possible (and not totally unlikely) change that will be expensive to implement once the intimate knowledge of the code you have during the initial development is gone, go the extra mile and prepare for it right now. Leave a comment, implement some kind of extension mechanism or just mind the code seams. In our example, slightly complicating the initial development led to dramatically less clever and more accessible code.

Conclusion

You should consider writing your code for easy maintenance, even if that means additional effort during the initial implementation. Imagine that you yourself have to make the future change while stressed, overworked and under time pressure, without any recollection of your thoughts today, lacking your familiar tools while your customer waits impatiently by your side. With proper preparation, even this scenario is feasible.

 

 

Keeping it simple is hard indeed

A true story I was reminded of when I read Chad LaVigne’s essay on simplicity from the book “97 Things Every Software Architect Should Know”.

When I read the great book (or collection of essays) “97 Things Every Software Architect Should Know”, there was a story I had experienced first-hand for nearly every chapter. But it was only after I read about it in the well-placed words of somebody with greater in-depth experience than me that it became clear to me. The essence of my own experience was written down so I could iterate over it. Here is a story about simplicity that I hope will help somebody out there iterate over the same thought sometime.

The problem

In the early days, we were asked to provide custom software for a nearly robot-like machine that performs measurements. Our software had to control some engines that could move sensors and actuators around. The whole machine was built by another company that only had eyes for the mechanical and electrical aspects. As a result, the communication protocol between the microcontrollers of the machine and our software was horribly awkward and unpleasant with regard to the software engineering side of the project.

One engine was the so-called “vertical engine”, because it could move an array of sensors up and down. There were four actual positions where the engine would stop automatically upon contact. It was the job of our software to send the correct commands in the correct order at the correct time to reach the destination. Effectively, we would “hop” from position to position, either moving the sensors up or down. If only the communication protocol had allowed that. What we got was two commands: moving one position down and moving up to the topmost position. The movement options are depicted here:

The first solution

But what would a developer’s life be without a few challenges? Our team quickly came up with several possible solutions, from finite state machines to object graphs of “position” instances that offered several “movement options” to an agent-like object travelling inside the graph. In the brainstorming session, even more sophisticated possibilities were discussed. The solution we finally implemented was complex and anything but clear. We needed quite a few unit and integration tests to prove the whole thing correct. During this whole process, nobody ever questioned the complexity of the problem.

The essay

The chapter I read in the book “97 Things Every Software Architect Should Know” when this story came back to me was “Make Sure the Simple Stuff Is Simple” written by Chad LaVigne. Let me cite some sentences to show the message of his essay:

“People who design software are smart – really smart. The simple problem-complex solution trap can be an easy one to fall into because we like to demonstrate our knowledge. If you find yourself designing a solution so clever that it may become self-aware, stop and think. Does the solution fit the problem?”

The essay hit the nail on the head for me. Our solutions for this “vertical engine” were all clever and solved a problem, but a much more complex problem than the one that really needed to be solved.

The second solution

Our first version of the software went into production and worked a treat. Years later, the hardware (specifically the microcontrollers) fell apart and was replaced by more sophisticated electronics. We needed to rewrite parts of our software, especially the engine controls. But this time, we had real movement patterns of all engines, collected in the log files. Some regular expression magic later (remember, we are smart people and like to show that!), it was clear that the vertical engine had never used two of the possible movement options. The real usage pattern of the engine is depicted in the picture below:

The real problem we should have solved from the beginning is much simpler than the original one. Every engine position has only one successor, there is no need to choose from several options or to calculate the shortest path to the destination. It’s the most boring finite state machine you’ve ever seen and there is no need for recursive structures.
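To show just how boring that state machine is, here is a minimal sketch. It is not our production code. The position numbering (1 for the topmost, 4 for the lowest), the exact cycle and all names such as `SUCCESSOR` and `commands_to_reach` are assumptions made for illustration; only the two commands themselves come from the protocol described above.

```python
from enum import Enum


class Command(Enum):
    ONE_DOWN = "move one position down"
    TO_TOP = "move up to the topmost position"


# Assumed numbering: 1 is the topmost position, 4 the lowest.
# Every position has exactly one successor and one command that reaches it.
SUCCESSOR = {
    1: (2, Command.ONE_DOWN),
    2: (3, Command.ONE_DOWN),
    3: (4, Command.ONE_DOWN),
    4: (1, Command.TO_TOP),
}


def commands_to_reach(current, target):
    """Return the commands that move the engine from current to target."""
    if current not in SUCCESSOR or target not in SUCCESSOR:
        raise ValueError("unknown position")
    commands = []
    # Follow the single successor until the target is reached. The loop
    # terminates because the successors form one cycle over all positions.
    while current != target:
        current, command = SUCCESSOR[current]
        commands.append(command)
    return commands


print(commands_to_reach(3, 2))
```

There is nothing to choose and nothing to optimize here: a lookup table and a loop are enough.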

It was our own cleverness that made the problem appear more complex than it really was, just because we could handle it. Some investigation on the problem domain instead of the solution domain would have brought us to the same conclusion that the log file analysis finally revealed.

The lesson learnt

The whole story isn’t about failure. Both solutions worked for the customer. In fact, he never noticed a difference when the second solution went live. This story is about accidental complexity and about being smarter than the requirement. It’s in our nature to accept a problem as “given” and come up with an elegant solution. It might be much more effective to question the problem instead.

Thank you, Chad, for your great essay!