Separation of Intent and Implementation

Last week, my colleague wrote about building blocks and how to achieve a higher-level language in your code by using them. Instead of talking about strings and files, you change your terms to things like coordinates and resources.

I want to elaborate on one aspect of this improvement: the separation of your intent from the current implementation. Your intent is what you want to achieve with the code you write. The current implementation is how you achieve it right now. I point out the transience of the implementation so clearly because it is most likely the first (and maybe even the only) thing to change.

I have an example of this concept that is hopefully easy enough to understand. Let’s say that you build a system that gathers a lot of environmental data and stores it for later analysis and introspection. Don’t worry, the data consists mostly of things like air pressure, temperature and radioactive load. Totally harmless stuff – unless you find the wrong isotopes. In that case, you want to have a closer look and understand the situation. Most temporarily increased radioactivity in the air is caused by a normal thunderstorm. Most temporarily decreased radioactivity in the air is caused by normal rain.

Storing all the data requires something like an archive. We want to store the data separated by the point of measurement (a “station”), the type of data (let’s call it “data entry type” because we aren’t very creative with names here) and by the exact point in time the measurement took place. To make matters a little bit more complicated, we might have more than one device in a station that captures a specific data entry type. Think about two thermometers on both sides of the station to make local heat-up effects visible.

In order to reference a definite entry in our archive, we need a value for four aspects or dimensions:

  • The station
  • The data entry type
  • The device
  • The date and time

Thinking from the computer’s point of view

If you implement your archive in the file system, you can probably see the directory structure right before you:
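
Something like this, with all names being placeholders purely for illustration:

archive/
    station-0042/
        temperature/
            thermometer-1/
                2023/
                    11/
                        27/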

And in each directory for the day, we have a file for each hour:
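
archive/station-0042/temperature/thermometer-1/2023/11/27/
    00.dat
    01.dat
    [...]
    23.dat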

So we can just write a class that takes our four parameters and returns the corresponding file. That is a straightforward and correct implementation.

It is also one of the implementations that couples your intent (an archive with four dimensions of navigability) nearly inseparably with your decisions on how to use your computer’s basic resources.

Thinking from the algorithm’s point of view

In order to separate your intent from your current implementation, you need to specify your intent as unencumbered by details as possible. Let’s specify our four-axis archive navigation system as a coordinate:

public record ArchiveCoordinate(
    StationId station,
    DataEntryType type,
    DeviceId device,
    LocalDateTime measurementTime
) {
}

There is nothing in here that points towards file system things like directories or files. We might have a hunch of the actual hierarchy by looking at the order of the parameters, but it is easy to implement a hierarchy-free navigation between coordinates:

public record ArchiveCoordinate([...]) {
    public ArchiveCoordinate withStationChangedTo(
        StationId newStation
    ) {
        [...]
    }
    
    public ArchiveCoordinate withTypeChangedTo(
        DataEntryType newType
    ) {
        [...]
    }
    
    public ArchiveCoordinate withDeviceChangedTo(
        DeviceId newDevice
    ) {
        [...]
    }
    
    public ArchiveCoordinate withMeasurementTimeChangedTo(
        LocalDateTime newMeasurementTime
    ) {
        [...]
    }
}
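
Filling in those bodies is trivial, because a record gives us access to all of its components. A minimal sketch of one of them (the others follow the same pattern):

public record ArchiveCoordinate([...]) {
    public ArchiveCoordinate withDeviceChangedTo(
        DeviceId newDevice
    ) {
        // records are immutable, so we return a fresh coordinate
        // that differs only in the device dimension
        return new ArchiveCoordinate(station, type, newDevice, measurementTime);
    }
}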

The concept is that if you know one coordinate, you can navigate relative to it through the archive, without ever having to know the directory or whatever implementation structure lies beneath your model layer. Let’s say we have the coordinate of a particular measurement of one thermometer. How do we get the same measurement of the other thermometer?

ArchiveCoordinate measurementForThermometer0 = new ArchiveCoordinate([...]);
ArchiveCoordinate measurementForThermometer1 = measurementForThermometer0.withDeviceChangedTo(thermometer1);

We can provide methods that allow us to step forward and backward in time. We can provide our application code with everything it requires to implement clear and concise algorithms based on our model of the archive.
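
Stepping through time could be a thin convenience layer on top of the existing method. A sketch, with the method names and the hourly measurement interval being my own assumptions:

public ArchiveCoordinate oneMeasurementLater() {
    // assumes one measurement per hour – adjust to the real interval
    return withMeasurementTimeChangedTo(measurementTime.plusHours(1));
}

public ArchiveCoordinate oneMeasurementEarlier() {
    return withMeasurementTimeChangedTo(measurementTime.minusHours(1));
}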

But there will be the moment where you want to “get real” and access the data. You might decide to let your current implementation shine through to your intent layer and provide an actual file:

public interface Archive {
    Optional<File> entryFor(ArchiveCoordinate coordinate);
}

That’s all you need from the archive to get your file. But you might also decide to extend your intent layer and wrap the file in your own data type that provides everything your algorithms need without revealing that it is really a file that lies underneath:

public interface Archive {
    Optional<ArchiveResource> entryFor(ArchiveCoordinate coordinate);
}

The new ArchiveResource is a thin, but effective veneer (some might call it a wrapper or a facade) that gives us the required information:

public interface ArchiveResource {
    String name();
    long size();
    InputStream read();
}

Of course, we need to provide an implementation for all of this. But by staying vague in the intent layer, we open the door for an implementation that has nothing to do with files. Instead of a file system, there could be a relational database underneath and we wouldn’t notice. Our algorithms would still work the same way and read their data from ArchiveResources that aren’t FileArchiveResources anymore, but DatabaseArchiveResources.

You can probably imagine how you can provide the intent for data writing using the example above. If not, let me show you the necessary additions:

public interface Archive {
    Optional<ArchiveResource> entryFor(ArchiveCoordinate coordinate);
    ArchiveResource createEntryFor(ArchiveCoordinate coordinate) throws IOException;
}

public interface ArchiveResource {
    String name();
    long size();
    InputStream read();
    OutputStream write();
}

Now you can store additional data to the archive without ever knowing if you write to a file or a database or something completely different.
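
To make the idea tangible, here is a minimal sketch of what the file-based ArchiveResource implementation could look like (error handling reduced to the bare minimum):

import java.io.*;

public class FileArchiveResource implements ArchiveResource {
    private final File file;

    public FileArchiveResource(File file) {
        this.file = file;
    }

    @Override
    public String name() {
        return file.getName();
    }

    @Override
    public long size() {
        return file.length();
    }

    @Override
    public InputStream read() {
        try {
            return new FileInputStream(file);
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public OutputStream write() {
        try {
            return new FileOutputStream(file);
        } catch (FileNotFoundException e) {
            throw new UncheckedIOException(e);
        }
    }
}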

Summary

By separating your intent from your current actual implementation, you gain at least three things at the cost of more work and some harder thinking:

  1. Your algorithms only use your intent layer. You design it exclusively for your algorithms. It will fit like a glove.
  2. The terms you use in your intent layer shape the algorithm metaphors way better than the terms of your current implementation. You can freely decide what terms you’ll use.
  3. The algorithms and your intent layer are designed to last. Your current implementation can be swapped out without them noticing.

If this sounds familiar to you, it is a slightly different take on the “ports and adapters” architecture. The important thing is that by starting with the intent and naming it from the standpoint of your algorithms (application code), you are less prone to let your implementation shine through.

Spicing up the Game of Life Kata – Part I

Conway’s Game of Life is a worthwhile coding kata that I’ve implemented probably hundreds of times. It is compact enough to be completed in 45 minutes, complex enough to benefit from Test First or Test Driven Development and still maintains a low entry barrier so that you can implement it in a foreign programming language without much of a struggle (except if the foreign language is APL).

And despite appearing to be a simple 0-player game with just a few rules, it can yield deep theory, as John Conway explains nicely in this video. Oh, and it is Turing complete, so you can replicate a Game of Life in Game of Life – of course.

But after a few dozen iterations on the kata, I decided to introduce some extra aspects to the challenge – with sometimes surprising results. This blog series talks about the additional requirements and what I learnt from them.

Additional requirement #1: Add color to the game

The low effort user interface of the Game of Life is a character-based console output of the game field for each generation. It is sufficient to prove that the game runs correctly and to watch some of the more advanced patterns form and evolve. But it is rather unpleasing to the human eye.

What if each living cell in the game is not only alive, but also has a color? The first generation on the game field will be very gaudy, but maybe we can think about “color inheritance” and have the reproducing cells define the color of their children. In theory, this should create areas of different colors that can be traced back to a few or even just one ancestor.

Let’s think about it for a moment: When all parent cells are red, the child should be red, too. If a parent is yellow and another one is red, the child should have a color “on the spectrum” between yellow and red.

Learning about inheritance rules

One specific problem of reproduction in the Game of Life is that we don’t have two parents, we always have three of them:

Any dead cell with exactly three live neighbors becomes a live cell, as if by reproduction.

Rule #4 of Game of Life

We need to think about a color inheritance rule that incorporates three source colors and produces a target color that is somehow related to all three of them:

f(c1, c2, c3) → cn

A non-harebrained implementation of the function f is surprisingly difficult to come up with if you stay within your comfort zone regarding the representation of colors in programming languages. Typically, we represent colors in the RGB schema, with a number for each of the three color ingredients: red, green and blue. Whether the numbers range from zero to one (using floating-point values), from zero to 255 (using integer values) or over some other value range doesn’t really matter here. Implementing the color inheritance function using RGB colors adds so many intricacies to the original problem that I consider this approach a mistake.

Learning about color representations

When you search around for alternative color representations, the “hue, saturation and brightness” or HSB approach might capture your interest. The interesting part is the first parameter: hue. It is a value between 0 and 360, with 0 and 360 being identical and meaning “red”. 360 is also the number of degrees in a full circle, so this color representation effectively describes a “color wheel” with one number.

This means that for our color inheritance function, the parameters c1, c2 and c3 are degrees between 0 and 360. The whole input might look like this:

Just by looking at the graphics, you can probably already see the color spectrum that is suitable for the function’s result. Instead of complicated color calculations, we pick an angle somewhere between two angles (with the third angle defining the direction).

And this means that we have transformed our color calculation into a geometric formula using angles. We can now calculate the span between the “leftmost” and the “rightmost” angle that covers the “middle” angle. We determine a random angle in this span and use it as the color of the new cell.

Learning about implicit coupling

But there are three possibilities to calculate the span! Depending on what angle you assign the “middle” role, there are three spans that you can choose from. If you just take your parent cells in the order that is given by your data structure, you implement your algorithm in a manner that is implicitly coupled to your technical representation. Once you change the data structure ever so slightly (for example by updating your framework version), it might produce a different result regarding the colors for the exact same initial position of the game. That is a typical effect for hardware-tied software, as the first computer games were, but also a sign of poor abstraction and planning. If you are interested in exploring the effects of hardware implications, the game TIS-100 might be for you.

We want our implementation to be less coupled to hardware or data structures, so we define that we use the smallest span for our color calculation. That means that our available colors will rapidly drift towards a uniform color for every given isolated population on our game field.
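
Here is a sketch of this calculation in Java. All names are my own invention; the trick is that the smallest arc covering all three hues is the full circle minus the largest gap between neighboring hues:

public static double inheritedHue(double h1, double h2, double h3, Random rng) {
    double[] hues = { h1, h2, h3 };
    Arrays.sort(hues);
    // the three circular gaps between neighboring hues
    double gap01 = hues[1] - hues[0];
    double gap12 = hues[2] - hues[1];
    double gap20 = 360.0 - hues[2] + hues[0];
    double largestGap = Math.max(gap01, Math.max(gap12, gap20));
    // the smallest covering arc starts where the largest gap ends
    double start = (largestGap == gap01) ? hues[1]
                 : (largestGap == gap12) ? hues[2]
                 : hues[0];
    double span = 360.0 - largestGap;
    // nextDouble() yields [0, 1), a half-open interval – more on that
    // pitfall in the drift section below
    return (start + rng.nextDouble() * span) % 360.0;
}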

Learning about long-term effects (drifts)

But that is not our only concern regarding drifts. Even if you calculate your color span correctly, you can still mess up the actual color pick without noticing it. The best indicator of this long-term effect is when every game you run ends in the green/cyan/blue-ish region of the color wheel (the 50 % area). This probably means that you didn’t implement the equivalence of 0° and 360° correctly. Or, in other words, that your color wheel isn’t a wheel, but a value range from 0 to 360 without wrap-around:

You can easily write a test case that takes the wrap-around into account.
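
Such a test could look like this, using JUnit and the sketch from above:

@Test
public void inheritedHueWrapsAroundTheColorWheel() {
    // a fixed seed keeps the test repeatable
    Random rng = new Random(42L);
    // three red-ish parents straddling the 0°/360° boundary
    double child = inheritedHue(350.0, 0.0, 10.0, rng);
    // the smallest covering span is [350°, 10°], so the child must be
    // red-ish as well – not green/cyan-ish around 180°
    assertTrue(child >= 350.0 || child <= 10.0);
}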

But there are other drifts that might affect your color outcomes and those are not as easily testable. One source of drift might be your random number generator. Every time you pick a random angle from your span, any small bias of the random number generator influences your long-term results. I really don’t know how to test against these effects.

A more specific source of drift is your usage of the span (or interval). Is it closed (including the endpoints) or open (not including the endpoints)? Both options are possible and don’t introduce drift. But what if the interval is half-open? The most common mistake is to make it left-closed and right-open. This makes your colors drift “counter-clockwise”, but because you wrapped them correctly, you won’t notice it just from looking at the colors.

I like to think about possible test cases and test strategies that uncover those mistakes. One “fun” approach is the “extreme values random number generator” that only returns the lowest or highest possible number.
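
A sketch of such a generator, assuming the code under test draws its randomness from java.util.Random:

public class ExtremeValuesRandom extends Random {
    private boolean high = false;

    @Override
    public double nextDouble() {
        // alternate between the lowest and the highest value that the
        // real nextDouble() can ever produce
        high = !high;
        return high ? Math.nextDown(1.0) : 0.0;
    }
}

Feeding this generator into the color pick makes endpoint mistakes like the left-closed, right-open interval visible after very few generations.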

Conclusion

Adding just one additional concept to a given coding kata opens up a multitude of questions and learnings. If you add inheritable colorization to your Game of Life, it not only looks cooler, but it also teaches you how a problem can be solved more easily with a suitable representation, given that you look out for typical pitfalls in the implementation.

Writing (unit) test cases for those pitfalls is one of my current kata training areas. Maybe you have an idea how to test against drifts? Write a comment or even a full blogpost about it! I would love to hear from you.

If You Teach It, Teach It Right

Recently, I caught a glimpse of source code that gets taught in beginner’s developer courses. There was one aspect that really irked me, because I think it is fundamentally wrong from a pedagogical point of view and disrespectful towards the students.

Let me start with an abbreviated example of the source code. It is written in Java and tries to exemplify for-loops and if-statements. I omitted the if-statements in my retelling:

Scanner scanner = new Scanner(System.in);

int[] operands = new int[2];
for (int i = 0; i < operands.length; i++) {
    System.out.println("Enter a number: ");
    operands[i] = Integer.parseInt(scanner.nextLine());
}
int sum = operands[0] + operands[1];
System.out.println("The sum of your numbers is " + sum);

scanner.close();

As you can see, the code sets up the ability to read user input in its first line, asks for a number twice and calculates the sum of both numbers. It then outputs the result on the console.

There are a lot of problems with this code. Some are just coding style issues, like using an array instead of a list. Others are worrisome, like the lack of exception handling, especially around the Integer.parseInt() line. Well, we can tolerate cumbersome coding style. It’s not that the computer would care anyway. And we can look over the missing exception handling because it would be guaranteed to overwhelm beginning software developers. They will notice that things go wrong once they enter non-numbers.

But the last line of this code block is just an insult. It introduces the students to the concept of resources and teaches them the wrong way to deal with them.

Just a quick reminder why this line is so problematic: The java.util.Scanner is a resource, as indicated by the implementation of the interface java.io.Closeable (that is a subtype of java.lang.AutoCloseable, which will be important in a minute). Resources need to be released, freed, disposed or closed after usage. In Java, this step is done by calling the close() method. If you somehow fail to close a resource, it stays open and hogs memory and other important things.

How can you fail to close the Scanner in our example? Simple, just provoke an exception between the first and the last line of the block. If you don’t see the output about “The sum of your numbers”, the resource is still open.

You can argue that in this case, because of the missing exception handling, the JVM exits and the resource gets released nonetheless. This is correct.

But I’m not worried about my System.in while I’m running this code. I’m worried about the perception of the students that they have dealt with the resource correctly by calling close() at the end.

They learn it the wrong way first and the correct way later – hopefully. During my education, nobody corrected me or my peers. We were taught the wrong way and then left in the belief that we know everything. And I’ve seen too many other developers making the same stupid mistakes to know that we weren’t the only ones.

What is the correct way to deal with the problem of resource disposal in Java (since 2011, at least)? There is an explicit statement that supports us with it: try-with-resources, which leads to the following code:

try (
    Scanner scanner = new Scanner(System.in)
) {
    int[] operands = new int[2];
    for (int i = 0; i < operands.length; i++) {
        System.out.println("Enter a number: ");
        operands[i] = Integer.parseInt(scanner.nextLine());
    }
    int sum = operands[0] + operands[1];
    System.out.println("The sum of your numbers is " + sum);
}

I know that the code looks a lot more intimidating at the beginning now, but it is correct from a resource safety point of view. And for a beginning developer, the first lines of the full example already look daunting enough:

import java.util.Scanner;

public class Main {

    public static void main(String[] arguments) {
        // our code from above
    }
}

Trying to explain to an absolute beginner why the “public class” or the “String[] arguments” are necessary is already hard. Saying once more that “this is how you do it, the full explanation follows” does less damage in the long run than teaching a concept wrong and then correcting it afterwards, in my opinion.

If you don’t want to deal with the complexity of those puzzling lines, maybe Java, or at least “full-blown” Java, isn’t the right choice for your course? Use a less complex language or at least the scripting ability of the language of your choice. If you want to concentrate on for-loops and if-statements, maybe the Java REPL, called JShell, is the better suited medium? C# has the neat feature of “top-level statements” that gets rid of most ritual around your code. C# also calls the try-with-resources just “using”, which is a lot more inviting than the peculiar “try”.

But if you show the complexity to your students, don’t skimp on overall correctness. Way too much bad code got written with incomplete knowledge by beginners who were never taught the correct way. And the correct way is so much easier today than 25 years ago, when my generation of developers and I scratched our heads over why our programs wouldn’t run as stable and problem-free as anticipated.

So, let me reiterate my point: There is no harm in simplification, as long as it doesn’t compromise correctness. Teaching incorrect or even unsafe solutions is unfair for the students.

Risk First, but Not Pain First

At our company, we have an approach to project management that we call “risk first”. It means that we try to solve the hardest problems very early during development, so that we don’t fall into the “effort spent, but no real progress” trap.

One thing I want to add to my blog entry from 2018 (linked above) is that “risk” should not be conflated with “pain”. I mention it because the distinction wasn’t clear to me until recently. I try to explain what I mean.

When you approach a project in a “risk first” manner, it doesn’t feel good for quite some time. You try to stay ahead of a problem that seems overwhelming if the project isn’t trivial. Tackling the risk evokes feelings of uncertainty, doubt and even frustration. But it doesn’t hurt.

Feeling “pain” in a project has another origin. I managed several projects at once that were painful: The customers were difficult, erratic or not within reach. The requirements were unclear. My assessment was that the projects themselves were risky and every task I began had the properties of a risky task, too. But because the customers couldn’t support my work on the tasks, I made next to no progress and often had to backtrack. The tasks became riskier the more I worked on them.

My insight was that most of my work schedule revolved around cycling through the painful projects that I wrongly classified as risky and working on their tasks that made no real progress. Hopping from painful task to painful task doesn’t feel good at all. And because the cycle of “stuck” projects is so prominent, it eclipses the lower risk projects with more relaxed deadlines and unstressed customers.

My current remedy is to budget for a maximum amount of pain per period. Each painful project only gets carefully limited attention and is alleviated by lower-risk work in more pain-free projects. With this approach, there is progress in most projects and my project management focus isn’t locked on the stuck projects. You can say that I take “time off” from the more stressful situations.

I found myself in this trap because I couldn’t properly distinguish between “risk” and “pain”. Neither feels good, but you can work on the first with attention and effort. The level of pain in a project is nearly unswayable by your actions. You can only control your level of exposure to it.

This is where I hope for your thoughts:

  • Can you follow my distinction between “risk” and “pain”?
  • What are your indications that a task is “painful”?
  • Have you found means to decrease the level of pain in a project?

Write us a comment or even your own blog entry on the topic!

Don’t Just Blink at Me

I have a lot of electronic devices that essentially only do one thing. A flashlight, a hair trimmer, a razor, a toothbrush, some smoke detectors and a handful of power banks and other energy storage devices. They have a few things in common:

  • They have their own rechargeable battery
  • They store a (more or less) complex internal state
  • They have exactly one status LED
  • They try to communicate with me using the LED

The razor is an exception, because it has a lot more signal lights than just the LED, but I include it as the positive example. The LED of the razor changes its color to show what it wants from me.

All other devices try to convey meaning by different styles of blinking. Let’s take the hair trimmer as the negative example: It blinks when it is happy because it is being fed energy. The blinking stops to indicate that it is fully charged.

The flashlight blinks when it is unhappy and hungry. It will stop blinking when it is fed. It never indicates that it has enough, you have to guess.

The smoke detectors flash shortly once in a while to show that they are still on duty. They might flash more vigorously if they get hungry. But they have another means to get the message across: They sound a short alarm. You’ll feed them instantly and always have a spare battery at hand.

The point of this story is that it is impossible to decipher a blinking LED. What does it mean? The device is busy and I don’t need to do anything? The device is on the verge of starvation and I should intervene? It’s the most ambiguous signal I can think of.

If there were a standard that blinking means the desire for human intervention, I would learn the signal and adhere to it. But nearly half of my devices blink when they are busy and don’t need anything from me. And some try to talk to me in Morse code.

I’m not going to learn dozens of LED signal dialects. If a device wants to be understood, it should be designed in an understandable way. Give it a color-changing LED and we can talk:

  • Steady green: I’m happy.
  • Steady orange: I’m content, but it could be better. You might allocate some care for me.
  • Steady red: Please care for me.
  • Blinking red: Attention! I need urgent care.

What does this interaction design rant have to do with software development?

I often see the same categorical error in the design of software user interfaces. And while manufacturers of cheap consumer electronics can argue that a multi-color LED is more expensive, we software developers can’t really hide behind costs.

Anything that pops up, slides in or just blinks on my screen had better have a damn good reason to try to grab my attention. Grabbing not only my attention but also my input focus is the utmost interruption, comparable to the alarm signal of the smoke detector. I blogged recently about a blocking dialog that shows up unprompted and informs about some random unprompted background work.

But there is another type of ambiguous signal that I see more often than I like. It is silent and tries to shift the blame for the inevitable frustration onto you: The vague selection field:

Sorry for the German text, I didn’t find an English example. The German website this is from is still online, so if you can find it, you can try it yourself. I won’t link it here, though. (Maybe search for “steckdosenleiste klemmbar” and roam the results)

Let me describe your experience: you can select the color of a product, either white or black. The selected color is stated above, inexplicably written in the color blue. The selected field is “highlighted” black. Or is it? Maybe the white field is the highlighted one? And why is the selection field for the color black presented in mostly white?

The selection is not communicating properly, yet I’m the one who feels intimidated and stupid about a simple choice between two colors.

You can probably think of dozens of improvements and that’s fine. But the designer was certainly restricted by a corporate design rulebook and had to use three colors only: blue, white and black. You decide if he used his potential successfully.

Which brings me back to the introductory example: Give an engineer a single color LED and he will implement an elaborate morse code scheme to make the most of it. The problem is the initial restriction, not the shenanigans that follow.

How We Incidentally Increased Our Display Count Up to Five per Workplace

At the beginning of the year 2023, we had the plan to improve our office desk setup in two ways:

  • All desks should be electrically height-adjustable
  • All computers should be attached to the desk and not sitting on the floor

Had we known just how much turbulence this plan would bring, we might have reconsidered it. But we finally completed the project last week, just a few days before the year was over.

This blog entry tries to recap the story of the improvement project and mentions some particular products. We really use those products and are not affiliated with the manufacturers.

Our company has ten desks on two floors. A quick survey revealed that we have five electrically height-adjustable desks already in use. The remaining five desks were not really adjustable. Our plan was to move the five existing desks on one floor and equip the second floor with brand new ones. And because we wanted to achieve the second improvement in the same step, we switched the desk model.

Our new office tables are larger than before (180×80 cm instead of 160×80 cm) and definitely leeter. They are so leet, they are even called LeetDesks. Yes, we exchanged our boring classic office desks with modern pro-gaming desks. Our reasoning is that gaming is hard work, too. The nice thing is that the LeetDesks can be equipped with additional accessories like cable trays, monitor arms and the PC holder that would achieve our second goal.

So we ordered five individually configured gaming desks last year. Deconstructing the existing desks and assembling the new ones was an ongoing task. We bought a new cordless screwdriver after the first table.

When the team began to realize that we were really adjusting our work environment from the ground up, new ideas emerged. The idea with the most impact was the change from workstation PC to “notebook desk”. Instead of a dedicated office PC in addition to the mobile work notebook, the office desk should accommodate the notebook in the best possible manner.

Ok, we swapped the PC holder for a notebook holder, no big deal. But how do you connect a notebook with that many displays? We bought the only docking station that we could find that can drive three 4k displays over DisplayPort cables: The Icy Box IB-DK2254AC. The fourth display is connected via HDMI directly to the notebook.

Now, the “pandemic setup” of displays is extended by a fifth display on one side: The integrated notebook display can be used to host the e-mails exclusively or show the company chat.

Request for picture requests: I don’t have an action shot of a five-displays workplace right now. But if you are interested in what the setup looks like, leave us a comment and we’ll try to supply one later.

Because the distance between the displays and the computer is now fixed and much shorter (the docking station adds its own cable length), all the existing cables were too long. We had installed cables with a length of 3 meters to enable full vertical maneuverability. Now we switched back to 2 meters (or even shorter).

Not all of our notebooks were capable of driving that many pixels. Some older models had chipset graphics that gave up after the first external display. So we replaced them with newer models with dedicated notebook graphics cards.

Six of ten desks are now converted to notebook desks, which leaves four desks with PC holders and classic tower PCs. Traditionally, our PCs live in a “normal size” tower case, the Fractal Design Define series. This case is too big and too heavy for the PC holder. So we had to transplant the remaining PCs into a smaller case, the Fractal Design Define Compact. We transplanted two of them and replaced the other ones with new computers a little sooner than planned.

There were even more minor improvements that resulted in additional purchases, but those aren’t directly focussed on a single desk. So let’s recap:

We wanted our desks to be electrically height-adjustable and our floor free of computers. We ended up buying five new desks, six new docking stations, three new notebooks, two new computers, two empty computer cases, a bunch of notebook holders and many, many cables. The amount of cables that is necessary to operate a modern computer desk is astonishing.

We deconstructed, assembled, connected and hauled for many days. The project ran the whole length of 2023 and racked up material costs north of 20k EUR.

But now, the floor is unobstructed and our work stance can change from minute to minute. And we increased our display count once more!

Many Algorithms Benefit From a Partition in Two Phases

When I program code that solves a specific problem, I often design the algorithm in a way that mirrors my approach to solving the problem in the real world. That’s not a bad idea – the resulting algorithm can be thought through in a straightforward manner. But it lacks in one area: The separation of specification and execution. And for a computer, separating these two things has immediate advantages.

Let me explain the concept on a minimal coding challenge. The original challenge can be found here (by the way, codewars.com, despite the militaristic theming, is an abundant source of fun coding exercises):

Write a function that takes in a string of one or more words, and returns the same string, but with all five or more letter words reversed

Stop gninnipS My sdroW!

If you don’t want to be spoiled for this specific kata, please don’t read any further. The solution I use to explain the concept isn’t very elegant, though:

public static String reverseLongWords(String sentence) {
    String result = "";
    for (String each : sentence.split(" ")) {
        if (each.length() >= 5) {
            result += new StringBuilder(each).reverse().toString();
        } else {
            result += each;
        }
        result += " ";
    }
    return result.trim();
}

Please don’t mind the use of simple string concatenation or the clumsy way of string reversal. My point arises in the last two lines. Because we collect our words, reversed or not, directly in the result, we have to awkwardly remove from it the unnecessary blank that we just appended.

One reason for this subsequent correction is the missing separation into the two phases (specification and execution). Instead, we determine what strings the result should contain and build the result in one step.

Let’s separate the two phases, first in theory and then in code. We know what a specification of a sentence can look like because we already use one in the for loop. If we want to specify the resulting sentence without really building it, we need to store the words without their separators. The result of our specification phase would be a list of words, reversed or not. The execution phase takes our specification and transforms it into the required result. In our case, we just put blanks between the words:

public static String reverseLongWords2(String sentence) {
    // building the render model (specification phase)
    List<String> renderModel = new ArrayList<String>();
    for (String each : sentence.split(" ")) {
        if (each.length() >= 5) {
            renderModel.add(new StringBuilder(each).reverse().toString());
        } else {
            renderModel.add(each);
        }
    }

    // rendering the model (execution phase)
    return String.join(" ", renderModel);
}

The resulting code is very similar, with one crucial difference: The first phase doesn’t create the result, but a model of it. We can call this model a “render model” and the execution phase the “render stage”. It sounds a little bit excessive for such a small task, but this is really the heart of the idea. When you separate your algorithm into the two phases, you’ll get a render model between them.

This render model has some advantages: You can test it independently from the actual representation. You can add transformation steps more easily before you commit it to the target format. If you need to build the render model iteratively, it can provide helpful methods that would be missing in the target format.

Another advantage: Your execution/render phase is more independent from the previous work. Imagine that we would want our words comma separated, not blank separated. The first algorithm relies on a hidden dependency between the blank character and the trim() method. In the second algorithm, you only need to change the rendering part of the code. It can work independently from the previous logic.
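
In the second algorithm, that change really is a single line in the render stage:

// rendering the model with a different separator (execution phase)
return String.join(", ", renderModel);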

This way to partition an algorithm differs from the straightforward way that we humans use when we perform the task in the real world. We tend to keep at least a partial render model in our head alongside the rendered result. If we did the word spinning ourselves, but with commas, we would recognize or “just know” that we are writing the last word and not include the trailing comma. This “just knowing” is information from our mental render model.

In my experience, it pays off to refactor an algorithm into a variation that uses the two phases or to design it like this from the start. Revealing the hidden dependencies in the logic is a beneficial influence on the defect rate. Making the rendering step independent promotes testability and evolvability. It seems like more work at first, but in my case, that was just adaptation effort caused by my problem solving habits.

The Asylum Now Chooses Its Own Endeavors

There is a classic book from 1998 about interaction design and user experience called “The Inmates Are Running the Asylum” by Alan Cooper. It essentially points out that technology that is too hard to understand or handle is a self-chosen burden. We humans decided that our technology should be the way it is. We are the inmates of our digital (or technological) asylum and we built it ourselves.

There is another classic law of software design and software evolution, called Zawinski’s law:

Every program attempts to expand until it can read mail. Those programs which cannot so expand are replaced by ones which can.

Jamie Zawinski, 1995

If you interpret the law with some degree of freedom, that might mean that e-mail client applications are the pinnacle of software evolution, because they are designed to perform the one task every other application sets out to achieve.

You might think that an e-mail client can lean back and relax, knowing that it won’t be replaced and doesn’t need to adapt. It can concentrate on the one task it is asked to do, provide the perfect service and be the most essential tool for any digital worker.

But, for reasons that baffle me, my e-mail client apparently has better things to do than to deal with those pesky mails that I receive or want to send.

Let me tell you about my e-mail client. It is a Thunderbird running on Windows. If you want to attribute all the problems I’m about to point out to these two decisions, you probably miss the greater point that I’m trying to make with this blog post. It isn’t about Thunderbird or e-mail clients, it is about a software ecosystem that strays from its original intent: To serve the human operator.

Every morning, I open my e-mail client and let it run on a secondary monitor, just visible in my peripheral vision. Most of the day, it has nothing to do. I suspect that this causes boredom, because sometimes, it shows this dialog window out of the blue:

There are a few things wrong with this dialog:

  1. I didn’t ask for a folder compaction, so it is not necessary to tell me to “try again later”.
  2. Because I didn’t ask for anything, it surprises me that Thunderbird needs my interaction with a modal dialog in order to proceed with… doing nothing?
  3. If there is “another operation” that causes problems, I can only help if it has a name. Without more details given, I can only deduce that my Inbox is involved.
  4. My only choice is to close the modal dialog window. I cannot interact with Thunderbird until I dismiss the dialog. I cannot choose to “retry” or “view details”, even if I had resolved the unnamed problem outside of Thunderbird.

Think about what really happens here from an interaction design viewpoint: I command my e-mail client to stand ready to receive or send e-mails. It suddenly decides that something else is important, too. It does the thing and fails. It fails in such a grandiose manner that it needs to inform me about it and blocks all other interaction until I acknowledge its failure. It cannot fail silently and it cannot display the alert notification in a non-modal manner. My e-mail client suddenly commands me to click an arbitrary button or else… it won’t do my e-mail stuff anymore.

The digital asylum isn’t there to serve the inmates, the inmates are there to serve the asylum. Thunderbird doesn’t help me do my e-maily things, I need to support Thunderbird in doing its own things. The assistant involves the boss for secondary (or even less important) tasks.

The problem is that I recognize this inversion of involvement all the time:

  • The Windows operating system decides that it needs to update right now. My job is to keep the machine running under all circumstances. Once, I ran from one train to the next with the updating notebook in my hands, carefully keeping the lid open and the updates running.
  • My text editor might open the file I want to edit later, but first I need to decide if the latest update is more important right now. I cannot imagine the scenario when an update of a text editor is so crucial that it needs to happen before my work.
  • My IDE is “reconstructing the skeletons” or “re-indexing the files” whenever it sees fit. My job is to wait until this clearly more important work is done. I can see that these things help me do my thing later. But I want to do my thing now, maybe with slightly less help for a while.

Sometimes, I feel more like a precatory guest than the root administrator on my own machine. I can use it in the gaps between computer stuff when all applications decide that they generously grant me some computation time.

It’s not that there aren’t clear rules how computers should behave towards users. The ISO norm 9241-110 is very on point about this:

Users should always be able to direct their interaction with the product. They retain control over when to begin, interrupt, or end actions.

https://www.usability.de/en/usability-user-experience/glossary/controllability.html

We just choose to ignore our own rules.

We build a digital asylum for ourselves that is complicated and hard to grasp. Then we demote ourselves to guests in the very same asylum that we built and after that, we let the asylum choose its own endeavors.

If you translate this behaviour into the real world, you would call it “not customer-centered”. It would be the barkeeper who cleans all the glasses before you can order your drink. It would be the teacher who delays all student questions to the end of the lecture. Or it would be the supermarket that you can only enter after they have stored all the new products on the shelves, several hours after “being open”.

By the way, if you want to help me with my mutinous Thunderbird: The problem is already solved. It is just a striking example of the “controllability violations” that I wanted to describe.

You probably have another good example of “inversed involvement” that you can tell us in the comments.

Where the Wild Boxes Are

When I was a little child, a book that had a big impact on me and my view on the world was “Where the Wild Things Are”. Many years later, it helped me to explain my passion and profession to my grandparents. This blog post tries to give an approach to explain software development to non-technical people.

The first encounters

My first contact with computers was when I was five or six years old in the laboratory of my father. He was a young physicist and worked virtually around the clock in his university’s laboratory. The lab itself was a magical place full of machines and dangerous things like liquid nitrogen canisters. In order to keep me from touching things, he let me play with the only machine that could do no harm: the personal computer on his desk. My interaction with it basically boiled down to moving the cursor on the screen and placing characters into pictures.

When I was eight years old, we got our own family personal computer at home. This was the start of my lifelong passion to teach the machine new tricks. Of course I played every game I could get hold of, but at the same time, I wanted to create my own games. By copy-typing code listings from magazines I checked out from the local library, I taught myself to transform my ideas into source code. By trial and error, I expanded my vocabulary until I could talk to the computer in a nearly fluent fashion.

The apprenticeship

I had been sure about my career wish ever since those days. When my extended family (like aunts and grandparents) asked what I would do once school was finished, I could tell them that I would “study something with computers”. It was sufficient as an outlook.

During my studies, it was more important to them that I was studying seriously than what exactly it was that I was studying. They asked about my grades, but not about the content.

The translation gap

Then I started my company and began to earn money with my skills. That’s when the questions about what exactly I was doing emerged. And I learned that the concept of “programming a computer” is not universally understood.

My grandparents weren’t technical people. One grandfather was a railroad worker and had mechanical skills, but couldn’t grok electronics, let alone digital systems. We tried to find a level of simplification of my work that he could imagine and landed at “rapidly pressing buttons in the right order”. In his mind, I was a silent variation of a pianist.

While this is flattering, it lacks the aspect of persistence. A piano falls silent once the button-pressing is done. My computers commence their play long after my typing. A piano player is expected to repeat his “typing”, while my code only needs to be written once and can be copied automatically. The piano player teaches one specific piano how to produce music, my code can teach lots of computers at once how to produce numbers or “data”.

The wild boxes

Data is another concept that is hard to imagine with a mechanical worldview. So I tried another communication approach: the “animal tamer”. Instead of the end result, I focused on the computation process itself. I explained that every computer has its own set of behaviours and can react to incoming information (we used the metaphor of e-mails or “electronic letters”) on its own. The problem is that computers are very dumb and need extensive training to act professionally. The training comes in the form of instructional electronic letters (the program code) that the computers read and adapt to.

My job is to write the instruction letters and make the computers read them. This tames the wild boxes and turns them into domesticated machines that work for us, just like horses or dogs.

To my surprise, this explanation lit up my grandfather’s face: “You tame machines and teach them how to read!” He could understand this process and my role in it. And because machine taming sounds dangerous and important, I earned my wages.

Domesticated boxes

In the book from my childhood, the protagonist Max befriends a group of monsters and gets them to act according to his plan. In my life, I befriend computers and make them act according to my plan. I like the metaphor of “taming” or “domesticating” computers because it highlights the benefits of my work instead of its mechanics.

We build our modern world on billions of domesticated, well-behaved computers. They work for us in exactly the way we told them. But they won’t improve by themselves because we never told them how to learn.

Self-domesticating boxes

Right now, we are changing our approach to teaching them. Instead of telling them what to do, we try to let them figure it out themselves by trial and error. The beneficial potential is that the machine is not burdened with our limited understanding of the world of digital data. It may be able to expand its vocabulary until it can interact with its world in a fluent fashion.

Maybe in the future, we need to bargain with our machines so that they work for us. Maybe the next generation of “computer kids” will explain their work to me as “machine mediators”. I’m curious!

How to Lay a Minefield

… in Minesweeper, of course.

One of the basic building blocks of my hands-on software development workshops is Coding Katas. Minesweeper, the classic game that every Microsoft Windows user has known since 1992, is one of the “medium everything” Katas: Medium scope, medium complexity, medium demand. One advantage is that virtually nobody needs any more explanation than “program a Minesweeper”.

The first milestone when programming a Minesweeper game is to lay the mines on the field in a random fashion. And to my continual surprise, this seems to be a hard task for most participants. Not hard in the sense of giving up, but hard in the sense that the solution lacks suitability or even correctness.

So in order to have a reference point I can use in future workshops and to discuss the usual approaches, here is how you lay a minefield in an adequate way.

The task is to fill a grid of tiles or cells with a predefined amount of mines. Let’s say you have a grid of ten rows and ten columns and want to place ten mines randomly.

Spoiler alert: If you want to think about this problem on your own, don’t read any further and develop your solution first!

Task analysis

If you pause for a moment and think about the ingredients that you need for the task, you’ll find three things:

  • A die (in the form of a random number generator)
  • An amount of mines to place (maybe just represented by a counter variable)
  • An amount of tiles (stored in a data structure)

Each solution I’ll present uses one of these things as the primary focus of interest. My language for this effect is that the solution “takes the thing into the hands”, while the other two things “lay on the table”. That’s because if you simulate how the solution works with real objects like paper and dice, you’ll really have one thing in your hands and the others on the table most of the time.

Solution #1: The probability approach

One way to think about the task of placing 10 mines somewhere on 100 tiles is to calculate the probability of a mine being on a tile and just roll the dice for every tile.

Here’s some code that shows what this approach might look like in Java:

private static final int fieldWidth = 10;
private static final int fieldHeight = 10;

public static Set<Point> placeMinesOn(Set<Point> field) {
	Random rng = new Random();
	final double probability = 0.1D;
	for (int column = 0; column < fieldWidth; column++) {
		for (int row = 0; row < fieldHeight; row++) {
			if (rng.nextDouble() < probability) {
				field.add(new Point(row, column));
			}
		}
	}
	return field;
}

Before we discuss the effects of this code, let’s have a little talk about the data structure that I use for the tiles:

The most common approach to model a two-dimensional grid of tiles in software is to use a two-dimensional array. There is nothing wrong with it, it’s just not the most practical approach. In reality, it is really cumbersome compared to its alternatives (yes, there are several). My approach is to separate the aspect of “two-dimensionalness” from my data structure. That’s why I use a set of points.

The set (like a HashSet) is a one-dimensional data structure that more or less can only say “yes, I know this point” or “no, I never heard of this point”. To determine if a certain point (or the tile at this coordinate) contains a mine, it just needs to be included in the set. An empty set represents an empty field. If you remove “cleared” mines from the set, its size is the number of remaining mines. With a two-dimensional array, you’d probably write several nested loops, one of them just to count the non-cleared mines.
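
A quick sketch of the queries mentioned above:

// does the tile at (row, column) contain a mine?
boolean hasMine = field.contains(new Point(row, column));

// the number of remaining mines, without any nested loops
int remainingMines = field.size();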

Ok, now back to the solution. It is the approach that holds the die in the hands and uses it for every tile. The problem with it is that our customer didn’t ask for a probability of 10 percent for a mine, he/she asked for 10 mines. And the code above might place 10 mines, or 9, or 11, or even 14. In fact, the code places somewhere between 0 and 100 mines on the field.

The one thing this solution has going for it is the guaranteed runtime. We roll the dice 100 times and that’s it.

So we can categorize this solution as follows:

  • Correctness: not guaranteed
  • Runtime: guaranteed

If I were the customer, I would reject the solution because it doesn’t produce the outcome I require. A minesweeper contest based on this code would end in a riot.

Solution #2: Sampling with replacement

If you don’t take up the die, but the mines, and try to dispense them on the field, you implement our second solution. As long as you have mines on hand, you choose a tile at random and place a mine there. The only exception is that you can’t place a mine on top of a mine, so you have to check for the presence of a mine first.

Here’s the code for this solution in Java:

public static Set<Point> placeMinesOn(Set<Point> field) {
	Random rng = new Random();
	int remainingMines = 10;
	while (remainingMines > 0) {
		Point randomTile = new Point(
			rng.nextInt(fieldHeight),
			rng.nextInt(fieldWidth)
		);
		if (field.contains(randomTile)) {
			continue;
		}
		field.add(randomTile);
		remainingMines--;
	}
	return field;
}

This solution works better than the previous one for the correctness category. There will always be 10 mines on the field once we are finished. The problem is that we can’t guarantee that we are able to place the mines in time before our player gets bored. Taking it to the extreme means that this code might run forever, especially if your random number generator isn’t up to standards.

So, the participants of your minesweeper contest might not protest an arbitrary number of mines on their field anymore – but they might have to wait an arbitrarily long time until all 10 mines are dispersed.

  • Correctness: guaranteed
  • Runtime: not guaranteed

This solution will probably work alright in reality. I just don’t see the need to utilize it when there is a clearly superior solution at hand.

Solution #3: Sampling without replacement

So far, we picked up the die and the mines. But what if we pick up the tiles? That’s the third solution. In short, we put all tiles in a bag and pick one at random. We place a mine there and don’t put the tile back into the bag. After we’ve picked 10 tiles and placed 10 mines, we have solved the problem.

Here’s code that implements this solution in Java:

public static Set<Point> placeMinesOn(Set<Point> field) {
	List<Point> allTiles = new LinkedList<Point>();
	for (int column = 0; column < fieldWidth; column++) {
		for (int row = 0; row < fieldHeight; row++) {
			allTiles.add(new Point(row, column));
		}
	}
	
	Collections.shuffle(allTiles);
	
	int mines = 10;
	for (int i = 0; i < mines; i++) {
		Point tile = allTiles.remove(0);
		field.add(tile);
	}
	return field;
}

The cool thing about this solution is that it excels in both correctness and runtime. Maybe we use some additional memory for our bag and some runtime for the shuffling, but both are bounded by a known upper limit.

Yet, I rarely see this solution in my workshops. It’s only after I challenge their first solution that people come up with it. I’m biased, of course, because I’ve seen too many approaches (successful and failed) and thought about the problem way longer than usual. But you, my reader, are probably an impartial mind on this topic and can give some thoughts in the comments. I would appreciate it!

So, let’s categorize this approach:

  • Correctness: guaranteed
  • Runtime: guaranteed

If I were the customer, this would be my expectation. My minesweeper contest would go unchallenged as long as nobody finds a flaw in the shuffle algorithm.

Summary

It is surprisingly hard to solve simple tasks like “distribute N elements on an X*Y grid” in an adequate way. My approach to deconstruct and analyze these tasks is to visualize myself doing them “in reality”. This is how I come up with the “thing in hands” metaphor that helps me imagine new solutions. I hope this helps you sometime in the future.

How do you lay a minefield and how do you find your solutions? Write a blog post or leave a comment!