The algorithm in an algorithm – Builder design pattern

In the following blog post, I would like to explain to you, the design pattern builder, why this is an algorithm in the algorithm and what advantages result from it.

General

The builder is a creational design pattern. It separates the construction of complex objects from their representations, allowing the same construction processes to be reused.

The design pattern consists of a director, the builder interface, and concrete builder implementations. The director is responsible for the abstract construction of the product and has a defined interface with the builder to pass the design instructions. The concrete builders then build the concrete product according to the instructions and can also provide the generated product.

So in the end, the director defines its own little programming language inside the program where the construction instructions can be programmed as algorithm. The builder then executes that algorithm. So we have a program in the program, an algorithm in the algorithm. Crazy!

Example cake recipe

We know such procedures from real life. For example, from the kitchen. When you bake a cake, you take your yellow mixing bowl, the ingredients, and the blue mixer and make the dough. Very concrete.

But now, if someone asks about the recipe, then we abstract it from our concrete equipment to a general manual. There, it only says you are mixing the ingredients, and your yellow mixing bowl and blue mixer are not mentioned. So someone else can bake the cake in their own kitchen with their own equipment. Should your blue mixer ever fail, you can easily carry out the recipe with a whisk or with the new food processor.

Example file generation

An example from programming is the generation of a file. For example, a pdf certificate. If you program everything directly in PDFBox, it works first. But if you ever want to use a different library, or if you also want the certificate as a normal text document or image, you need to rewrite everything.

With the design pattern, you would have an algorithm that says I want “certificate” as a title, then a dividing line, then a table and then this paragraph. Exactly how this will be implemented is not known. The PDFBox builder takes these instructions and creates the file with its own library-specific commands.

If the library or file type changes, only one new builder needs to be written. For example, a text file builder, an image builder or an OpenPDF builder. The logic of how the certificate should look at the end remains unchanged.

Conclusion

Finally, separating the construction from the production offers some advantages. The program is more expandable and modifiable. It also complies with the single responsibility principle. The disadvantage is a close coupling between the product, the concrete builder, and the classes involved in the construction, making it difficult to change the basic process.

Many Algorithms Benefit From a Partition in Two Phases

When I program code that solves a specific problem, I often design the algorithm in a way that mirrors my approach to solving the problem in the real world. That’s not a bad idea – the resulting algorithm can be thought through in a straightforward manner. But it lacks in one area: The separation of specification and execution. And for a computer, separating these two things has immediate advantages.

Let me explain the concept on a minimal coding challenge. The original challenge can be found here (by the way, codewars.com, despite the militaristic theming, is an abundant source of fun coding exercises):

Write a function that takes in a string of one or more words, and returns the same string, but with all five or more letter words reversed

Stop gninnipS My sdroW!

If you don’t want to be spoiled for this specific kata, please don’t read any further. The solution I use to explain the concept isn’t very elegant, though:

public static String reverseLongWords(String sentence) {
String result = "";
for (String each : sentence.split(" ")) {
if (each.length() >= 5) {
result += new StringBuilder(each).reverse().toString();
} else {
result += each;
}
result += " ";
}
return result.trim();
}

Please don’t mind the use of simple string concatenation or the clumsy way of string reversal. My point arises in the last two lines. Because we collect our words, reversed or not, directly in the result, we have to awkwardly remove the unnecessary blank that we just appended from it.

One reason for this subsequent correction is the missing separation in the two phases (specification and execution). Instead, we determine what strings the result should contain and build the result in one step.

Let’s separate the two phases, first in theory and then in code. We know how a specification of a sentence can look like because we already use one in the for loop. If we want to specify the resulting sentence without really building it, we need to store the words without their separators. The result of our specification phase would be a list of words, reversed or not. The execution phase takes our specification and transforms it into the required result. In our case, we just put blanks between the words:

public static String reverseLongWords2(String sentence) {
// building the render model (specification phase)
List<String> renderModel = new ArrayList<String>();
for (String each : sentence.split(" ")) {
if (each.length() >= 5) {
renderModel.add(new StringBuilder(each).reverse().toString());
} else {
renderModel.add(each);
}
}

// rendering the model (execution phase)
return String.join(" ", renderModel);
}

The resulting code is very similar, with one crucial difference: The first phase doesn’t create the result, but a model of it. We can call this model a “render model” and the execution phase the “render stage”. It sounds a little bit excessive for such a small task, but this is really the heart of the idea. When you separate your algorithm into the two phases, you’ll get a render model between them.

This render model has some advantages: You can test it independently from the actual representation. You can add transformation steps more easily before you commit it to the target format. If you need to build the render model iteratively, it can provide helpful methods that would be missing in the target format.

Another advantage: Your execution/render phase is more independent from the previous work. Imagine that we would want our words comma separated, not blank separated. The first algorithms relies on a hidden dependency between the blank character and the trim() method. In the second algorithm, you only need to change the rendering part of the code. It can work independently from the previous logic.

This way to partition an algorithm varies from the straightforward way that humans use when we perform the task in the real world. We tend to keep at least a partial render model in our head alongside the rendered result. If we would do the word spinning ourselves, but with the commas, we would recognize or “just know” that we write the last word and not include the trailing comma. This “just knowing” is information from our mental render model.

In my experience, it pays off to refactor an algorithm into a variation that uses the two phases or design it like this from the start. Revealing the hidden dependencies in the logic is a beneficial influence on the defect rate. Making the rendering step independent promotes testability and evolvability. It seems more work at first, but in my case, that was just adaptation effort caused by my problem solving habits.