Many Algorithms Benefit From a Partition in Two Phases

When I program code that solves a specific problem, I often design the algorithm in a way that mirrors my approach to solving the problem in the real world. That’s not a bad idea – the resulting algorithm can be thought through in a straightforward manner. But it lacks in one area: The separation of specification and execution. And for a computer, separating these two things has immediate advantages.

Let me explain the concept on a minimal coding challenge. The original challenge can be found here (by the way, codewars.com, despite the militaristic theming, is an abundant source of fun coding exercises):

Write a function that takes in a string of one or more words, and returns the same string, but with all five or more letter words reversed
Stop gninnipS My sdroW!

If you don’t want to be spoiled for this specific kata, please don’t read any further. The solution I use to explain the concept isn’t very elegant, though:

public static String reverseLongWords(String sentence) {
    String result = "";
    for (String each : sentence.split(" ")) {
        if (each.length() >= 5) {
            result += new StringBuilder(each).reverse().toString();
        } else {
            result += each;
        }
        result += " ";
    }
    return result.trim();
}

Please don’t mind the use of simple string concatenation or the clumsy way of string reversal. My point arises in the last two lines. Because we collect our words, reversed or not, directly in the result, we have to awkwardly remove the unnecessary blank that we just appended from it.

One reason for this subsequent correction is the missing separation in the two phases (specification and execution). Instead, we determine what strings the result should contain and build the result in one step.

Let’s separate the two phases, first in theory and then in code. We know how a specification of a sentence can look like because we already use one in the for loop. If we want to specify the resulting sentence without really building it, we need to store the words without their separators. The result of our specification phase would be a list of words, reversed or not. The execution phase takes our specification and transforms it into the required result. In our case, we just put blanks between the words:

public static String reverseLongWords2(String sentence) {
    // building the render model (specification phase)
    List<String> renderModel = new ArrayList<String>();
    for (String each : sentence.split(" ")) {
        if (each.length() >= 5) {
            renderModel.add(new StringBuilder(each).reverse().toString());
        } else {
            renderModel.add(each);
        }
    }
    
    // rendering the model (execution phase)
    return String.join(" ", renderModel);
}

The resulting code is very similar, with one crucial difference: The first phase doesn’t create the result, but a model of it. We can call this model a “render model” and the execution phase the “render stage”. It sounds a little bit excessive for such a small task, but this is really the heart of the idea. When you separate your algorithm into the two phases, you’ll get a render model between them.

This render model has some advantages: You can test it independently from the actual representation. You can add transformation steps more easily before you commit it to the target format. If you need to build the render model iteratively, it can provide helpful methods that would be missing in the target format.

Another advantage: Your execution/render phase is more independent from the previous work. Imagine that we would want our words comma separated, not blank separated. The first algorithms relies on a hidden dependency between the blank character and the trim() method. In the second algorithm, you only need to change the rendering part of the code. It can work independently from the previous logic.

This way to partition an algorithm varies from the straightforward way that humans use when we perform the task in the real world. We tend to keep at least a partial render model in our head alongside the rendered result. If we would do the word spinning ourselves, but with the commas, we would recognize or “just know” that we write the last word and not include the trailing comma. This “just knowing” is information from our mental render model.

In my experience, it pays off to refactor an algorithm into a variation that uses the two phases or design it like this from the start. Revealing the hidden dependencies in the logic is a beneficial influence on the defect rate. Making the rendering step independent promotes testability and evolvability. It seems more work at first, but in my case, that was just adaptation effort caused by my problem solving habits.

Thank you for sharing your thoughts and experience on partitioning algorithms into two phases.
But aren’t there actually three of them? I think, the code structure (in terms of readability and testability) could benefit even further from separating model building from model transformation as well.

We then would have three phases:

1. Building the render model
2. Transforming the model or model items – maybe in multiple steps
3. Rendering the model

In the example above, this leads to three functions (is there a way to format code in comments here?):

// phase 1
static List buildRenderModel(String sentence) {
return Arrays.asList(sentence.split(” “));
}

// phase 2
static String reverseLongWord(String word) {
if (word.length() >= 5) {
return new StringBuilder(word).reverse().toString();
}
return word;
}

// phase 3
static String renderTheModel(List renderModel) {
return String.join(” “, renderModel);
}

Now we have the building blocks to write down the algorithm a bit cleaner.
As a bonus, we could implement the rendering slightly different:

// phase 3 – alternative
static Collector renderTheModel() {
return Collectors.joining(” “);
}

Which finally enables us to use Java’s streaming API and write:

static String reverseLongWords3(String sentence) {
return buildRenderModel(sentence)
.stream()
.map(Transformer::reverseLongWord)
.collect(renderTheModel());
}

3 thoughts on “Many Algorithms Benefit From a Partition in Two Phases”

Ronny Händel says:

December 17, 2023 at 3:08 pm

Thank you for sharing your thoughts and experience on partitioning algorithms into two phases.
But aren’t there actually three of them? I think, the code structure (in terms of readability and testability) could benefit even further from separating model building from model transformation as well.

We then would have three phases:

1. Building the render model
2. Transforming the model or model items – maybe in multiple steps
3. Rendering the model

In the example above, this leads to three functions (is there a way to format code in comments here?):

// phase 1
static List buildRenderModel(String sentence) {
return Arrays.asList(sentence.split(” “));
}

// phase 2
static String reverseLongWord(String word) {
if (word.length() >= 5) {
return new StringBuilder(word).reverse().toString();
}
return word;
}

// phase 3
static String renderTheModel(List renderModel) {
return String.join(” “, renderModel);
}

Now we have the building blocks to write down the algorithm a bit cleaner.
As a bonus, we could implement the rendering slightly different:

// phase 3 – alternative
static Collector renderTheModel() {
return Collectors.joining(” “);
}

Which finally enables us to use Java’s streaming API and write:

static String reverseLongWords3(String sentence) {
return buildRenderModel(sentence)
.stream()
.map(Transformer::reverseLongWord)
.collect(renderTheModel());
}

1. daniel.lindner says:
  
  December 18, 2023 at 10:15 am
  
  Hello Ronny,
  Wow! That is a nice continuation of my idea and I agree. The resulting algorithm in this case is a variation of the classic “filter – map – reduce” schema, as you pointed out in your Java stream example.
  
  May I use this stream of thoughts in my lectures? It would be a nice cross-over of algorithm design, refactoring and Java Streams (or transformation pipelines in general).
  
  Thank you!
  
  1. Anonymous says:
    
    December 18, 2023 at 7:45 pm
    
    Hello Daniel,
    I’m happy to contribute. Feel free to use it in your lectures.
    
    Unfortunately, WordPress has eaten the generics parameters. But I’m sure you will figure them out.

	Anonymous on Cache configuration with WildF…
	Miq on Nested queries like N+1 in pra…
	mariuselvert on Creating functors with lambda…
	Nested queries like… on Common SQL Performance Gotchas…
	Nested queries like… on Make your users happy by not c…

Share this:

Related

3 thoughts on “Many Algorithms Benefit From a Partition in Two Phases”

Leave a reply to daniel.lindner Cancel reply