Arrow Anti-Pattern

When you write code, it can happen that you nest some ifs or loops inside each other. Here is an example:

Because of the shape of the indentation, this code smell is called an anti-arrow pattern. The deepest indentation depth is the tip of the arrow. In my opinion, such a style is detrimental to readability and comprehension.

In the following, I would like to present a simple way of resolving such arrow anti-patterns.

Extract Method

First we extract the arrow pattern as a new method. This allows us to use return values instead of variable assignments and makes the code clearer.

public string PrintElephantMessage(Animal animal)
{
    Console.WriteLine(IsAElephant(animal));
}
public string IsAElephant(Animal animal)
{
    if (animal.IsMammal())
    {
        if (animal.IsGrey())
        {
            if (animal.IsBig())
            {
                if (animal.LivesOnLand())
                {
                    return "It is an elephant";
                }
                else
                {
                    return "It is not an elephant. Elephants live on land";
                }
            }
            else
            {
                return "It is not an elephant. Elephants are big";
            }
        }
        else
        {
            return "It is not an elephant. Elephants are grey";
        }
    }
    else
    {
        return "It is not an elephant. Elephants are mammals";
    }
}

Turn over ifs

A quick way to eliminate the arrow anti-pattern is to invert the if conditions. This will make the code look like this:

public string IsAElephant(Animal animal)
{
    if (!animal.IsMammal())
    {
        return "It is not an elephant. Elephants are mammals";
    }
    if (!animal.IsGrey())
    {
        return "It is not an elephant. Elephants are grey";
    }
    if (!animal.IsBig())
    {
        return "It is not an elephant. Elephants are big";
    }
    if (!animal.LivesOnLand())
    {
        return "It is not an elephant. Elephants live on land";
    }
    return "It is an elephant";
}

Some IDEs like Visual Studio can help flip the Ifs.

Conclusion

Arrow anti-pattern are code smells and make your code less legible. Fortunately, you can refactor the code with a few simple steps.

How to migrate a create-react-app project to vite

It seems that the React community is finally accepting that their old way of scaffolding a new projects, create-react-app (CRA in short), has outlived its usefulness. While there is no official statement about that, there was no update on npm in about a year, which in the JS universe screams “TOXIC WASTE” in very clear words, and meanwhile also has vanished from the official “Start a new React Project” docs.

In search for possibilities, one can do some quick google searches (e.g. this or that or maybe this) and at the moment, I’m giving vite a chance and it has not disappointed me yet, as the opposite:

  • the build definitely feels faster (as the French would say: plus vite), but I never quantified it
  • that over 9000 deprecation warnings one was accustomed to using CRA – gone TO ZERO
  • and the biggest point, no dependency on webpack. Webpack has this weird custom to introduce brutally breaking changes between their versions and then you have to polyfill Node JS core modules or whatever floats their boat, giving users not a choice – i.e. making it highly TOXIC in itself

But still, the react-scripts which CRA employs have played quite a role in development, as it also helped with the “npm start” development server and also as a test runner – so generally, if you have developed your project over some years, you might have relied on it quite a bit, and now you don’t want to recreate everything from scratch.

I recently migrated one of our projects and this is what worked for me. There were three main concerns

  • switch the general infrastructure to vite, so we can develop and build again
  • introduce vitest as a test runner
  • migrate Redux store tests specifically

Let’s focus today on the thing without tests and I will come back to that next time.

Migrate to vite INFRASTRUCTURE

This was actually surprisingly concise, I just had to

npm install -D vite @vitejs/plugin-react
npm uninstall react-scripts

(when in doubt, remove the node_modules folder and run npm install again, but I didn’t have to), then I adjusted package.json to:

  "scripts": {
    "start": "vite",
    "build": "vite build", 
  },

You might prefer to call your dev server via “npm run dev” instead of “npm start”, in that case just replace the "start": "vite" with "dev": "vite" above.

The Vite templates prefer to include a script "preview": "vite preview" but I do not use it, so I didn’t copy that.

It also was required to set this package.json entry:

  // somewhere top-level, i.e. next to "version" or somewhere like that
  "type": "module",

(I’m not entirely sure whether we can now safely remove the “browserslist” or “babel” entries from the package.json because they might be useless now, but I will have to think about in another minute.)

Now, some real code changes. One of the larger todos here might be to make sure that every JSX-containing source file ends with .jsx – there have been discussions about this and beforehand, it was still possible to just place your <App/> etc. inside an App.js, but vite does not like that anymore, so this is a thing you have to do.

So the code changes amount to:

  • Rename every .js file which has some JSX in it to .jsx – pro tip: do it via the IDE so you do not have to care for every import / require-Statement manually!
  • move the template in ./public/index.html directly to ./index.html and in there, replace every mentioning of %PUBLIC_URL% just by the single slash /
  • In the index.html <body>, include your index.jsx e.g. like:
  <body>
    <noscript>You need to enable JavaScript to run this app.</noscript>
    <div id="root"></div>
    <script type="module" src="/src/index.jsx"></script>
  </body>

It might be said that the vite templates like to call their index file “main.jsx”, but it’s not important – just match whatever you put inside the <script src="..."/>.

Now in order not to change your habits too much, i.e. keep your CI build as it is, plus maybe some Docker Dev Containers or even browser bookmarks, you can use this vite.config.js – see docs:

import { defineConfig } from 'vite';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  server: {
    port: 3000,
    host: true
  },
  build: {
    outDir: './build'
  },
});

otherwise, vite prefers to run its dev server on port 5173 (guess it’s Leetspeak) and build in ./dist – just so you know.

Addon: Using ReactComponents from SVGs with Vite. Also with refs.

Since today morning, when I wrote this article, I already learned something new. In another project we were importing SVG files via the approach

import {ReactComponent as Bla} from "./bla.svg";

const ExampleUsage = () => {
  return <Bla />;
};

Doing so now results in

Uncaught SyntaxError: ambiguous indirect export: ReactComponent

This can be solved by npm install vite-plugin-svgr and then updating vite.config.js:

import {defineConfig} from "vite";
import react from "@vitejs/plugin-react";
import svgr from "vite-plugin-svgr";

export default defineConfig({
    plugins: [
        svgr({
            svgrOptions: {
                ref: true,
            },
        }),
        react(),
    ],
    server: {
        port: 3000,
        host: true,
    },
    build: {
        outDir: "./build",
    },
});

The { svgrOptions: {ref: true} } was a specific requirement for our use case, it is necessary if you ever want to access the imported ReactComponents ref; i.e. in our ExampleUsage we needed a specification <Bla ref={...}/> . Leaving the svgrOption ref then at false (its default) gives us the error:

Warning: Function components cannot be given refs. Attempts to access this ref will fail. Did you mean to use React.forwardRef()?

Then, Make the tests work again

As mentioned above, these were a bit trickier, and while I found a way to leave most tests untouched, there was some specific tweaking to be done with Redux store tests, and also with mocking a foreign class (GraphQLClient from “graphql-request” in my case).

But as also mentioned above, I guess this might be a topic for my next blog post. In case you urgently need that knowledge, drop us a mail or something.. 🙂

Have we made things too easy?

One of the old mantras for API design is “Make doing the right thing easy and the wrong thing hard”. This, of course, applies to much broader topics as well, such as software development or UX.

For software development specifically, are we maybe making “doing the wrong thing” too easy as well? Here are a two examples:

Web Requests

In the old times, requesting data from a web server required first setting up the request, sending it, and then getting the result back to your application either via polling or callbacks. Dave Mark once adequately called this solving the “waiting problem”. It was cumbersome, to say the least. It was clear that making such a request was something to be avoided. You did it when you had to, but you avoided setting up too many different kinds of requests implictly.

Nowadays, with the advent anonymous functions/lambdas in most mainstream programming languages, continuations became the new way handle these things: do_request(...).then(result -> ...) This already made this a lot easier. And even better, now we have some form of coroutines in many languages were you can just do result = await do_request(...). It even looks almost like a normal function call.

With this, programmers can just do requests one after the other. Need one thing from a server? Do one request. Need ten things from a server? Do ten requests. Of course, this is horribly wasteful: each request will incur the full overhead of http/https and a server roundtrip. In the old times, doing the request was painful, so you automatically looked for ways to avoid doing more, and bundle your asks into one request, argueable leading to a better program.

Dependencies

Before nice package-managers where a thing, handling dependencies was a huge pain. You would have to manually get, unpack, configure and install the dependency for each developer and/or consumer system. As a consequence, libraries were big and often duplicated foundational things. But it also caused developers carefully grooming their library selections.

Now with package managers, libraries have started to become small. Duplication within libraries certainly seems to have decreased, and the average library size has decreased. But this also caused developers to be much less cautious when adopting a dependency, with package managers handling thousands of dependencies that no one developer can possibly have a full understanding of. And this then leads to things like the leftpad disaster.

Better or worse?

I am pretty sure that both having nice abstractions to deal with asynchronicity and package managers are good things. But if they make certain things too easy, how can we deal with that? The only thing I can currently think of is figuratively sticking warning-labels on these things during review time, but because those things are now so easy and subtle, it is also easy to miss them.

Are there other examples were we maybe made the wrong thing too easy? Do you have any ideas how to deal with this problem?

Grails Domain update optimisation

As many readers may know we are developing and maintaining some Grails applications for more than 10 years now. One of the main selling points of Grails is its domain model and object-relational-mapper (ORM) called GORM.

In general ORMs are useful for easy and convenient development at the cost of a bit of performance and flexibility. One of the best features of GORM is the availability of several flexible APIs for use-cases where dynamic finders are not enough. Let us look at a real-world example.

The performance problem

In one part of our application we have personal messages that are marked as read after viewing. For some users there can be quite a lot messages so we implemented a “mark all as read”-feature. The naive implementation looks like this:

def markAllAsRead() {
    def user = securityService.loggedInUser
    def messages = Messages.findAllByUserAndTimelineEntry.findAllByAuthorAndRead(user, false)
    messages.each { message ->
        message.read = true
        message.save()
    }
    Messages.withSession { session -> session.flush()}
 }

While this is both correct and simple it only works well for a limited amount of messages per user. Performance will degrade because all the domain objects are loaded into domain objects, then modified and save one-by-one to the session. Finally the session is persisted to the database. In our use case this could take several seconds which is much too long for a good user experience.

DetachedCriteria to the rescue

GORM offers a much better solution for such use-cases that does not sacrifice expressiveness. Instead it offers a succinct API called “Where Queries” that creates DetachedCriteria and offers batch-updates.

def markAllAsRead() {
    def user = securityService.loggedInUser
    def messages = Messages.where {
        read == false
        addressee == user
    }
    messages.updateAll(read: true)
}

This implementation takes only a few milliseconds to execute with the same dataset as above which is de facto native SQL performance.

Conclusion

Before cursing GORM for bad performance one should have a deeper look at the alternative querying APIs like Where Queries, Criteria, DetachedCriteria, SQL Projections and Restrictions to enhance your ORM toolbox. Compared to dynamic finders and GORM-methods on domain objects they offer better composability and performance without resorting to HQL or plain SQL.

The Optional Wildcast

This blog post presents a particular programming technique that I happen to use more often in recent months. It doesn’t state that this technique is superior or more feasible than others. It’s just a story about a different solution to an old programming problem.

Let’s program a class hierarchy for animals, in particular for mammals and birds. You probably know where this leads up to, but let’s start with a common solution.

Both mammals and birds behave like animals, so they are subclasses of it. Birds have the additional behaviour of laying eggs for reproduction. We indicate this feature by implementing the Egglaying interface.

Mammals feed their offsprings by giving them milk. There are two mammals in our system, a cow and the platypus. The cow behaves like the typical mammal and gives a lot of milk. The platypus also feeds their young with milk, but only after they hatched from their egg. Yes, the platypus is a rare exception in that it is both a mammal and egglaying. We indicate this odd fact by implementing the Egglaying interface, too.

If our code wants to access the additional methods of the Egglaying interface, it has to check if the given object implements it and then upcasts it. I call this type of cast “wildcast” because they seem to appear out of nowhere when reading the code and seemingly don’t lead up or down the typical type hierarchy. Why would a mammal lay eggs?

One of my approaches that I happen to use more often recently is to indicate the existence of real wildcast with a Optional return type. In theory, you can wildcast from anywhere to anyplace you want. But only some of these jumps have a purpose in the domain. And an explicit casting method is a good way to highlight this purpose:

public abstract class Mammal {
	public Optional<Egglaying> asEgglaying() {
		return Optional.empty();
	}
}

The “asEgglaying()” method might return an Egglaying object, or it might not. As you can see, on default, it returns only an empty Optional. This means that no cow, horse, cat or dog has to think about laying eggs, they just aren’t into it by default.

public class Platypus extends Mammal implements Egglaying {
	@Override
	public Optional<Egglaying> asEgglaying() {
		return Optional.of(this);
	}
}

The platypus is another story. It is the exception to the rule and knows it. The code “Optional.of(this)” is typical for this coding technique.

A client that iterates over a collection of mammals can now incorporate the special case with more grace:

for (Mammal each : List.of(mammals())) {
	each.lactate();
	each.asEgglaying().ifPresent(Egglaying::breed);
}

Compare this code with a more classic approach using a wildcast:

for (Mammal each : List.of(mammals())) {
	each.lactate();
	if (each instanceof Egglaying) {
		((Egglaying) each).breed();
	}
}

My biggest grief with the classic approach is that the instanceof is necessary for the functionality, but not guided by the domain model. It comes as a surprise and has no connection to the Mammal type. In the Optional wildcast version, you can look up the callers of “asEgglaying()” and see all the special code that is written for the small number of mammals that lay eggs. In the classic approach, you need to search for conditional upcasts or separate between code for birds and special mammal code when looking up the callers.

In my real-world projects, this “optional wildcast” style facilitates domain discovery by code completion and seems to lead me to more segregated type systems. These impressions are personal and probably biased, so I would like to hear from your experiences or at least opinions in the comments.

Cancellation Token in C#

Perhaps you know this problem from your first steps with asynchronous functions. You want to call them, but a CancellationToken is requested. But where does it come from if you do not have one at hand? At first glance, a quick way is to use an empty CancellationToken. But is that really the solution?

In this blog article, I explain what CancellationTokens are for and what the CancellationTokenSource is for.

Let us start with a thought experiment: You have an Application with different Worker threads. The Application is closed. How can you make the whole Application with all the Worker threads close?
One solution would be a global bool variable “closed”, which is set in this case. All Workers have a loop built in, which regularly checks this bool and terminates if it is set. Basically, the Application and all Workers must cooperate and react to the signal.

class Application
{
    public static bool closed = false;
    Application()
    {
        var worker= new Worker();
        worker.Run();

        Thread.Sleep(2500);
        closed = true;

        // Wait for termination of Tasks
        Thread.Sleep(2500);
    }
}
class Worker
{
    public async void Run()
    {
        bool moreToDo = true;
        while (moreToDo)
        {
            await DoSomething();
            if (Application.closed)
            {
                return;
            }
        }
    }
}

This is what the CancellationToken can be used for. The token is passed to the involved Workers and functions. Throught it, the status of whether or not it was cancelled can be queried. The token is provided via a CancellationTokenSource, which owns and manages the token.

class Application
{
    Application()
    {
        var source = new CancellationTokenSource();
        var token = source.Token;
        var worker = new Worker();
        worker.Run(token);

        Thread.Sleep(2500);
        source.Cancel();

        // Wait for cancellation of Tasks and dispose source
        Thread.Sleep(2500);
        source.Dispose();
    }
}
class Worker
{
    public async void Run(CancellationToken token)
    {
        bool moreToDo = true;
        while (moreToDo)
        {
            await DoSomething(token);
            if (token.IsCancellationRequested)
            {
                return;
            }
        }
    }   
}

The call to source.Cancel() sets the token to canceled. CancellationTokens and sources offer significantly more built-in possibilities, such as callbacks, WaitHandle, CancelAfter and many more.

So using an empty CancellationToken is not a good solution. This means that cancellations cannot be passed on and a task could continue to run in the background even though the Application should have been terminated a long time ago.

I hope I have been able to bring the topic a little closer to you. If you have already encountered such problems and found solutions, please leave a comment.

Using Docker Containers in Development with WebStorm: Next Iteration

We are always in pursue of improving our build and development infrastructures. Who isn’t?

At Softwareschneiderei, we have about five times as many projects than we have developers (without being overworked, by the way) and each of that comes with its own requirements, so it is important to be able to switch between different projects as easily as cloning a git repository, avoiding meticulous configuration of your development machines that might break on any change.

This is the main advantage of the development container (DevContainer) approach (with Docker being the major contestant at the moment), and last November, I tried to outline my then-current understanding of integrating such an approach with the JetBrains IDEs. E.g. for WebStorm, there is some kind of support for dockerized run configurations, but that does some weird stuff (see below), and JetBrains did not care enough yet to make that configurable, or at least to communicate the sense behind that.

Preparing our Dev Container

In our projects, we usually have at least two Docker build stages:

  • one to prepare the build platform (this will be used for the DevContainer)
  • one to execute the build itself (only this stage copies actual sources)

There might be more (e.g. for running the build in production, or for further dependencies), but the basic distinction above helps us to speed up the development process already. (Further reading: Docker cache management)

For one of our current React projects (in which I chose to try Vite in favor of the outdated Create-React-App, see also here), the Dockerfile might look like

# --------------------------------------------
FROM node:18-bullseye AS build-platform

WORKDIR /opt
COPY package.json .
COPY package-lock.json .

# see comment below
RUN npm install -g vite

RUN npm ci --ignore-scripts
WORKDIR /opt/project

# --------------------------------------------
FROM build-platform AS build-stage

RUN mkdir -p /build/result
COPY . .
CMD npm run build && mv dist /build/result/app

The “build platform” stage can then be used as our Dev Container, from the command line as (assuming, this Dockerfile resides inside your project directory where also src/ etc. are chilling)

docker build -t build-platform-image --target build-platform .
docker run --rm -v ${PWD}:/opt/project <command_for_starting_dev_server>

Some comments:

  • The RUN step to npm install -g vite is required for a Vite project because the our chosen base image node:18-bullseye does not know about the vite binaries. One could improve that by adding another step beforehand, only preparing a vite+node base image and taking advantage of Docker caching from then on.
  • We specifically have to take the WORKDIR /opt/project because our mission statement is to integrate the whole thing with WebStorm. If you are not interested in that, that path is for you to choose.

Now, if we are not working against any idiosyncrasies of an IDE, the preparation step “npm ci” gives us all our node dependencies in the current directory inside a node_modules/ folder. Because this blog post is going somewhere, already now we chose to place that node_modules in the parent folder of the actual WORKDIR. This will work because for lack of an own node_modules, node will find it above (this fact might change with future Node versions, but for now it holds true).

The Challenge with JetBrains

Now, the current JetBrains IDEs allow you to run your project with the node interpreter (containerized within the node-platform image) in the “Run/Debug Configurations” window via

“+” ➔ “npm” ➔ Node interpreter “Add…” ➔ “Add Remote” ➔ “Docker”

then choose the right image (e.g. build-platform-image:latest).

Now enters that strange IDE behaviour that is not really documented or changeable anywhere. If you run this configuration, your current project directory is going to be mounted in two places inside the container:

  • /opt/project
  • /tmp/<temporary UUID>

This mounting behaviour explains why we cannot install our node_modules dependencies inside the container in the /opt/project path – mounting external folders always override anything that might exist in the corresponding mount points, e.g. any /opt/project/node_modules will be overwritten by force.

As we cared about that by using the /opt parent folder for the node_modules installation, and we set the WORKDIR to be /opt/project one could think that now we can just call the development server (written as <command_for_starting_dev_server> above).

But we couldn’t!

For reasons that made us question our reality way longer than it made us happy, it turned out that the IDE somehow always chose the /tmp/<uuid> path as WORKDIR. We found no way of changing that. JetBrains doesn’t tell us anything about it. the “docker run -w / --workdir” parameter did not help. We really had to use that less-than-optimal hack to modify the package.json “scripts” options, by

 "scripts": {
    "dev": "vite serve",
    "dev-docker": "cd /opt/project && vite serve",
    ...
  },

The “dev” line was there already (if you use create-react-app or something else , this calls that something else accordingly). We added another script with an explicit “cd /opt/project“. One can then select that script in the new Run Configuration from above and now that really works.

We do not like this way because doing so, one couples a bad IDE behaviour with hard coded paths inside our source files – but at least we separate it enough from our other code that it doesn’t destroy anything – e.g. in principle, you could still run this thing with npm locally (after running “npm install” on your machine etc.)

Side note: Dealing with the “@esbuild/linux-x64” error

The internet has not widely adopteds Vite as a scaffolding / build tool for React projects yet and one of the problems on our way was a nasty error of the likes

Error: The package "esbuild-linux-64" could not be found, and is needed by esbuild

We found the best solution for that problem was to add the following to the package.json:

"optionalDependencies": {
    "@esbuild/linux-x64": "0.17.6"
}

… using the “optionalDependencies” rather than the other dependency entries because this way, we still allow the local installation on a Windows machine. If the dependency was not optional, npm install would just throw an wrong-OS-error.

(Note that as a rule, we do not like the default usage of SemVer ^ or ~ inside the package.json – we rather pin every dependency, and do our updates specifically when we know we are paying attention. That makes us less vulnerable to sudden npm-hacks or sneaky surprises in general.)

I hope, all this information might be useful to you. It took us a considerable amount of thought and research to come to this conclusion, so if you have any further tips or insights, I’d be glad to hear from you!

Unit-Testing Deep-Equality in C#

In the suite of redux-style applications we are building in C#, we are making extensive use of value-types, which implies that a value compares as equal exactly if all of its contents are equal also known as “deep equality”, as opposed to “reference equality” or “shallow equality”. Both of those imply deep equality, but the other way around is not true. The same object is of course equal to itself, not matter how deep you look. And an object that references the same data as another object also has equal content. But a simple object that contains different lists with equal content will be unequal under shallow comparison, but equal under deep comparison.

Though init-only records already provide a per-member comparison as Equals be default, this fails for collection types such as ImmutableList<> that, against all intuition but in accordance to , only provide reference-equality. For us, this means that we have to override Equals for any value type that contains a collection. And this is were the trouble starts. Once Equals is overridden, it’s extremely easy to forget to also adapt Equals when adding a new property. Since our redux-style machinery relies on a proper “unequal”, this would manifest in the application as a sporadically missing UI update.

So we devised a testing strategy for those types, using a little bit of reflection:

  1. Create a sample instance of the value type with no member retaining its default value
  2. Test, by going over all properties and comparing to the same property in a default instance, if indeed all members in the sample are non-default
  3. For each property, run Equals the sample instance to a modified sample instance with that property set to the value from a default instance.

If step 2 fails, it means there’s a member that’s still at its default value in the sample instance, e.g. the test wasn’t updated after a new property was added. If step 3 fails, the sample was updated, but the new property is not considered in Equals – and it can even tell which property is missing.

The same problems of course arise with GetHashCode, but are usually less severe. Forgetting to add a property just makes collisions more likely. It can be tested much in the same way, but can potentially lead to false positives: collisions can occur even if all properties are correctly considered in the function. In that case, however, the sample can usually be altered to remove the collision – and it is really unlikely. In fact, we never had a false positive.

Format-based sorting looks clever, but is dangerous

A neat trick I learnt early in my career, even before I learnt about version control, was how to format a date as a string so that alphabetically sorted lists would contain them in the “correct” order:

“YYYYMMDD” is the magic string.

If you format your dates as 20230122 and 20230123, the second name will be sorted after the first one. With nearly any other format, your date strings will not be sorted chronologically in the file system.

I’ve found out that this is also nearly the only format that most people cannot intuitively recognize as a date. So while it is familiar with me and conveniently sorted, it is confusing or at least in need of explanation for virtually every user of my systems.

Keep that in mind when listening to the following story:

One project I adopted is a custom enterprise resource planning system that was developed by a single developer that one day left the company and the code behind. The software was in regular use and in dire need of maintenance and new features.

One concept in the system is central to its users: the list of items in an invoice or a bill of delivery. This list contains items in a defined order that is important to the company and its customers.

To my initial surprise, the position of an item in the list was not defined by an integer, but a string. This can be explained by the need of “sub-positions” that form a hierarchy of items, like in this example:

1 – basic item

1.1 – item upgrade #1

1.2 – item upgrade #2

Both positions “1.1” and “1.2” are positioned “underneath” position “1” and should be considered glued to it. If you move position “1” to position “4”, you also move 1.1 to 4.1 and 1.2 to 4.2.

But there was a strange formatting thing going on with the positions: They were stored as strings in the database, but with a strange padding in front. Instead of “1”, “2” and “3”, the entries contained the positions ” 1″, ” 2″ and ” 3″. All positions were prefixed with two space characters!

Well, nearly all positions. As soon as the list grew, the padding turned out to be dependent on the number of digits in the position: ” 9″, but then ” 10″ and “100”.

The reason can be found relatively simple: If you prefix with spaces (or most other characters, maybe “0”), your strings will be ordered in a numerical way. Without the prefixes, they would be sorted like “1”, “10”, “11”, “2”.

That means that the desired ordering of the positions is hardcoded in the database representation! You probably already thought about the case of a position greater than 999. That’s when trouble begins! Luckily, an invoice with a thousand items on the list is unheard of in the company (yet!).

Please note that while the desired ordering is hardcoded in the database, the items are still loaded in a different order (as they were entered into the system) and need to be sorted by the application. The default sorting for strings is the alphabetical order, so the original developer probably was clever/lazy, went with it and formatted the data in a way that would produce the result and not require additional logic during the sorting.

If you look at the code, you see seemingly strange formatting calls to the position all over the place. This is necessary because, for example, every time a user enters a position into the system, it needs to be reformatted (or at least sanitized) in order to adhere to the “auto-sortable” format.

If you wonder how a hierarchical sub-position looks like with this format, its ” 1. 1″, ” 1. 10″ and even ” 1. 17. 2. 4″. The database stores mostly blanks in this field.

While this approach might seem clever at the moment, it is highly dangerous. It conflates several things that should stay separated, like “storage format” and “display format”, “item order” or just “valid value range”. It is a clear violation of the “separation of concerns” principle. And it broke the application when I missed one place where the formatting was required, but not present. Of course, this only manifests in a problem when your test cases (or manual tries) exceed a list of 9 entries – lesson learnt here.

I dread the moment when the company calls to tell me about this “unusually large invoice” that exceeds the 999 limit. This would mean a reformatting of all stored data or another even more clever hack to circumvent the problem.

Did you encounter a format that was purely there for sorting in the wild? What was the story? Tell us in the comments!

How comments get you through a code review

Code comments are a big point of discussion in software development. How and where to use comments. Or should you comment at all? Is the code not enough documentation if it is just written well enough? Here I would like to share my own experience with comments.

In the last months I had some code reviews where colleagues looked over my merge requests and gave me feedback. And it happened again and again that they asked questions why I do this or why I decided to go this way.
Often the decisions had a specific reason, for example because it was a customer requirement, a special case that had to be covered or the technology stack had to be kept small.

That is all metadata that would be tedious and time-consuming for reviewers to gather. And at some point, it is no longer a reviewer, it is a software developer 20 years from now who has to maintain the code and can not ask you questions any more . The same applies if you yourself adjust the code again some time later and can not remember your thoughts months ago. This often happens faster than you think. To highlight how fast details disappear here is a current example: This week I set up a new laptop because the old one had a hardware failure. I did all the steps only half a year ago. But without documentation, I would not have been able to reconstruct everything. And where the documentation was missing or incomplete, I had to invest effort to rediscover the required steps.

Example

Here is an example of such a comment. In the code I want to compare if the mixer volume has changed after the user has made changes in the setup dialog.

var setup = await repository.LoadSetup(token);

var volumeOld = setup.Mixers.Contents.Select(mixer=>mixer.Volume).ToList();

setup = Setup.App.RunAsDialog(setup, configuration);

var volumeNew = setup.Mixers.Contents.Select(mixer=>mixer.Volume).ToList();
if (volumeNew == volumeOld)
{
     break;
}
            
ResizeToMixerVolume(setup, volumeOld);

Why do I save the volume in an additional variable instead of just writing the setup into a new variable in the third line? That would be much easier and more elegant. I change this quickly – and the program is broken.

This little comment would have prevented that and everyone would have understood why this way was chosen at the moment.

// We need to copy the volumes, because the original setup is partially mutated by the Setup App.
var volumeOld = setup.Mixers.Contents.Select(mixer=>mixer.Volume).ToList();

If you annotate such prominent places, where a lot of brain work has gone into, you make the code more comprehensible to everyone, including yourself. This way, a reviewer can understand the code without questions and the code becomes more maintainable in the long run.