Grammar as a leaky abstraction

Internationalisation, or i18n for short, is the process of making the user interface of a program ready for translation into multiple languages. This usually means factoring out the texts from the program source code into separate files, often called translation bundles. These files have a key-value structure. The program code then refers only to keys, which are resolved into the actual texts from the translation bundle for the selected target language.

Here’s a simple example of two translation bundles, one for English and one for German:

# translations_en.properties
quit_confirm_message=Do you really want to quit?
yes_option=Yes
no_option=No
# translations_de.properties
quit_confirm_message=Wollen Sie die Anwendung wirklich beenden?
yes_option=Ja
no_option=Nein

The actual source code might look like this:

var answer = showDialog(
  t("quit_confirm_message"),
  t("yes_option"),
  t("no_option")
);

Here the function t looks up the key in the currently active translation bundle and returns the translated message as a string.
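As a minimal sketch (with hypothetical names; bundle parsing and language selection are assumed to happen elsewhere), t could look like this:

// Hypothetical minimal lookup; the bundles are assumed to be
// parsed into plain key-value maps already.
const bundles = {
  en: { quit_confirm_message: "Do you really want to quit?", yes_option: "Yes", no_option: "No" },
  de: { quit_confirm_message: "Wollen Sie die Anwendung wirklich beenden?", yes_option: "Ja", no_option: "Nein" },
};

let currentLanguage = "de";

function t(key) {
  // Fall back to the key itself so that missing translations remain visible.
  return bundles[currentLanguage][key] ?? key;
}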

How not to do it

Last month I discovered an amusing attempt at internationalisation in a third-party code base. It looked similar to this:

# translations_en.properties
a=A
an=An
is=is
not=not
available=available
article=article
# translations_de.properties
a=Ein
an=Ein
is=ist
not=nicht
available=verfügbar
article=Artikel

These translation keys were used like this:

"${t('an')} ${t('article')} ${t('is')}
${available ? '' : t('not')} ${t('available')}."

to produce messages like

"An article is available."
"An article is not available."

… or in German:

"Ein Artikel ist verfügbar."
"Ein Artikel ist nicht verfügbar."

Why is this a clumsy attempt at internationalisation?

Because it uses single words as translation units, and it relies on the fact that English and German have the same sentence structure in this particular case. In general, of course, languages do not have the same sentence structure, not even related languages like English and German.

The author of the code also introduced separate translation keys for “a” and “an”. The German translation for both keys was “ein”. The author had simply been lucky that, so far, all texts with “a” or “an” in this particular program translated to “ein” in German, and not to “eine”, “einen”, “einem”, “einer”, or “eines”.

How to do it

So what would be the correct way to do it? The internationalisation should have looked like this:

# translations_en.properties
article_available=An article is available.
article_not_available=An article is not available.
# translations_de.properties
article_available=Ein Artikel ist verfügbar.
article_not_available=Ein Artikel ist nicht verfügbar.

The source code is then reduced to a simple choice between the two keys:

available ? t("article_available")
          : t("article_not_available")

By using whole phrases and sentences as translation units, the translations into the various languages are free to use their own word order and grammatical structure.

Keeping in touch with your Jenkins pipeline jobs

We have been using continuous integration (CI) at the Softwareschneiderei for many years now. Our CI platform of choice has historically been Jenkins, which was called Hudson back in the day.

Things have moved on since then, and the integration with GitLab got a lot better with the advent of multibranch pipeline jobs. This kind of job allows you to automatically build branches and merge requests within the same job while keeping the builds separate.

Another cool feature of Jenkins is job configuration as code, defined in a Jenkinsfile and used in pipeline jobs. That way, it is easy to create and maintain a job configuration alongside your project’s source code inside your repository. There is no need anymore to click through pages of web UIs to configure your job. As an additional benefit, you get the complete change history of the job configuration.

I prefer using scripted instead of declarative pipelines for Jenkinsfiles because they give me more control, freedom and power. But as always, this power and flexibility comes at a price…

Sending out build notifications

In my case I wanted to always send out build notifications regardless of the job result. This is quite easy if you have plugins like the Mattermost Notification Plugin or one of the mail plugins. Since our pipeline script consists of Groovy code, this seems quite straightforward: put the notification code into a try-finally block:

node {
    try {
        stage ('Checkout and build') {
            checkout scm
            // Do something to build our project
        }
        // Maybe some additional stages like testing, code-analysis, packaging and deployment
    } finally {
        stage ('Notify') {
            mattermostSend "${env.JOB_NAME} - ${currentBuild.displayName} finished with Status [${currentBuild.currentResult}] (<${env.BUILD_URL}|Open>)"
        }
    }
}

Unfortunately, this pipeline script will always return SUCCESS as the build result! Even if someone aborts the job execution or a stage in the try-block fails…

Managing build status

So the seasoned programmer probably already knows the fix: Setting the build result in appropriate catch-blocks:

node {
    try {
        stage ('Checkout and build') {
            checkout scm
            // Do something to build our project
        }
        // Maybe some additional stages like testing, code-analysis, packaging and deployment
    } catch (Exception e) {
        if (e in org.jenkinsci.plugins.workflow.steps.FlowInterruptedException) {
            currentBuild.result = 'ABORTED'
        } else {
            echo "Exception: ${e.class}, message: ${e.message}"
            currentBuild.result = 'FAILURE'
        }
    } finally {
        stage ('Notify') {
            mattermostSend "${env.JOB_NAME} - ${currentBuild.displayName} finished with Status [${currentBuild.currentResult}] (<${env.BUILD_URL}|Open>)"
        }
    }
}

You can control the granularity and the exceptions thrown by your build steps at will and implement exactly the status reporting that you want. The available statuses are defined in hudson.model.Result, so feel free to implement whatever build status management best fits your project.
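For example, here is a sketch (with a hypothetical test script) that reports failing tests as UNSTABLE instead of failing the whole build:

node {
    try {
        stage ('Tests') {
            try {
                sh './run_tests.sh' // hypothetical test runner
            } catch (Exception e) {
                // Report failing tests without aborting the remaining stages
                currentBuild.result = hudson.model.Result.UNSTABLE.toString()
            }
        }
    } finally {
        stage ('Notify') {
            mattermostSend "${env.JOB_NAME} - ${currentBuild.displayName} finished with Status [${currentBuild.currentResult}] (<${env.BUILD_URL}|Open>)"
        }
    }
}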

From multiplayer Pac-Man to a twenty year old company

This blog post does not contain big insights. It’s just the story of the very first days of our company, which happens to celebrate its 20th anniversary this month. And because most stories begin a lot earlier than when the narrator begins to tell them, I’ll try to tell this one from the start.

It starts with an eight-year-old boy that has access to his very first personal computer, a Tandon 8088 with 8 MHz. Just to put this glorified pocket calculator into today’s perspective: a basic Arduino board has more power. But back in the day, this personal computer was a magical tool that could act as all kinds of things, including a gaming machine. One of the first games on this machine was Pac-Man, in 80×25 character ASCII “graphics” and without any scoreboard or competitive element. It was strictly single-player, and the computer-controlled ghosts acted strictly by their algorithms, so it became a repetitive chore rather soon. The boy would play the usual route, add some new steps at the end and watch the ghosts react. After some time, the boy could predict the ghosts’ reactions and plan the new steps with accuracy, clearing level after level. The ghosts never adapted.

By the age of twelve, the boy knew that he would become a “computer engineer”. Every occupational counselor (two in total) advised against this decision, not because it was bad, but because the counselors didn’t know anything about the profession. But the boy stuck to his decision and began his studies in computer science immediately after school was over. This was in 1997, when the internet still made sounds and you could ruin an hour-long download just by picking up the phone.

The boy, now a young man far from home, studied basic computer science for six months until the semester break arrived. Most other students returned home, but he stayed and teamed up with other students still on campus. They planned to program a computer game. A Pac-Man game, but with multiplayer abilities. One team would be “the players” or “pac-men”, the other team would be “the ghosts”. If there weren’t a dozen human players in front of the keyboard, the computer would control the remaining avatars. Game controls worked with a split keyboard and – planned for later versions – over the network.

The only way the students knew how to organize the project was to transform one room into a computer-ridden workshop and hack away. Every horizontal platform in the room became a desk. The project was supposed to happen within a span of 24 hours. Today, this would be called a “game jam”. After 24 hours, all we had was a map. No game, no players, nothing exciting – just the future game’s map. But we agreed to continue working until the game was finished.

It took the three students a whole week. A week without much sleep, with sloppy food and lots of source code. Because we didn’t know about version control yet (nobody had told us, and we didn’t set up a local network anyway), we had to structure the code in a way that would allow us to work on different parts without collisions and transfer them from computer to computer using floppy disks. We had to maintain a list of modified files and did so on a central whiteboard that the young man had bought at the beginning of his studies. This whiteboard became the planning area where we kept track of our modifications, tasks and concepts, including the stereotypical post-it notes. In hindsight, you could call it a chaotic story board. Without the whiteboard, we probably would have failed.

But after the week, the game was finished. We had developed a multiplayer Pac-Man in Java, complete with graphics, sounds and multi-threading. It was playable! We named it “Hubert 2D”, a reference to both “Duke Nukem 3D”, a very 90s game, and to one of our most famous fellow students. The game was blazingly fast – so fast, in fact, that you often lost track of your avatar. The unofficial motto of the game turned out to be “where am I?”. It was crammed with features. Just a Pac-Man where you could gobble up little pills and evade the ghosts was not enough for us. First, there was Hubert, the boss ghost. He appeared randomly and could not be player-controlled. He had a rocket launcher. If you defeated Hubert, you could grab the rocket launcher and, well, launch rockets. How can you defeat a rocket-launching ghost in Pac-Man? With your chainsaw, evidently. Players could pick up chainsaws to defend themselves against the ghosts. Ghosts could pick up energy shields to defend themselves against the chainsaws. Players could place mines to blow up ghosts that didn’t pay attention. Ghosts could place bombs to create new passageways to evade the mines or blow up the players. Sheep wandered around cluelessly, being blown up by mines, bombs, chainsaws or rockets and generally acting like mobile roadblocks. Teleporters added to the confusion by instantly teleporting you to either another teleporter or a random place on the map (leading to the infamous “where am I?”). But above all, you could poison and heal other avatars with various potions. Taking everything into account, this wasn’t Pac-Man anymore. This was a team deathmatch that lasted until all the pills on the map were gobbled up accidentally.

Two funny moments during development and testing (aka playing) will always stay in my memory:

  • You could poison an avatar, but also heal it with medicine. Being healed was indicated by a “hallelujah” sound effect. But because every new avatar on the map was created in the “healed” state, we had a serious “hallelujah” epidemic going on. It took us way longer than it should have to connect the dots and eliminate the sound effect during avatar creation.
  • Every avatar on the map moved at the same speed. Some avatars like bombs or mines decided not to move at all, others like sheep and Hubert only moved sometimes, but rockets flew twice as fast. So it was not possible to outrun a rocket. Because of this imbalance in power, we deemed the Hubert boss invincible in close combat. You could not walk up to him without facing a rocket that reached you at least a tile before you could employ your chainsaw. We were proven wrong when one player used an energy shield in combination with a chainsaw and a hallway corner to sneak up on Hubert, neutralize the first rocket with the energy shield and defeat Hubert with the chainsaw before the second rocket could be fired. Because we had thought that Hubert was invincible, this move didn’t gain any in-game points. But the moment turned legendary immediately.

This week of intensive teamwork, combined with the result of an actual game, provided us with the trust and groundwork for future collaboration. So it was no wonder that, just a few semesters later, we came up with the idea of selling this collaboration ability in the form of a software development company. We were more knowledgeable, better equipped and had practiced working together multiple times. What better time than now?

So we founded our company, the Softwareschneiderei (“software tailoring”), in late 2000, twenty years ago. Because we really meant it, we invested the money to create a limited liability company and had to learn, in a very short time, all the topics and obligations that follow such a step. We were still studying at university, but working for our own company, in a rented office, in every free minute. Our primary goal was to finish our studies with a degree. Our secondary goal was to let the company survive long enough to make it the primary goal after graduation. The plan worked out, and here we are, twenty years later.

Statistics say that only one out of ten companies survives its first five years. Even after that, keeping a company afloat is not all smooth sailing. Somehow, we made it. Despite all our mistakes and misconceptions (and there were many, most of them on a more serious level than deeming Hubert invincible), we developed our company in a way that provides benefit for our customers and profit for our employees.

And in a corner of my desk drawer, there is still a 3.5″ floppy disk labelled “Hubert 2D”. Because that’s the source code that got this company started, 23 years ago.

Bridging Eons in Web Dev with Polyfills

Indeed, web development is kind of peculiar. On the one hand, there is hardly a field in which new technologies supersede each other at such a pace, creating very exciting opportunities ranging from quickly sketching out proofs of concept to the efficient construction of real-world applications. On the other hand, there is this strange air of browser dependency, and with any new technology one acquires, there is always the question of whether it is just some temporary fashion or here to stay.

Which is why it happens that one would like to quickly scaffold a web application on the basis of React and its ecosystem, but has the requirement that the customer is – either voluntarily or forced by higher powers – using some legacy browser like Internet Explorer 11, for which Microsoft has recently announced the end of support for 30th November this year. Which doesn’t sound nice for the… *searching quickly* … 5% of desktop/laptop users that still use this old horse, but then again, how long can you cling to an outdated thing?

For the daily life of a web developer, whose mind is full of the peculiarities that the evolution of the ECMAScript standard (which basically is JavaScript) brought along, there is the practical helper caniuse.com, telling you, for every feature of your code you want to know about, which browsers and devices support it and which don’t.

But what about whole frameworks? When I recently went on my quest for an IE11-compatible React app, I already feared that I would need to double-check all my doing at every corner, especially given that for the development itself, one is certainly advised to use one of the browsers that come with quite some helpful developer tools, like extensions for React, Redux, etc. – but also the features of the built-in console, where it makes your life a lot easier when you can inspect a certain state as a fully interactive display of object properties instead of a string of “[object Object]”. Sorry, IE11, there are reasons why you have to go.

But then I figured that my request is maybe not that far outside the range of rather widespread use cases, so the chances that someone had already tried to tackle the problem weren’t so hopeless. And indeed, it works pretty straightforwardly:

  • Install “react-app-polyfill”, e.g. via npm:
npm install react-app-polyfill
  • At the very top of your index.js, add for good measure:
import "react-app-polyfill/ie11";
import "react-app-polyfill/stable";
  • Include “IE 11” (with quotes) in your package.json under “browserslist”, as a new entry both under “production” and “development”, as sketched below
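The surrounding entries in this sketch are the create-react-app defaults and may look different in your project:

"browserslist": {
  "production": [
    ">0.2%",
    "not dead",
    "not op_mini all",
    "IE 11"
  ],
  "development": [
    "last 1 chrome version",
    "last 1 firefox version",
    "last 1 safari version",
    "IE 11"
  ]
}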

That should do it. There are people on the internet who advise removing the “node_modules/.cache” directory when doing this in an existing project.

The term “polyfill” is actually derived from a kind of putty (the Polyfilla brand), which is a nice picture: it is all about filling the gaps, allowing a developer to use accustomed features while still targeting the actual production environment.

Another very useful helper in this undertaking – strictly speaking a Babel transform rather than a polyfill – was…

// install 
npm install --save-dev @babel/plugin-transform-arrow-functions

// then add to the "babel" > "plugins" config array:
"babel": {
    "plugins": [
      "@babel/plugin-transform-arrow-functions"
    ]
  }

… as I find the new-fashioned arrow function notation quite useful.

So, this seems to bridge (most of) the gaps one encounters in this web-dev world where use cases span eons of technology evolution. Now, do you know any more useful polyfills that make your life easier?

React for the algebra enthusiast – Part 1

When I learned to use the react framework, I always had the feeling that it is written in a very mathy way. Since simple googling did not give me any hints as to whether this was a consideration in its design, I thought it might be worth sharing my thoughts on it. I should mention that I am sure others have made the same observations, but it might help algebraists to understand react faster and mathy computer scientists to remember some algebra.

Free monoids

In abstract algebra, a monoid is a set M together with a binary operation “\cdot” satisfying these two laws:

  • There is a neutral element “e”, such that: \forall x \in M: x \cdot e = e \cdot x = x
  • The operation is associative, i.e. \forall x,y,z \in M: x \cdot (y\cdot z) = (x\cdot y) \cdot z

Here are some examples:

  • Any set with exactly one element together with the unique choice of operation on it.
  • The natural numbers \mathbb{N}=\{0,1,2,\dots \} with addition.
  • The one-based natural numbers \mathbb{N}_1=\{1,2,3,\dots\} with multiplication.
  • The Integers \mathbb Z with addition.
  • For any set M, the set of maps from M to M is a monoid with composition of maps.
  • For any set A, we can construct the set List(A), consisting of all finite lists of elements of A. List(A) is a monoid with concatenation of lists. We will denote lists like this: [1,2,3,\dots]

Monoids of the form List(A) are called free. By “of the form” I mean that the elements of the sets can be renamed so that sets and operations are the same. For example, the monoid \mathbb{N} with addition and List(\{1\}) are of the same form, witnessed by the following renaming scheme:

0 \mapsto []

1 \mapsto [1]

2 \mapsto [1,1]

3 \mapsto [1,1,1]

\dots

— so addition and appending lists are the same operation under this identification.

With the exception of \mathbb{N}_1, the integers and the monoid of maps on a set, all of the examples above are free monoids. There is also a nice abstract definition of “free”, but for the purpose at hand – describing a special kind of monoid – it is good enough to say that a monoid M is free if there is a set A such that M is of the form List(A).

Action monoids

A react-app (and by that I really mean a react+redux app) has a set of actions. An action always has a type, which is usually a string, and a possibly empty list of arguments.

Let us stick to a simple app for now, where each action just has a type and nothing else. And let us further assume that actions can appear in arbitrary sequences, that is, any action can be fired in any state. The latter simplification will keep us clear of more advanced algebra for now.

For a react-app, sequences of actions form a free monoid. Let us look at a simple example: Suppose our app is a counter which starts with “0” and has an increment (I) and a decrement (D) action. Then sequences of actions can be represented by strings like

ID, IIDID, DDD, IDI, …

which form a free monoid with juxtaposition of strings. I have to admit, so far this is not very helpful for a practitioner – but I am pretty sure the next step has at least some potential to help in a complicated situation:

Quotients

Quotients of sets by an equivalence relation are a very basic tool of modern math. For a monoid, it is not clear whether a quotient of its underlying set will still be a monoid with the “same” operation.

Let us look at an example, where everything goes well. In the example from above, the counter should show the same integer if we decrement and then increment (or the other way around). So we could say that the two action sequences

  • ID and
  • DI

really do nothing and should be considered equivalent to the empty action sequence. So let us say that any sequence of actions is equivalent to the same sequence with any occurrence of “DI” or “ID” deleted. So, for example, we get:

IIDIIDD \sim I

With this rule, we can reduce any sequence to an equivalent one that is a sequence of Is, a sequence of Ds, or empty. So the quotient monoid can be identified with the integers (in two different ways, but that’s ok), and addition corresponds to juxtaposition of action sequences.
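To make this concrete, here is a minimal redux-style sketch (a hypothetical counter app, not code from any specific library): the reducer realizes exactly this quotient, with the integer state standing for the equivalence class of the action sequence fired so far.

// Counter reducer: the state is the quotient monoid, an integer.
const reducer = (state = 0, action) => {
  switch (action.type) {
    case 'I': return state + 1;
    case 'D': return state - 1;
    default: return state;
  }
};

// Equivalent action sequences produce the same state:
// "IIDIIDD" and "I" both take 0 to 1.
const run = (sequence) =>
  sequence.split('').map((type) => ({ type })).reduce(reducer, 0);
run('IIDIIDD') === run('I'); // true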

The point of this example, and the moral of this post, is that we can take a syntactic description (the monoid of action sequences), which is easy to derive from the source code, and look at a quotient of the action monoid by a reasonable relation to arrive at some algebraic structure which has a lot to do with the semantics of the app.

So the question remains whether this just works well for one example or whether we have a general recipe.

Here is a problem in the general situation: Let x,y,z\in M be elements of a monoid M with operation “\cdot” and let \sim be an equivalence relation such that x is identified with y. Then, denoting equivalence classes with [\_], it is not clear if [x] \cdot [z] should be defined to be [x\cdot z] or [y\cdot z].

Fortunately, problems like that disappear for free monoids like our action monoid and for equivalence relations constructed in a specific way. As you can see on wikipedia, it is always ok to take the equivalence relation generated by the same kind of identifications we made above: pick some pairs of sequences which are known “to do the same” from a semantic point of view (like “ID” and “DI” did the same as the empty sequence) and declare two sequences to be equivalent if they arise from one another by replacing subsequences known to do the same.

So the approach is exactly that general: it works for apps where actions do not have parameters and can be fired in any order, and for equivalence relations generated by declaring finitely many action sequences to do the same. The “any order” part is a real restriction, but this post also has a “Part 1” in the title…

Crashes when returning references to vector elements

Recently, I was experiencing a strange crash that I traced to a piece of C++ code looking more or less like this:

template <class T>
class container
{
public:
  std::vector<T> values_;
  T default_;

  T const& get() const
  {
    if (values_.empty())
      return default_;
    return values_.front();
  }
};

This was crashing when calling get(), with a non-empty values_ member. It looks fairly innocent. And it ran in production for a couple of years already. So what changed?

I had, in fact, never instantiated this template with T = bool before. And that was causing the crash, while still compiling without any errors. Now, if you are a little versed in the C++ standard library, you might know that std::vector<bool> is a special snowflake indeed. In an effort to save space and, I suspect, to prove the usefulness of template specializations, it is not really a “normal” container holding bool values. Instead, it holds some integer type and packs each pseudo-bool into one of its bits. The consequence is that accessor functions like operator[], front() and back() cannot return a reference to a bool. Instead, they return a “proxy” object that supports assignment to and from a bool.

Back to the get() function: it tries to return a reference to a bool. Of course, that bool doesn’t really exist except as a temporary, and so this results in a dangling reference that causes a segmentation fault when used.

I suspect there could have been a warning about a dangling reference somewhere. I have seen clang-tidy report things like this (with a few false positives, too), but it did not show up for me. To fix it, I am now just returning a bool instead of a bool const& for T = bool. A special case in my code to work around a special case in std::vector.
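Here is a sketch of that fix (assuming C++17; the alias name is mine, not from the production code):

#include <type_traits>
#include <vector>

template <class T>
class container
{
public:
  std::vector<T> values_;
  T default_;

  // Return by value for bool, because std::vector<bool>::front()
  // only yields a temporary proxy; return by const& otherwise.
  using get_result = std::conditional_t<std::is_same_v<T, bool>, T, T const&>;

  get_result get() const
  {
    if (values_.empty())
      return default_;
    return values_.front();
  }
};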

Contiguous date ranges in Oracle SQL

In one of my last posts from a couple of weeks ago I wrote about querying gaps between non-contiguous date ranges in Oracle SQL. This week’s post is about contiguous date ranges.

While non-contiguous date ranges are best represented in a database table with a start_date and an end_date column, it is better to represent contiguous date ranges by only one date column, so that we avoid redundancy and do not have to keep the start date of a date range in sync with the end date of the previous one. In this post I will use the start date:

CREATE TABLE date_ranges (
name VARCHAR2(100),
start_date DATE
);

The example content of the table is:

NAME	START_DATE
----	----------
A	05/02/2020
B	02/04/2020
C	16/04/2020
D	01/06/2020
E	21/06/2020
F	02/07/2020
G	05/08/2020

This representation means that the date range with the most recent start date does not have an end. The application using this data model can choose whether to interpret this as a date range with an open end or just as the end point of the previous range and not as a date range by itself.

While this is a nice non-redundant representation, it is less convenient for queries where we want both a start and an end date per row, for example in order to check whether a given date lies within a date range or not. Luckily, we can transform the ranges with a query:

SELECT
date_ranges.*,
LEAD(date_ranges.start_date)
OVER (ORDER BY start_date)
AS end_date
FROM date_ranges;

As in the previous post on non-contiguous date ranges, the LEAD analytic function allows you to access the following row from the current row without using a self-join. Here’s the result:

NAME	START_DATE	END_DATE
----	----------	--------
A	05/02/2020	02/04/2020
B	02/04/2020	16/04/2020
C	16/04/2020	01/06/2020
D	01/06/2020	21/06/2020
E	21/06/2020	02/07/2020
F	02/07/2020	05/08/2020
G	05/08/2020	(null)

By using a WITH clause, you can use this query like a view and join it with another table, for example with the join condition that a date lies within a date range:

WITH ranges AS
(SELECT date_ranges.*, LEAD(date_ranges.start_date) OVER (ORDER BY start_date) AS end_date FROM date_ranges)
SELECT timeseries.*, ranges.name
FROM timeseries LEFT OUTER JOIN ranges ON
timeseries.measurement_date
BETWEEN ranges.start_date AND ranges.end_date;
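One caveat: BETWEEN never matches when end_date is NULL, so dates falling into the most recent, open-ended range would get no name. If that range should match as well, the join needs a substitute end date; here is a sketch using a far-future date (an assumption about the desired semantics):

WITH ranges AS
(SELECT date_ranges.*, LEAD(date_ranges.start_date) OVER (ORDER BY start_date) AS end_date FROM date_ranges)
SELECT timeseries.*, ranges.name
FROM timeseries LEFT OUTER JOIN ranges ON
timeseries.measurement_date BETWEEN ranges.start_date
AND NVL(ranges.end_date, DATE '9999-12-31');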

Co-Variant methods on C# collections

C# offers a powerful API for working with collections, and especially LINQ offers lots of functional goodies to work with them. Among them is the Concat()-method, which allows you to concatenate two IEnumerables.

We recently had the use-case of concatenating two collections with elements of a common super-type:

class Animal {}
class Cat : Animal {}
class Dog : Animal {}

public IEnumerable<Animal> combineAnimals(IEnumerable<Cat> cats, IEnumerable<Dog> dogs)
{
  // XXX: This does not work because Concat is invariant!!!
  return cats.Concat(dogs);
}

The above example does not work because Concat() requires both sequences to have the same element type and returns a combined sequence of that type. If we do not care about the specifics of the subclasses, we can build a Concatenate()-method ourselves which makes the whole thing possible, because instances of both subclasses can be put into a collection of their common parent class.

private static IEnumerable<TResult> Concatenate<TResult, TFirst, TSecond>(
  this IEnumerable<TFirst> first,
  IEnumerable<TSecond> second)
    where TFirst: TResult where TSecond : TResult
{
  IList<TResult> result = new List<TResult>();
  foreach (var f in first)
  {
    result.Add(f);
  }
  foreach (var s in second)
  {
    result.Add(s);
  }
  return result;
}

The above method is a bit clunky to call but works as intended:

public IEnumerable<Animal> combineAnimals(IEnumerable<Cat> cats, IEnumerable<Dog> dogs)
{
  // Works great!
  return cats.Concatenate<Animal, Cat, Dog>(dogs);
}
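By the way, the eager copy into a list can be avoided: here is a sketch of a lazy variant using an iterator method, which keeps LINQ’s usual deferred execution:

private static IEnumerable<TResult> Concatenate<TResult, TFirst, TSecond>(
  this IEnumerable<TFirst> first,
  IEnumerable<TSecond> second)
    where TFirst : TResult where TSecond : TResult
{
  // Yield the elements one by one instead of materializing a list.
  foreach (var f in first)
  {
    yield return f;
  }
  foreach (var s in second)
  {
    yield return s;
  }
}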

A variant of the above Concatenate()-method can be useful if you use a collection of the parent class to collect instances of several subclass collections:

private static IEnumerable<TResult> Concatenate<TResult, TIn>(
  this IEnumerable<TResult> first,
  IEnumerable<TIn> second)
    where TIn : TResult
{
  IList<TResult> result = first.ToList();
  foreach (var item in second)
  {
    result.Add(item);
  }
  return result;
}

public IEnumerable<Animal> combineAnimals(IEnumerable<Cat> cats, IEnumerable<Dog> dogs)
{
  IEnumerable<Animal> result = new List<Animal>();
  result = result.Concatenate(cats);
  return result.Concatenate(dogs);
}

Maybe the above examples can serve as an inspiration for more utility methods that may improve working with collections in C#.

Make yourself comfortable

If you have to add a new feature to an existing code base, you’ve likely already experienced an uncomfortable truth: Nobody has thought about your use case. Nothing in the existing code base fits your goals. This isn’t because everybody wanted you to fail, but because your new feature is in fact brand new to the software, and responsible software developers stop working on a code base as soon as their work is done (while obeying the project’s definition of done).

So you try to shoehorn your functionality into the existing code. It’s not neat, but you get it working. Are you done? In my opinion, you haven’t even started yet. Your first attempt to combine your idea of the best implementation of your functionality with the existing system will be cumbersome and painful. Use it as a learning experience of how the system behaves, and throw it away once you get a working prototype. Yes, you’ve read that right. Throw it away, as in: undo your commits or changes. This was the learning/exploration phase of your implementation. You’ve applied your idea to reality. It didn’t work well. Now is the time to apply reality to your idea. Commence your second attempt.

For your second attempt, you should make use of your refactoring skills on the existing code. Bend it to your anticipated and tried needs. And once the code base is ready, drop your new feature into the new code “socket”. Your work doesn’t need to be cumbersome and painful. Make yourself comfortable, then make it work.

Here is an example, based on a real case:

An existing system was in development for many years and worked with a lot of domain objects. One domain object was a price tag that looked something like this:

interface PriceTag {
    PriceCategory category();
    TaxGroup taxGroup();
    Euro nettoAmount();
    Product forProduct();
}

Well, it was a normal domain object giving back other normal domain objects. The new feature was to be an audio module that could read price tags out loud. The team used a text-to-speech synthesizing library that takes a string and outputs an audio stream. No big deal, and pretty independent from the already existing code base.

But the code that takes a price tag and converts it into a string, aka the connection point between the unbound library code and the existing system, was ugly and undecipherable:

String priceTagToText(PriceTag price) {
    return price.forProduct().getDenotation()
        + " for only "
        + CurrencyFormatter.format(price.nettoAmount())
        + " with "
        + String.valueOf(price.taxGroup().percentage())
        + " % VAT in the "
        + price.category().getDenotation()
        + " section.";
}

This is how it looks if somebody tries to combine two building blocks that aren’t meant for each other. To test this method, you’ll have to mock deep into the domain objects.

If two building blocks don’t match naturally, maybe it’s an idea to add some lubrication code between them. This code isn’t exactly doing anything newfound, but it adds a requirement seam that points towards the existing system:

interface ReadablePriceTag {
    String denotation();
    String netto();
    String vatPercentage();
    String category();
}

You can probably already see where this is heading. Just in case you cannot, I will take you through all parts of the code.

First we can write a priceTagToText() method that reads a lot nicer:

String priceTagToText(ReadablePriceTag price) {
    return price.denotation()
        + " for only "
        + price.netto()
        + " with "
        + price.vatPercentage()
        + " VAT in the "
        + price.category()
        + " section.";
}

The second and complementary part is the implementation of the ReadablePriceTag interface that is given a PriceTag object and translates the data for the new methods:

class PriceTagBasedReadablePriceTag implements ReadablePriceTag {
    private final PriceTag price;

    PriceTagBasedReadablePriceTag(PriceTag price) {
        this.price = price;
    }

    @Override
    public String denotation() {
        return this.price.forProduct().getDenotation();
    }

    @Override
    public String netto() {
        return CurrencyFormatter.format(this.price.nettoAmount());
    }

    @Override
    public String vatPercentage() {
        return String.valueOf(
                this.price.taxGroup().percentage()) + " %";
    }

    @Override
    public String category() {
        return this.price.category().getDenotation();
    }
}
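This also pays off in the tests: instead of mocking deep into the domain objects, a test for priceTagToText() can use a trivial stand-in. A sketch, assuming JUnit and that the method is accessible from the test:

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.Test;

class PriceTagToTextTest {
    @Test
    void readsWholePriceTagAsOneSentence() {
        // A hand-written stand-in; no deep mocking required.
        ReadablePriceTag price = new ReadablePriceTag() {
            public String denotation() { return "Coffee"; }
            public String netto() { return "4,99 EUR"; }
            public String vatPercentage() { return "7 %"; }
            public String category() { return "beverages"; }
        };
        assertEquals(
                "Coffee for only 4,99 EUR with 7 % VAT in the beverages section.",
                priceTagToText(price));
    }
}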

Basically, you have a lot of existing code that is using PriceTag objects and some new code that wants to use ReadablePriceTag objects. The PriceTagBasedReadablePriceTag class is the connector between both worlds (at least in one direction). We can definitely argue about the name, but that’s a detail, not the main point. The main points of all this effort are two things:

  1. The new code does not suffer in quality and readability from decisions made at a different time in a different context.
  2. The code clearly models these contexts. If you are aware of Domain Driven Design, you probably see the “Bounded Context” border that crosses right between PriceTag and ReadablePriceTag. The PriceTagBasedReadablePriceTag class is the bridge across that border.

If you express your context borders explicitly like in this example, your code reads fine on either side of the border. There is no notion of “old and fitting” and “new and awkward” code. It seems like additional work, and it surely is, but it pays off in the long run because you can play this game indefinitely. A code base that gets more muddied and forced with time will reach a breaking point after which any effort requires knowledge in archeology and cryptanalysis.

So, my advice boils down to one thing: Make yourself comfortable when adding new code to an existing code base.
And then, think about your type names. PriceTagBasedReadablePriceTag is most likely not the best name for it. But that’s a topic for another blog post. What would be your name for this class?

What is it with Software Development and all the clues to manage things?

As someone who started programming a long time ago (roughly 20 years, now that I think about it), but only entered the world of real software development in recent years, I find that mastering the day-to-day challenges consists of two main topics: First, in our rapidly evolving field we never run out of new technologies to learn, and second, there is a certain engineering aspect underlying it all – how to do things in a certain manner – with lots of input every year.

So after I recently shared some of these ideas with my friends (I indeed still have a few of those), I wondered: How is it that in the modern software development world, most of the information about managing things actually comes from the field itself, feeding its ideas of project management, quality, etc. back into the non-software subspaces of the world? (Ideas like the Agile movement, Software Craftsmanship, and the calls for doing things Lean and Clean have nowadays prospered so much that you see their application or modification in several other industries. Like advertisement, just as an example.)

I see a certain kind of brain food in this question. What sets software development apart from other current fields, so that there is a broad discussion and considerable input at its base level? After all, if you plan on becoming someone who builds houses, makes cars, or manages cities, you wouldn’t engage in such a vivid culture of “how” to do things, focusing rather on the “what”.

Of course, I might be mistaken in this view. But by asking what actually sets software development apart from these other fields of producing something, I find food for thought that is helpful for approaching everyday tasks and for valuing better tips over worse ones.

So, what can that be?

1. Quite peculiar is the low entry threshold for being able to call yourself a “programmer”. With all the resources you get at relatively little cost (assumption: you have a computer with a working internet connection), you have a lot of channels through which you can learn the “what” of software development first and save the “how” for later. If you plan on building a house, there isn’t a bazillion of books, tutorials, and videos, after all.

2. Similarly, there is the rather low cost of failure when drafting a quick hobby project. Not every piece of code that you write in your free time will tell you “hey man, have you ever thought about some better kind of architecture?” – which is why bad habits can stick and even feel right. If you choose the “wrong” mindset, you don’t always lose heaps of money, nor do you automatically lose if you switch your strategy once in a while (you probably will, though, if you are too careless in the process).

3. Furthermore, there is the dynamic extension of how your project is going to be used (“scope creep”). One would build a skyscraper in a different way than a bungalow (I’m not an expert, though), but with software, it often feels like adding a simple feature here, extending the scope there, until you hit a point where all its interdependencies are in a complex state of conflict…

4. Then, it’s a matter of transparency: If you sit in a badly designed car, it becomes rather obvious when it keeps exhausting clouds of black smoke. Or when your house always smells like a freshly used toilet. Of course, a well-designed piece of software will come with a great user experience, but as you can see in many commercial products, there is also quite some presence of lower-than-average-but-still-somehow-doing-what-it-should software. Probably users are more tolerant with software than with cars?

5. Also, as in most technical fields, it is not the case that “pure consultants” are widely received in a positive light. For most nerds, you don’t get a lot of credibility if you talk about best practices without having gotten your hands dirty over a longer period of time. Ergo, it takes experienced software programmers to advise less experienced software programmers… though surely, it’s questionable whether this is a good thing.

6. After all, the requirements for someone who develops a project might be very different in each field. From my academic past in computational physics, I know that there is quite some demand for “quick & dirty” solutions. Need to add some dark matter to your model here? Well, plug this formula in and check the results. Not every user has the budget or liberty to create a solid structure for their program. If you want a new laboratory building, of course, you very much want it to be designed as well as it can get.

All in all, these observations somehow boil down to the question of whether software development is to be seen more as a set of various engineering skills, or rather as a handcraft, an art, or a complex program of study. It is the question of whether the “crack” in this field is the one who does complex arithmetic in their head, or the one who just gets what the customer wants. I like thinking about such peculiar modes of thought, as they help me understand what kinds of things I should learn next.

Or is there something else to it entirely?