Don’t test details from a distance

The concept described in this blog entry has evoked a lot of different metaphors and descriptions from our team when we discussed it. So don’t take my words or thoughts on it as the one true way to talk about it – the concept of the “testing gap” or the distance between the code under test and the test’s vantage point.

Before I describe my metaphor for it with some weird visuals, let’s look at some code:

public Budget(String denotation, int maximumHours) {
    this.denotation = denotation;
    this.maximumHours = maximumHours;
    this.currentHours = maximumHours;
}

This is the constructor for an entity, a domain class that represents a budget of work hours that gets slowly used up when you work for the customer’s project. There is not much going on in this code except one little detail of the domain: New budgets always start fully “filled up”, in that the currentHours are set to the maximumHours. You can’t create a budget that is already half empty with this code.

Such a domain concept or “business rule” requires a test that ensures it is still in place:

@Test
public void has_initially_current_hours_set_to_maximum() {
    Budget target = new Budget(
        "current is maxed",
        100
    );
    assertThat(target.getMaximumHours()).isEqualTo(100);
    assertThat(target.getCurrentHours()).isEqualTo(100);
}

This is a fairly boring unit test that ensures that freshly created budgets have all their working hours still available.

In our example, the entity lives in the core of a web application that provides an endpoint to create new budgets. We have a test for the endpoint, of course:

@Test
public void stores_new_budget() throws Exception {
    this.web.perform(
        post("/budgets")
        .contentType(MediaType.APPLICATION_JSON)
        .content("{\"denotation\": \"new budget\", \"maximumHours\": 300}")
    )
    .andExpect(status().isOk())
    .andExpect(content().json("{\"denotation\": \"new budget\", \"maximumHours\": 300, \"currentHours\": 300}"));
}

You can shudder at the code formatting or the necessity to escape your JSON data into inscrutability. At least the second problem more or less disappears with current Java versions. But that’s not the point today. The point is that this is effectively the same test as above, but with a gap in between.
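
As an aside, with text blocks (available since Java 15), the escaping problem indeed disappears. Here is a sketch of the same test, assuming the same MockMvc setup as above:

@Test
public void stores_new_budget_with_text_blocks() throws Exception {
    this.web.perform(
        post("/budgets")
        .contentType(MediaType.APPLICATION_JSON)
        .content("""
            {"denotation": "new budget", "maximumHours": 300}
            """)
    )
    .andExpect(status().isOk())
    .andExpect(content().json("""
        {"denotation": "new budget", "maximumHours": 300, "currentHours": 300}
        """));
}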

If you wrote just the second test, your code coverage metrics would probably not decrease. Your business rule would still be tested. So why write the first test if it adds nothing to the safety net?

This can be explained with the idea that there is a considerable “testing gap” between the second test and the business rule. It covers the entity’s constructor code and states explicitly that the currentHours property should be set to the same value as the maximumHours property. But it also defines the communication protocol as being HTTP, the data format as being JSON, and travels through code that finds an “endpoint” for the given URL, maps the given JSON to a constructor call and serializes the resulting object as JSON back to the requester. That is a lot of padding just to test the constructor’s third line.

The first test has virtually no testing gap. It knows nothing about the web, data formats or whatever else the application consists of. It just looks at the entity and its behaviour in isolation.

There are perfectly valid reasons to write the second test, but it should not be the only test that ensures the business rule in our example. The second test “sees too much” from its vantage point to pay attention to a little detail like the business rule.

In case you didn’t quite get the concept of the testing gap yet, here is how I imagine it in my head: If your code under test is a mystery box (really try to picture a shoebox made of cardboard that rattles when you move it), then your test is a big floating eye that uses little cracks and holes in the box to get a quick peek inside. If you exhibit state via getter methods like in our example, the eye verifies the internal state of the box by looking at the gauges that are placed on its sides.

If your testing gap is small, the eye hovers up close to the box. It doesn’t see anything else, but it notices every detail of the box.

If you have some testing gap in your test code, the eye is placed at a considerable distance from the mystery box. There are other important things between them. The gauges aren’t directly readable. The eye uses indirect clues and reflections to gather its information. Every time something in the gap’s setup changes, the testing eye needs to adjust its gaze.

Which brings us to the conclusion of this metaphor: If you want things to be looked at in detail, write tests without a testing gap. Otherwise, your tests will have increased execution times, exhibit a strange imprecision in their messages (“something in these dozens of classes has changed and it might not even be relevant”) and require frequent adjustments that are not related to their testing story.

Or, said with the words of my imagination: place your testing eye directly at the entrance of your test subject’s hideout.

You’ve probably thought about this concept already, in your own terms and metaphors. Can you try to describe it in a comment? Just for the name, we discussed “testing distance”, “testing height”, “testing gap” and others. Perhaps we like your description even better.

A Purpose of Domain-Driven-English-German-Language-Mumbo-Jumbo

Disclaimer: Due to its nature, this blog article needs to make some use of the German language. This is part of its essence and could not be avoided, sorry to all international readers.

Since its conception in 2003, the expression “Domain-Driven Design” might have been tossed around a bit, together with all the other XYZ-Driven Designs that are out there. As usual with such terms, I only try to gather the core points of these ideas; I do not like sticking to any such concept with religious fervor or otherwise dogmatic understanding. Moreover, these concepts are usually not of the type “you either use them or you don’t”, but you have some control over the degree in which you employ them, depending on your requirements as a whole.

This is why in a new project, I might implement a handful of ideas and see where it goes, always prepared to call it a day and toss any rule out when it endangers my progress. On the other hand, if I only follow principles that instantly convince me, I risk missing out on some practice that just is unusual, but not bad in itself.

Domain-Driven Design, in my understanding, aims at aligning the architectural details of your code base with the domain model, i.e. the technical peculiarities of your (customer’s) specific use case. Which doesn’t sound hard or bad per se, but as usual, takes some practice to shed some light on.

Enter the idea of using German words in your code. For variables, methods, classes, and such stuff – even with Umlauts and the Eszett (“ß”). If one is not used to that, such code might instantly induce some sort of digestive sickness, or at least that’s what it did to me, because of its sheer look:

// just some example to look at

var sortedZuordnungen = szenario.SortedZeitplanForArbeitsplatz(arbeitsplatz.Id)
    .ToList();
var gesperrteHalbtage = sperrungen.Where(s => s.AufArbeitsplatz(arbeitsplatz.Id)).Select(s => s.Halbtag);

var nächsteZuordnung = sortedZuordnungen.FirstOrDefault();
Halbtag tryStart = Constants.HeuteVormittag;

while (nächsteZuordnung != default)
{
    tryStart.CreateListFromHere(anzahlHalbtage, gesperrteHalbtage);
    nächsteZuordnung = FindNächsteZuordnung();
}

(replace “German” with any other language your customer might use; if you’re living in a completely English-speaking environment, this article should be of limited insight for you. Sorry again.)

Now code like this – at first – what is this!? That’s not proper! It looks like the sound of some older German politician who never really bothered learning the English language, with some crazy dialect and whatnot!

The advantage behind this concept becomes especially apparent when dealing with a lot of very generic terms. E.g. the word “component” might just mean a button on your UI, or it might mean something very specific for your customer – or even worse, you might mean something very specific for your customer, but in reality, he would never refer to that entity with that word, so… you’re left with a chance of awkward bewilderment in every single meeting with the guy.

So, despite its weird look – this is one of the concepts that I haven’t tossed out the window yet. The key point is the overall reduction of friction in your thoughts. When communicating across languages, one always has to do some minor translation in one’s head. These translations can be faulty or misleading either way – the nature of the language itself is secondary.

What works for me is the following (a small sketch follows the list):

  • Pure code fabrications that are close to the programming language get English names as usual
  • Things that a customer might talk about in German should get a German name
  • German and English can be mixed in a single word without any shame
  • Thus, words can be long, but you have an IDE that can help with that
  • German compound words get the correct German capitalization, i.e. the equivalent of “componentNumber” would be “komponentennummer”, not “komponentenNummer”
  • The linking of two German parts happens following the correct grammatical standards, i.e. a “workPlace” becomes an “arbeitsplatz” with the “s” in between (Fugen-s).
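
Here is a small hypothetical sketch of these rules in action (all names are made up for illustration):

using System.Collections.Generic;

// domain word with Fugen-s: "arbeit" + s + "platz"
class Arbeitsplatz
{
    // pure German compound: "Komponentennummer", not "KomponentenNummer"
    public int Komponentennummer { get; set; }

    // the domain term stays German, the collection type stays English
    public List<string> GesperrteHalbtage { get; } = new List<string>();
}

// German domain word + English technical term, mixed without any shame
class ArbeitsplatzController { }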

For some reason, this has by now resulted in quite an uninterrupted workflow for me. The last two rules were an interesting finding, because I noticed that without them, I really made a noticeable pause in my thinking process whenever I thought about these entities. This pause is now gone.

E.g. by now, the cognitive load of talking about a “KomponentenController” – something that is a Controller from a software engineering point of view and deals with components from a domain point of view – appears easier for me than having to talk about a “ComponentController” with the extra translation between Component and Komponente. Mind you, there are enough words that do not sound that similar in our two languages.

I will not use this concept in every single project I start from now on. E.g. for hobby projects (where I’m my own customer), I would still prefer the 100%-English-language solution. But depending on your project, this is worth a try, and I’m positively amazed at how well it can work.

Improved automated instance construction in C++

In my last blog post, I wrote about how I am automatically deducing constructor parameters in my dependency injection container. The approach had a major drawback: It worked only for 2 or more parameters, since there was an ambiguity with copy- or move-constructors with exactly one parameter.

Right after I wrote that post, I actually found a solution to that problem in the Boost.DI FAQ, which explains how to do that in its C++Now 2016 slides. It restricts the type conversion operator by using SFINAE on an unused template parameter. I did not even know that was possible! It defines the templated conversion operator very similarly to this:

template <class T,
  class = std::enable_if_t<!std::is_same<std::remove_cvref_t<T>, Exclude>{}>>
operator T&() const
{
  return p_->get<std::remove_cvref_t<T>>();
}

Since this is a bit more involved than the bare templated conversion operator from last time, repeating it for each helper would be bad. In the last version, I used 3 helper types: the inferred_locator, mimic and the provider_wrapper. But that can be streamlined into one class:

template <typename Exclude> struct mimic
{
  mimic(std::size_t)
  {
  }

  mimic(service_provider const& p, std::size_t)
  : p_(&p)
  {
  }

  template <class T, class = std::enable_if_t<!std::is_same<std::remove_cvref_t<T>, Exclude>{}>> operator T&() const
  {
    return p_->get<std::remove_cvref_t<T>>();
  }

  service_provider const* p_{ nullptr };
};

Note that it uses some extra, unused size_t parameters, which make the parameter expansion easier in the next step. Now we can use that for the SFINAE in the recursive construction:

// Actual dependency injection..
template <class T, std::size_t Head, std::size_t... Rest> constexpr auto
make_injected_(service_provider const& p, std::index_sequence<Head, Rest...>,
    decltype(T{ mimic<T>{ Head }, mimic<T>{ Rest }... }) * = nullptr)
{
  return std::make_unique<T>(mimic<T>(p, Head), mimic<T>(p, Rest)...);
}

// Trivial no-dependency case
template <class T> constexpr auto
make_injected_(service_provider const& p, std::index_sequence<>)
{
  return std::make_unique<T>();
}

// Fallback to try with fewer parameters
template <class T, std::size_t... Rest> constexpr auto make_injected_(service_provider const& p, std::index_sequence<Rest...>)
{
  return make_injected_<T>(p, std::make_index_sequence<sizeof...(Rest) - 1>{});
}

template <class T, std::size_t Max = 16> auto
make_injected(service_provider const& p)
{
  return make_injected_<T>(p, std::make_index_sequence<Max>{});
}
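
Usage is then a single call; my_service here stands for any hypothetical class whose constructor takes other registered services:

// Hypothetical usage: constructor arguments are deduced and resolved from the provider
auto service = make_injected<my_service>(provider);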

Just after I found this solution, my former colleague Dirk Reinbach sent me a very neat C++20 variant to restrict the conversion operator via a concept:

template <typename T, typename U>
concept not_is_same = !std::is_same_v<std::remove_cvref_t<T>, std::remove_cvref_t<U>>;

template <typename Exclude> struct mimic
{
  /* other members... */
  template <not_is_same<Exclude> T> operator T&() const
  {
    return p_->get<std::remove_cvref_t<T>>();
  }
};

This works just as well, and is more readable, too. I have not measured, but I guess it’s probably also faster to compile, since all things SFINAE are notoriously slow.

PostgreSQL’s “DISTINCT ON” clause

Anyone who uses SQL databases knows the DISTINCT modifier for SELECT queries to get result sets without duplicates. However, PostgreSQL has another variant of it that not everyone knows, but which is very useful: the SELECT DISTINCT ON clause. It can be used to query only the first row of each set of rows according to a grouping.

To understand its usefulness, let’s look at an example and solve it in the classical way first.

The complicated way

Given the following table of items, we want to query for each category the item with the highest value.

 name │ category │ value
-------------------------
 A    │ X        │ 52
 B    │ X        │ 35
 C    │ X        │ 52
 D    │ Y        │ 27
 E    │ Y        │ 31
 F    │ Y        │ 20

Usually we’d start out with a query like this:

SELECT
  category,
  MAX(value) AS highest_value
FROM items
GROUP BY category;

 category │ highest_value
--------------------------
 X       │ 52
 Y       │ 31

And then use this query as a sub-select:

SELECT * FROM items
WHERE (category, value) IN (
  SELECT
    category,
    MAX(value) AS highest_value
  FROM items
  GROUP BY category
);

 name │ category │ value
-------------------------
 A    │ X        │ 52
 C    │ X        │ 52
 E    │ Y        │ 31

Unfortunately, there are multiple items in category X with the same highest value 52. But we really only want one row for each category. In this case we might use the ROW_NUMBER() function:

SELECT
  name, category, value
FROM (
  SELECT
    items.*,
    ROW_NUMBER() OVER (
      PARTITION BY category
      ORDER BY value DESC, name
    ) AS rownum
  FROM items
) AS ranked WHERE rownum = 1;

 name │ category │ value
-------------------------
 A    │ X        │ 52
 E    │ Y        │ 31

This is finally our desired result.

The easy way

But I promised it can be easier with the DISTINCT ON clause. How does it work?

SELECT DISTINCT ON (category) *
FROM items
ORDER BY
  category, value DESC, name;

After DISTINCT ON we specify one or more columns to group by, in parentheses. The ORDER BY clause determines which row will be the first in each group; note that the DISTINCT ON expressions must match the leftmost ORDER BY expressions. We get the same result:

 name │ category │ value
-------------------------
 A    │ X        │ 52
 E    │ Y        │ 31
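
Since the ORDER BY clause decides which row survives per group, flipping the sort direction selects a different row; for example, the item with the lowest value per category:

SELECT DISTINCT ON (category) *
FROM items
ORDER BY
  category, value ASC, name;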

Running a containerized ActiveDirectory for developers

If you develop software for larger organizations, one big aspect is integrating it with the existing infrastructure. While you may prefer simple deployments of services in Docker containers, a customer may want you to deploy to their WildFly infrastructure, for example.

One common case of infrastructure is an Active Directory (AD) or plain LDAP service used for organization-wide authentication and authorization. As a small company, we do not have such an infrastructure ourselves, and it would not be a great idea to use it for development anyway.

So how do you develop and test your authentication module without an AD being available for you?

Fortunately, nowadays this is relatively easy using tools like Docker and Samba. Let us see how to put such a development infrastructure up and where the pitfalls are.

Running Samba in a Container

Samba can not only serve Windows shares or act as a domain controller for Microsoft Windows based networks, it also includes a full AD implementation with proper LDAP support. Besides installing Samba in a container, it takes a small amount of work to set it up, so we have two small shell scripts for setup and launch in the container. I think most of the Dockerfile and scripts should be self-explanatory and straightforward:

Dockerfile:

FROM ubuntu:20.04

RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install samba krb5-config winbind smbclient 
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install iproute2
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install openssl
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install vim

RUN rm /etc/krb5.conf
RUN mkdir -p /opt/ad-scripts

WORKDIR /opt/ad-scripts

CMD chmod +x *.sh && ./samba-ad-setup.sh && ./samba-ad-run.sh

samba-ad-setup.sh:

#!/bin/bash

set -e

info () {
    echo "[INFO] $@"
}

info "Running setup"

# Check if samba is setup
[ -f /var/lib/samba/.setup ] && info "Already setup..." && exit 0

info "Provisioning domain controller..."

info "Given admin password: ${SMB_ADMIN_PASSWORD}"

rm /etc/samba/smb.conf

samba-tool domain provision\
 --server-role=dc\
 --use-rfc2307\
 --dns-backend=SAMBA_INTERNAL\
 --realm=`hostname`\
 --domain=DEV-AD\
 --adminpass=${SMB_ADMIN_PASSWORD}

mv /etc/samba/smb.conf /var/lib/samba/private/smb.conf

touch /var/lib/samba/.setup

Using samba-ad-run.sh, we start Samba directly instead of running it as a service, which you would do outside a container:

#!/bin/bash

set -e

[ -f /var/lib/samba/.setup ] || {
    >&2 echo "[ERROR] Samba is not setup yet, which should happen automatically. Look for errors!"
    exit 127
}

samba -i -s /var/lib/samba/private/smb.conf

With the scripts and the Dockerfile in place, you can simply build the container image using a command like

docker build -t dev-ad -f Dockerfile .

We then run it as follows, using local mounts to preserve the data in the AD we will be using for testing and toying around:

docker run --name dev-ad --hostname ldap.schneide.dev --privileged -p 636:636 -e SMB_ADMIN_PASSWORD=admin123! -v $PWD/:/opt/ad-scripts -v $PWD/samba-data:/var/lib/samba dev-ad

To have everything running seamlessly, you should add the specified hostname – ldap.schneide.dev in our example – to /etc/hosts so that all tools work as expected, as if it were a real AD host somewhere.

Testing our setup

Now of course you may want to check if your development AD works as expected and maybe add some groups and users which you need for your implementation to work.
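
For a first quick check from the command line, an ldapsearch query against the container should work; note that the bind DN, base DN and password below are assumptions derived from the provisioning parameters above:

LDAPTLS_REQCERT=never ldapsearch -H ldaps://ldap.schneide.dev:636 \
  -D "CN=Administrator,CN=Users,DC=ldap,DC=schneide,DC=dev" \
  -w 'admin123!' \
  -b "DC=ldap,DC=schneide,DC=dev" "(objectClass=user)"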

While there are a bunch of tools for working with an AD/LDAP, I found the old and sturdy LdapAdmin the easiest and most straightforward to use. It comes as one self-contained executable file (downloadable from SourceForge), ready to use without installation or other hassles.

After getting the container and LdapAdmin up and running and logging in, you should see something like this:

[Screenshot: LdapAdmin window showing our Samba AD]

Then you can browse and edit your Active Directory to fit your needs, allowing you to develop your authentication and authorization module based on LDAP.

I hope you found the above useful for your development setup.

What dependent types can do for you

In a way, this post is also about Test Driven Development and *Type* Driven Development. While the two share the same acronym, I always thought of them as different concepts. However, as I recently experienced, when the two concepts are used in a dependently typed language, there is something like a fluid transition between them.

While I will talk about programming in the dependently typed language Agda, not much is needed to follow what is going on – I will just walk through an exercise and explain everything along the way.

The exercise I want to use is here. It talks about a submarine, its position and certain commands that change the position. Examples of commands are forward 1, down 2 and up 3. These ‘values’ can be used just like that with the following definition of the type of commands:

  data Command : Set where
    forward : Nat -> Command
    up      : Nat -> Command
    down    : Nat -> Command

Agda can be used in a very mathy way – this should really be read as saying that the type of commands is a Set and that there are three constructors (highlighted green) which take a natural number as argument and produce a command. So, using the fact that function application is just juxtaposition, we can make the following definitions now:

  justSomeCommand = forward 5
  anotherOne = up 1

Now the exercise text explains how these commands can be applied to the position of the submarine. Working as a software developer, I have built the habit of turning specifications like that into tests. Since I don’t know any better, I just wrote ‘tests’ in Agda, using equations to translate the exercise text – I’ll explain the syntax below:

  apply (forward 5) (pos 0 0) ≡ pos 5 0
  apply (down 5) (pos 5 0) ≡ pos 5 5
  apply (forward 8) (pos 5 5) ≡ pos 13 5

Note that the triple equal sign is different from what we used above. Roughly, this is because it is the proposition that some things are equal, while the normal equal sign above was used to make definitions. The code doesn’t type check as is. We haven’t defined ‘apply’ and it is not valid Agda to just write down equations like that. Let’s fix the latter problem first, by turning them into declarations and definitions. This will actually define elements of the datatypes of equality proofs – but I’m pretty sure you can accept these changes just as boilerplate we have to add to our equations:

  example1 : apply (forward 5) (pos 0 0) ≡ pos 5 0
  example1 = refl

  example2 : apply (down 5) (pos 5 0) ≡ pos 5 5
  example2 = refl

  example3 : apply (forward 8) (pos 5 5) ≡ pos 13 5
  example3 = refl

Now, to make the examples type-check, we have to define ‘pos’ and ‘apply’. Positions can be defined analogously to commands:

  data Position : Set where
    pos : Nat -> Nat -> Position

(Here, the type of ‘pos’ just tells us that it is a function taking two natural numbers as arguments.) Now we are ready to start with ‘apply’:

  apply : Command → Position → Position
  apply = ?

So apply is a function that takes a ‘Command’ and a ‘Position’ and returns another ‘Position’. For the definition of ‘apply’ I just entered a question mark ‘?’. It is one of my favorite features of Agda that terms can be left out like this before type checking. Agda still checks everything we have given so far and will give us a lot of information about what ‘?’ could be. This is called ‘interacting with a hole’ – because, well, it is a hole in your code, and the type checker is there to tell you which things might fit into it. After type checking, the hole and what Agda tells us about it will look like this:

  apply : Command → Position → Position
  apply = ?

Goal: Command -> Position -> Position

This was type-checked with a couple of imports – see my final version of the code if you want to reproduce it. The first thing Agda tells us is the type of the goal, and then there is some mumbling about constraints with some fragments that look like they have something to do with the examples from above – the latter is actually not information about the hole, but general information about the type checking. So let’s look at them to see if the type checker has anything to say:

[Screenshot: the refl terms in the definitions of the examples are highlighted yellow]

Something is yellow! This is Agda’s way of telling us that it does not have enough information to decide if everything is okay. Which makes a lot of sense, since we haven’t given a definition of ‘apply’ and these equations are about values computed with ‘apply’. So let us just continue to define ‘apply’ and see if the yellow vanishes. This is analogous to the stage in TDD where your tests don’t pass because your code does not yet compile.

We will use pattern matching on the given ‘Command’ and ‘Position’ to define ‘apply’ – the cases below were generated by Agda (I only changed variable names), and we now have a hole for each case:

  apply : Command → Position → Position
  apply (forward x) (pos h d) = {!!}
  apply (up x) (pos h d) = {!!}
  apply (down x) (pos h d) = {!!}

There are various ways in which Agda can use the information given by types to help us with filling these holes. First of all, we can just ask Agda to make the hole ‘smaller’ if there is a unique canonical way to do so. This will work here, since ‘Position’ has only one constructor. So we get new holes for arguments of the constructor ‘pos’ and can try to fill those.

Let us focus on the first case and see what happens if we enter something that is not in line with our tests: if we ask Agda whether ‘h + d’ fits into the hole, it will say no and point us to the example equation that no longer holds.

While this is essentially the same kind of feedback you would get from a unit test, there are at least two important advantages to note:

  • This is feedback from the type checker, and it is combined with other things the type checker can tell you. It means you get a lot of feedback at once when you ask Agda if something you wrote fits into a hole.
  • ‘refl’ is only the simplest case of the proofs you can write in Agda. More complicated ones need some training, but you can go way beyond unit tests and ‘check’ infinitely many cases, or even better: all cases.
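
For reference, a definition of ‘apply’ that makes all three examples type-check could look like this (a sketch, assuming the builtin Nat addition and its truncated subtraction):

  apply : Command → Position → Position
  apply (forward x) (pos h d) = pos (h + x) d
  apply (up x)      (pos h d) = pos h (d - x)
  apply (down x)    (pos h d) = pos h (d + x)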

If you want more, just try Agda yourself! One easy way to do that is to use Ingo Blechschmidt’s Agdapad, which lets you try Agda in your browser.

X Forwarding from Linux to Windows

This is a very short statement of joy, in that something I thought would be very complicated actually turned out to be quite easy.

We have a multitude of clients with a multitude of infrastructures. Then there is home office and still a Corona pandemic (according to individual voices..?), so all in all, one does not always have a Linux system at hand when working on a Linux project.

SSH access is usually easy. Graphical UIs are not a problem either, at least between Linux machines: the X Window System (X11) is commonplace for the graphical display of Linux, and the standard ssh client supports X11 forwarding via the command line option ssh -X out of the box.

Now Windows is a different story, but it turns out that the right tools… just exist. This is the short version:

  1. There is the Xming Public Domain version (last release in 2016), which you can get e.g. from here and which is straightforward to install. This plays the role of an X server, i.e. a software-side display that can receive data via ssh.
  2. After the installation, call XLaunch for the setup.
  3. The “Multiple Windows” option is fine, as is using “Display number 0”. I then opted to “start no client” and ignored the other options.
  4. I already use PuTTY for everything else (including SSH tunnels to various remote networks), and while this has a somewhat objectionable user interface, one can manage. If you have an existing session, make sure to select that first, then click Load, then adjust the settings as follows.
  5. Go to Connection > SSH > X11
  6. Enable X11 Forwarding
  7. In “X display location”, enter “localhost:0” if you chose “Display number 0” in step 3. Or choose accordingly.
  8. You might now save (or don’t) and open the connection. A quick way to verify the forwarding follows below.
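
To verify that the forwarding works, a quick check on the remote shell helps (the display number may differ on your machine):

echo $DISPLAY   # should print something like localhost:10.0
xclock          # e.g. from the x11-apps package; the clock should appear on your Windows desktop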

I was more than surprised to just be able to start any GUI application on our client’s remote machine and see the result.

Sure, you might get a few seconds of delay, but compared to the hassle I expected – this was a walk in the park.

A big shoutout to the creators of Xming and PuTTY, well deserved.

Automated instance construction in C++

I’m currently mostly switching back and forth between C# and C++ projects. One of the things that I’m missing most when switching to C++ is a nice dependency-injection (DI) library. After checking out what was already available, I finally decided I wanted to try to build my own slim type-indexed variant. I quickly started by registering factories and instances in a map on std::type_index, making it possible to either have the DI container retain ownership (with std::unique_ptr) or just make a type available via a bare pointer. So I was able to do things like:

// Register an instance
di.insert_unique(std::make_unique<foo_service>());
// Register a factory
di.insert_unique([] { return std::make_unique<bar_service>(); });
// Register an existing bare pointer
di.insert_bare(my_bare_thingy);

// ... and retrieve them
auto& foo = di.get<foo_service>();

One of the most powerful aspects of a DI library is the ability to transitively setup dependencies. I like constructor injection the most, so I implemented a very naive way like this:

di.insert_unique([](auto& p) { return std::make_unique<complex_service>(
  p.get<base_service1>(), p.get<base_service2>(), p.get<base_service3>());
});

This is pretty verbose and we basically have to repeat all the constructor parameter types. But it’s easy to implement. We can do a little bit better by using a templated type-conversion operator and using it to call the get:

class service_provider
{
  struct inferred_locator
  {
    service_provider const* provider;
    template <class T> operator T&() const
    {
      return provider->get<std::remove_const_t<T>>();
    }
  };
  
public:
  inferred_locator get() const
  {
    return { .provider = this };
  }
  
  /** typed get implementations here... */
};

Now we can reduce the previous registration to:

di.insert_unique([](auto& p) { 
  return std::make_unique<complex_service>(p.get(), p.get(), p.get());
});

That is basically only the number of constructor parameters in a verbose way. We could write a small template that takes the number, creates an std::index_sequence from it and then unpacks each index into an invocation of service_provider::get. But then we would still have to update registrations when adding (or removing) a dependency to a service’s constructor. With a little more work, we can actually get this instead:

di.insert_unique<complex_service>();

This is partly inspired by Antony Polukhin’s C++ reflection talks, and combines std::index_sequence based unpacking, SFINAE and the templated type-conversion operator:

template <class T, std::size_t Head, std::size_t... Rest>
constexpr auto make_unique_impl(provider_wrapper const& p,
    std::index_sequence<Head, Rest...>,
    decltype(T{ mimic{ Head }, mimic{ Rest }... }) * = nullptr) -> std::unique_ptr<T>
{
    // This next requirement is so we do not accidentally recurse into the copy/move-ctors
    static_assert(sizeof...(Rest) + 1 > 1, "Can only deduce constructors with two or more parameters.");
    return std::make_unique<T>(p(Head), p(Rest)...);
}

template <class T, std::size_t... Rest>
constexpr auto make_unique_impl(provider_wrapper const& p, std::index_sequence<Rest...>) -> std::unique_ptr<T>
{
    // This next requirement is so we do not accidentally recurse into the copy/move-ctors
    static_assert(sizeof...(Rest) > 1, "Can only deduce constructors with two or more parameters.");
    return make_unique_impl<T>(p, std::make_index_sequence<sizeof...(Rest) - 1>{});
}

template <class T, std::size_t Max = 8> auto make_unique(service_provider const& p)
{
    return make_unique_impl<T>(provider_wrapper{ &p }, std::make_index_sequence<Max>{});
}

This uses two new types: mimic, which is only used for SFINAE, takes std::size_t on construction (for the unpacking from the std::index_sequence) and converts to anything (templated type conversion again), and the provider_wrapper, which is a simple adaptor around service_provider that takes an unused std::size_t argument (again, for unpacking).

The first overload of make_unique_impl is slightly more specialized (because it has Head and Rest), so the compiler tries it first. If it works, it returns a new instance of the service we want. Otherwise, it will fail without an error due to SFINAE in the unused and defaulted third parameter. The compiler will then try the second overload, which recurses to a variant with fewer parameters. The outermost make_unique starts this recursion with 8 parameters, because that should be enough for any sane service.

I stop this recursion before reaching one constructor parameter, even though that would be a useful configuration. This is because I have not yet found a way to avoid accidentally calling the copy or move constructors. If anyone knows how to do that, I’d be very happy to hear it. My workaround right now is to explicitly register a factory in that case.
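
For reference, a minimal sketch of these two helpers, assuming the service_provider interface from above (details of the actual implementation may differ):

struct mimic
{
    service_provider const* provider{ nullptr };

    // Only used inside the unevaluated decltype context for SFINAE
    mimic(std::size_t) {}
    // Used for the actual construction; the size_t only exists for unpacking
    mimic(service_provider const* p, std::size_t) : provider(p) {}

    // Converts to any service reference by looking it up in the provider
    template <class T> operator T&() const
    {
        return provider->get<std::remove_const_t<T>>();
    }
};

struct provider_wrapper
{
    service_provider const* provider;
    mimic operator()(std::size_t index) const { return mimic{ provider, index }; }
};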

Range Types in PostgreSQL

How do you store ranges in an SQL database? By ranges I mean things like price ranges, temperature ranges, date ranges for scheduling, etc. You’d probably represent them with two columns in a table, like min_price and max_price, min_temperature and max_temperature, start_date and end_date. If you want to represent an unbounded range, you’d probably make one or both columns nullable and then take NULL as +/- infinity.

If you want to test if a value is in a range you can use the BETWEEN operator:

SELECT * FROM products WHERE
  target_price BETWEEN min_price AND max_price;

This doesn’t work as nicely anymore if you work with unbounded ranges as described above. You’d have to add additional checks for NULL. What if you want to test if one of the ranges in the table overlaps with a given range?

SELECT * FROM products WHERE
  max_given >= min_price AND
  min_given <= max_price;

Did I make a mistake here? I’m not sure. What if they should overlap but not cover each other? And again, this becomes even more complicated with unbounded ranges.

Enter range types

PostgreSQL has a better solution for these problems — range types. It comes with these additional built-in data types:

  • int4range: Range of integer
  • int8range: Range of bigint
  • numrange: Range of numeric
  • tsrange: Range of timestamp without time zone
  • tstzrange: Range of timestamp with time zone
  • daterange: Range of date

You can use them as a column type in a table:

CREATE TABLE products (…, price_range numrange);

Construction

You can construct range values for these types like this:

'[20,35]'::int4range
'(5,12]'::int4range
'(6.2,12.5)'::numrange
'[2022-05-01, 2022-05-31]'::daterange
'[2022-05-01 09:30, 2022-05-01 12:00)'::tsrange

As you can see, they use mathematical interval notation. A square bracket means inclusive bound, and a round parenthesis means exclusive bound. They can also be unbounded (infinite) or empty:

'[5,)'::int4range
'(,20]'::int4range
'empty'::int4range

You can get the bounds of a range individually with the lower() and upper() functions:

SELECT * FROM products ORDER BY lower(price_range);

Operators

The range types become really powerful through the range operators. There are a lot, so I will only show some basic examples:

  • The && operator tests if two ranges overlap: range_a && range_b
  • The @> and <@ operators test if the first range contains the second or vice versa: range_a <@ range_b. If used with an element on one side they test if the element is in a range: element <@ range or range @> element.
  • The -|- operator tests if two ranges are adjacent: range_a -|- range_b

In addition to these boolean tests, you can also calculate new ranges based on existing ranges:

The + operator computes the union of two overlapping or adjacent ranges: range_a + range_b. The * operator computes the intersection of ranges, and the - operator the difference.
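
With these operators, the error-prone overlap test from the beginning shrinks to a single expression (assuming the products table from above; numrange(a, b) constructs a range with inclusive lower and exclusive upper bound):

-- Overlap test with a range operator instead of manual comparisons
SELECT * FROM products
WHERE price_range && numrange(10, 20);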

Multiranges

There is one more thing I want to mention: For each one of the range types there is also a multirange type: int4multirange, int8multirange, nummultirange, tsmultirange, tstzmultirange, datemultirange. As their names suggest, they store multiple ranges in one value:

'{}'::int4multirange
'{[2,9)}'::int4multirange
'{[2,9), [12,20)}'::int4multirange

The mentioned range operators work with them as well.

Packaging Java-Project as DEB-Packages

Providing native installation mechanisms and media for your software can be a great benefit for your customers. One way to do so is packaging for the target Linux distributions your customers are running.

Packaging for Debian/Ubuntu is relatively hard, because there are many ways and rules for how to do it. Some part of our software is written in Java and needs to be packaged as .deb packages for Ubuntu.

The official way

There is an official guide on how to package Java projects for Debian. While this may be suitable for libraries and programs that you want to publish to the official repositories, it is not a perfect fit for a custom project that you provide specifically to your customers: it is a lot of work, does not integrate well with your delivery pipeline and requires you to provide packages for all of your dependencies as well.

The convenient way

Fortunately, there is a great plugin for Ant and Maven called jdeb. Essentially, you include and configure the plugin in your pom.xml like all the other build related stuff, execute the jdeb goal in your build pipeline and you are done. This results in a nice .deb package that you can push to your customers’ repositories for their convenience.

A working configuration for Maven may look like this:

<build>
    <plugins>
        <plugin>
            <artifactId>jdeb</artifactId>
            <groupId>org.vafer</groupId>
            <version>1.8</version>
            <executions>
                <execution>
                    <phase>package</phase>
                    <goals>
                        <goal>jdeb</goal>
                    </goals>
                    <configuration>
                        <dataSet>
                            <data>
                                <src>${project.build.directory}/${project.build.finalName}-jar-with-dependencies.jar</src>
                                <type>file</type>
                                <mapper>
                                    <type>perm</type>
                                    <prefix>/usr/share/java</prefix>
                                </mapper>
                            </data>
                            <data>
                                <type>link</type>
                                <linkName>/usr/share/java/MyProjectExecutable</linkName>
                                <linkTarget>/usr/share/java/${project.build.finalName}-jar-with-dependencies.jar</linkTarget>
                                <symlink>true</symlink>
                            </data>
                            <data>
                                <src>${project.basedir}/src/deb/MyProjectStartScript</src>
                                <type>file</type>
                                <mapper>
                                    <type>perm</type>
                                    <prefix>/usr/bin</prefix>
                                    <filemode>755</filemode>
                                </mapper>
                            </data>
                        </dataSet>
                    </configuration>
                </execution>
            </executions>
        </plugin>
    </plugins>
</build>
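
Since the jdeb goal is bound to the package phase in the configuration above, a regular Maven build produces the package (the artifact name below is just an example):

mvn clean package
# the .deb package ends up in target/, e.g. target/myproject_1.0.0_all.deb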

If you are using Gradle as your build tool, the ospackage plugin may be worth a look. I have not tried it personally, but it looks promising.

Wrapping it up

Packaging your software for your customers drastically improves the experience for users and administrators. Doing it the official Debian way is not always the best or most efficient option. There are many plugins and extensions for common build systems to conveniently build native packages, which may be easier for many use cases.