Grails Domain update optimisation

As many readers may know we are developing and maintaining some Grails applications for more than 10 years now. One of the main selling points of Grails is its domain model and object-relational-mapper (ORM) called GORM.

In general ORMs are useful for easy and convenient development at the cost of a bit of performance and flexibility. One of the best features of GORM is the availability of several flexible APIs for use-cases where dynamic finders are not enough. Let us look at a real-world example.

The performance problem

In one part of our application we have personal messages that are marked as read after viewing. For some users there can be quite a lot messages so we implemented a “mark all as read”-feature. The naive implementation looks like this:

def markAllAsRead() {
    def user = securityService.loggedInUser
    def messages = Messages.findAllByUserAndTimelineEntry.findAllByAuthorAndRead(user, false)
    messages.each { message ->
        message.read = true
        message.save()
    }
    Messages.withSession { session -> session.flush()}
 }

While this is both correct and simple it only works well for a limited amount of messages per user. Performance will degrade because all the domain objects are loaded into domain objects, then modified and save one-by-one to the session. Finally the session is persisted to the database. In our use case this could take several seconds which is much too long for a good user experience.

DetachedCriteria to the rescue

GORM offers a much better solution for such use-cases that does not sacrifice expressiveness. Instead it offers a succinct API called “Where Queries” that creates DetachedCriteria and offers batch-updates.

def markAllAsRead() {
    def user = securityService.loggedInUser
    def messages = Messages.where {
        read == false
        addressee == user
    }
    messages.updateAll(read: true)
}

This implementation takes only a few milliseconds to execute with the same dataset as above which is de facto native SQL performance.

Conclusion

Before cursing GORM for bad performance one should have a deeper look at the alternative querying APIs like Where Queries, Criteria, DetachedCriteria, SQL Projections and Restrictions to enhance your ORM toolbox. Compared to dynamic finders and GORM-methods on domain objects they offer better composability and performance without resorting to HQL or plain SQL.

The Optional Wildcast

This blog post presents a particular programming technique that I happen to use more often in recent months. It doesn’t state that this technique is superior or more feasible than others. It’s just a story about a different solution to an old programming problem.

Let’s program a class hierarchy for animals, in particular for mammals and birds. You probably know where this leads up to, but let’s start with a common solution.

Both mammals and birds behave like animals, so they are subclasses of it. Birds have the additional behaviour of laying eggs for reproduction. We indicate this feature by implementing the Egglaying interface.

Mammals feed their offsprings by giving them milk. There are two mammals in our system, a cow and the platypus. The cow behaves like the typical mammal and gives a lot of milk. The platypus also feeds their young with milk, but only after they hatched from their egg. Yes, the platypus is a rare exception in that it is both a mammal and egglaying. We indicate this odd fact by implementing the Egglaying interface, too.

If our code wants to access the additional methods of the Egglaying interface, it has to check if the given object implements it and then upcasts it. I call this type of cast “wildcast” because they seem to appear out of nowhere when reading the code and seemingly don’t lead up or down the typical type hierarchy. Why would a mammal lay eggs?

One of my approaches that I happen to use more often recently is to indicate the existence of real wildcast with a Optional return type. In theory, you can wildcast from anywhere to anyplace you want. But only some of these jumps have a purpose in the domain. And an explicit casting method is a good way to highlight this purpose:

public abstract class Mammal {
	public Optional<Egglaying> asEgglaying() {
		return Optional.empty();
	}
}

The “asEgglaying()” method might return an Egglaying object, or it might not. As you can see, on default, it returns only an empty Optional. This means that no cow, horse, cat or dog has to think about laying eggs, they just aren’t into it by default.

public class Platypus extends Mammal implements Egglaying {
	@Override
	public Optional<Egglaying> asEgglaying() {
		return Optional.of(this);
	}
}

The platypus is another story. It is the exception to the rule and knows it. The code “Optional.of(this)” is typical for this coding technique.

A client that iterates over a collection of mammals can now incorporate the special case with more grace:

for (Mammal each : List.of(mammals())) {
	each.lactate();
	each.asEgglaying().ifPresent(Egglaying::breed);
}

Compare this code with a more classic approach using a wildcast:

for (Mammal each : List.of(mammals())) {
	each.lactate();
	if (each instanceof Egglaying) {
		((Egglaying) each).breed();
	}
}

My biggest grief with the classic approach is that the instanceof is necessary for the functionality, but not guided by the domain model. It comes as a surprise and has no connection to the Mammal type. In the Optional wildcast version, you can look up the callers of “asEgglaying()” and see all the special code that is written for the small number of mammals that lay eggs. In the classic approach, you need to search for conditional upcasts or separate between code for birds and special mammal code when looking up the callers.

In my real-world projects, this “optional wildcast” style facilitates domain discovery by code completion and seems to lead me to more segregated type systems. These impressions are personal and probably biased, so I would like to hear from your experiences or at least opinions in the comments.

Cancellation Token in C#

Perhaps you know this problem from your first steps with asynchronous functions. You want to call them, but a CancellationToken is requested. But where does it come from if you do not have one at hand? At first glance, a quick way is to use an empty CancellationToken. But is that really the solution?

In this blog article, I explain what CancellationTokens are for and what the CancellationTokenSource is for.

Let us start with a thought experiment: You have an Application with different Worker threads. The Application is closed. How can you make the whole Application with all the Worker threads close?
One solution would be a global bool variable “closed”, which is set in this case. All Workers have a loop built in, which regularly checks this bool and terminates if it is set. Basically, the Application and all Workers must cooperate and react to the signal.

class Application
{
    public static bool closed = false;
    Application()
    {
        var worker= new Worker();
        worker.Run();

        Thread.Sleep(2500);
        closed = true;

        // Wait for termination of Tasks
        Thread.Sleep(2500);
    }
}
class Worker
{
    public async void Run()
    {
        bool moreToDo = true;
        while (moreToDo)
        {
            await DoSomething();
            if (Application.closed)
            {
                return;
            }
        }
    }
}

This is what the CancellationToken can be used for. The token is passed to the involved Workers and functions. Throught it, the status of whether or not it was cancelled can be queried. The token is provided via a CancellationTokenSource, which owns and manages the token.

class Application
{
    Application()
    {
        var source = new CancellationTokenSource();
        var token = source.Token;
        var worker = new Worker();
        worker.Run(token);

        Thread.Sleep(2500);
        source.Cancel();

        // Wait for cancellation of Tasks and dispose source
        Thread.Sleep(2500);
        source.Dispose();
    }
}
class Worker
{
    public async void Run(CancellationToken token)
    {
        bool moreToDo = true;
        while (moreToDo)
        {
            await DoSomething(token);
            if (token.IsCancellationRequested)
            {
                return;
            }
        }
    }   
}

The call to source.Cancel() sets the token to canceled. CancellationTokens and sources offer significantly more built-in possibilities, such as callbacks, WaitHandle, CancelAfter and many more.

So using an empty CancellationToken is not a good solution. This means that cancellations cannot be passed on and a task could continue to run in the background even though the Application should have been terminated a long time ago.

I hope I have been able to bring the topic a little closer to you. If you have already encountered such problems and found solutions, please leave a comment.

Using Docker Containers in Development with WebStorm: Next Iteration

We are always in pursue of improving our build and development infrastructures. Who isn’t?

At Softwareschneiderei, we have about five times as many projects than we have developers (without being overworked, by the way) and each of that comes with its own requirements, so it is important to be able to switch between different projects as easily as cloning a git repository, avoiding meticulous configuration of your development machines that might break on any change.

This is the main advantage of the development container (DevContainer) approach (with Docker being the major contestant at the moment), and last November, I tried to outline my then-current understanding of integrating such an approach with the JetBrains IDEs. E.g. for WebStorm, there is some kind of support for dockerized run configurations, but that does some weird stuff (see below), and JetBrains did not care enough yet to make that configurable, or at least to communicate the sense behind that.

Preparing our Dev Container

In our projects, we usually have at least two Docker build stages:

  • one to prepare the build platform (this will be used for the DevContainer)
  • one to execute the build itself (only this stage copies actual sources)

There might be more (e.g. for running the build in production, or for further dependencies), but the basic distinction above helps us to speed up the development process already. (Further reading: Docker cache management)

For one of our current React projects (in which I chose to try Vite in favor of the outdated Create-React-App, see also here), the Dockerfile might look like

# --------------------------------------------
FROM node:18-bullseye AS build-platform

WORKDIR /opt
COPY package.json .
COPY package-lock.json .

# see comment below
RUN npm install -g vite

RUN npm ci --ignore-scripts
WORKDIR /opt/project

# --------------------------------------------
FROM build-platform AS build-stage

RUN mkdir -p /build/result
COPY . .
CMD npm run build && mv dist /build/result/app

The “build platform” stage can then be used as our Dev Container, from the command line as (assuming, this Dockerfile resides inside your project directory where also src/ etc. are chilling)

docker build -t build-platform-image --target build-platform .
docker run --rm -v ${PWD}:/opt/project <command_for_starting_dev_server>

Some comments:

  • The RUN step to npm install -g vite is required for a Vite project because the our chosen base image node:18-bullseye does not know about the vite binaries. One could improve that by adding another step beforehand, only preparing a vite+node base image and taking advantage of Docker caching from then on.
  • We specifically have to take the WORKDIR /opt/project because our mission statement is to integrate the whole thing with WebStorm. If you are not interested in that, that path is for you to choose.

Now, if we are not working against any idiosyncrasies of an IDE, the preparation step “npm ci” gives us all our node dependencies in the current directory inside a node_modules/ folder. Because this blog post is going somewhere, already now we chose to place that node_modules in the parent folder of the actual WORKDIR. This will work because for lack of an own node_modules, node will find it above (this fact might change with future Node versions, but for now it holds true).

The Challenge with JetBrains

Now, the current JetBrains IDEs allow you to run your project with the node interpreter (containerized within the node-platform image) in the “Run/Debug Configurations” window via

“+” ➔ “npm” ➔ Node interpreter “Add…” ➔ “Add Remote” ➔ “Docker”

then choose the right image (e.g. build-platform-image:latest).

Now enters that strange IDE behaviour that is not really documented or changeable anywhere. If you run this configuration, your current project directory is going to be mounted in two places inside the container:

  • /opt/project
  • /tmp/<temporary UUID>

This mounting behaviour explains why we cannot install our node_modules dependencies inside the container in the /opt/project path – mounting external folders always override anything that might exist in the corresponding mount points, e.g. any /opt/project/node_modules will be overwritten by force.

As we cared about that by using the /opt parent folder for the node_modules installation, and we set the WORKDIR to be /opt/project one could think that now we can just call the development server (written as <command_for_starting_dev_server> above).

But we couldn’t!

For reasons that made us question our reality way longer than it made us happy, it turned out that the IDE somehow always chose the /tmp/<uuid> path as WORKDIR. We found no way of changing that. JetBrains doesn’t tell us anything about it. the “docker run -w / --workdir” parameter did not help. We really had to use that less-than-optimal hack to modify the package.json “scripts” options, by

 "scripts": {
    "dev": "vite serve",
    "dev-docker": "cd /opt/project && vite serve",
    ...
  },

The “dev” line was there already (if you use create-react-app or something else , this calls that something else accordingly). We added another script with an explicit “cd /opt/project“. One can then select that script in the new Run Configuration from above and now that really works.

We do not like this way because doing so, one couples a bad IDE behaviour with hard coded paths inside our source files – but at least we separate it enough from our other code that it doesn’t destroy anything – e.g. in principle, you could still run this thing with npm locally (after running “npm install” on your machine etc.)

Side note: Dealing with the “@esbuild/linux-x64” error

The internet has not widely adopteds Vite as a scaffolding / build tool for React projects yet and one of the problems on our way was a nasty error of the likes

Error: The package "esbuild-linux-64" could not be found, and is needed by esbuild

We found the best solution for that problem was to add the following to the package.json:

"optionalDependencies": {
    "@esbuild/linux-x64": "0.17.6"
}

… using the “optionalDependencies” rather than the other dependency entries because this way, we still allow the local installation on a Windows machine. If the dependency was not optional, npm install would just throw an wrong-OS-error.

(Note that as a rule, we do not like the default usage of SemVer ^ or ~ inside the package.json – we rather pin every dependency, and do our updates specifically when we know we are paying attention. That makes us less vulnerable to sudden npm-hacks or sneaky surprises in general.)

I hope, all this information might be useful to you. It took us a considerable amount of thought and research to come to this conclusion, so if you have any further tips or insights, I’d be glad to hear from you!

Unit-Testing Deep-Equality in C#

In the suite of redux-style applications we are building in C#, we are making extensive use of value-types, which implies that a value compares as equal exactly if all of its contents are equal also known as “deep equality”, as opposed to “reference equality” or “shallow equality”. Both of those imply deep equality, but the other way around is not true. The same object is of course equal to itself, not matter how deep you look. And an object that references the same data as another object also has equal content. But a simple object that contains different lists with equal content will be unequal under shallow comparison, but equal under deep comparison.

Though init-only records already provide a per-member comparison as Equals be default, this fails for collection types such as ImmutableList<> that, against all intuition but in accordance to , only provide reference-equality. For us, this means that we have to override Equals for any value type that contains a collection. And this is were the trouble starts. Once Equals is overridden, it’s extremely easy to forget to also adapt Equals when adding a new property. Since our redux-style machinery relies on a proper “unequal”, this would manifest in the application as a sporadically missing UI update.

So we devised a testing strategy for those types, using a little bit of reflection:

  1. Create a sample instance of the value type with no member retaining its default value
  2. Test, by going over all properties and comparing to the same property in a default instance, if indeed all members in the sample are non-default
  3. For each property, run Equals the sample instance to a modified sample instance with that property set to the value from a default instance.

If step 2 fails, it means there’s a member that’s still at its default value in the sample instance, e.g. the test wasn’t updated after a new property was added. If step 3 fails, the sample was updated, but the new property is not considered in Equals – and it can even tell which property is missing.

The same problems of course arise with GetHashCode, but are usually less severe. Forgetting to add a property just makes collisions more likely. It can be tested much in the same way, but can potentially lead to false positives: collisions can occur even if all properties are correctly considered in the function. In that case, however, the sample can usually be altered to remove the collision – and it is really unlikely. In fact, we never had a false positive.

Format-based sorting looks clever, but is dangerous

A neat trick I learnt early in my career, even before I learnt about version control, was how to format a date as a string so that alphabetically sorted lists would contain them in the “correct” order:

“YYYYMMDD” is the magic string.

If you format your dates as 20230122 and 20230123, the second name will be sorted after the first one. With nearly any other format, your date strings will not be sorted chronologically in the file system.

I’ve found out that this is also nearly the only format that most people cannot intuitively recognize as a date. So while it is familiar with me and conveniently sorted, it is confusing or at least in need of explanation for virtually every user of my systems.

Keep that in mind when listening to the following story:

One project I adopted is a custom enterprise resource planning system that was developed by a single developer that one day left the company and the code behind. The software was in regular use and in dire need of maintenance and new features.

One concept in the system is central to its users: the list of items in an invoice or a bill of delivery. This list contains items in a defined order that is important to the company and its customers.

To my initial surprise, the position of an item in the list was not defined by an integer, but a string. This can be explained by the need of “sub-positions” that form a hierarchy of items, like in this example:

1 – basic item

1.1 – item upgrade #1

1.2 – item upgrade #2

Both positions “1.1” and “1.2” are positioned “underneath” position “1” and should be considered glued to it. If you move position “1” to position “4”, you also move 1.1 to 4.1 and 1.2 to 4.2.

But there was a strange formatting thing going on with the positions: They were stored as strings in the database, but with a strange padding in front. Instead of “1”, “2” and “3”, the entries contained the positions ” 1″, ” 2″ and ” 3″. All positions were prefixed with two space characters!

Well, nearly all positions. As soon as the list grew, the padding turned out to be dependent on the number of digits in the position: ” 9″, but then ” 10″ and “100”.

The reason can be found relatively simple: If you prefix with spaces (or most other characters, maybe “0”), your strings will be ordered in a numerical way. Without the prefixes, they would be sorted like “1”, “10”, “11”, “2”.

That means that the desired ordering of the positions is hardcoded in the database representation! You probably already thought about the case of a position greater than 999. That’s when trouble begins! Luckily, an invoice with a thousand items on the list is unheard of in the company (yet!).

Please note that while the desired ordering is hardcoded in the database, the items are still loaded in a different order (as they were entered into the system) and need to be sorted by the application. The default sorting for strings is the alphabetical order, so the original developer probably was clever/lazy, went with it and formatted the data in a way that would produce the result and not require additional logic during the sorting.

If you look at the code, you see seemingly strange formatting calls to the position all over the place. This is necessary because, for example, every time a user enters a position into the system, it needs to be reformatted (or at least sanitized) in order to adhere to the “auto-sortable” format.

If you wonder how a hierarchical sub-position looks like with this format, its ” 1. 1″, ” 1. 10″ and even ” 1. 17. 2. 4″. The database stores mostly blanks in this field.

While this approach might seem clever at the moment, it is highly dangerous. It conflates several things that should stay separated, like “storage format” and “display format”, “item order” or just “valid value range”. It is a clear violation of the “separation of concerns” principle. And it broke the application when I missed one place where the formatting was required, but not present. Of course, this only manifests in a problem when your test cases (or manual tries) exceed a list of 9 entries – lesson learnt here.

I dread the moment when the company calls to tell me about this “unusually large invoice” that exceeds the 999 limit. This would mean a reformatting of all stored data or another even more clever hack to circumvent the problem.

Did you encounter a format that was purely there for sorting in the wild? What was the story? Tell us in the comments!

How comments get you through a code review

Code comments are a big point of discussion in software development. How and where to use comments. Or should you comment at all? Is the code not enough documentation if it is just written well enough? Here I would like to share my own experience with comments.

In the last months I had some code reviews where colleagues looked over my merge requests and gave me feedback. And it happened again and again that they asked questions why I do this or why I decided to go this way.
Often the decisions had a specific reason, for example because it was a customer requirement, a special case that had to be covered or the technology stack had to be kept small.

That is all metadata that would be tedious and time-consuming for reviewers to gather. And at some point, it is no longer a reviewer, it is a software developer 20 years from now who has to maintain the code and can not ask you questions any more . The same applies if you yourself adjust the code again some time later and can not remember your thoughts months ago. This often happens faster than you think. To highlight how fast details disappear here is a current example: This week I set up a new laptop because the old one had a hardware failure. I did all the steps only half a year ago. But without documentation, I would not have been able to reconstruct everything. And where the documentation was missing or incomplete, I had to invest effort to rediscover the required steps.

Example

Here is an example of such a comment. In the code I want to compare if the mixer volume has changed after the user has made changes in the setup dialog.

var setup = await repository.LoadSetup(token);

var volumeOld = setup.Mixers.Contents.Select(mixer=>mixer.Volume).ToList();

setup = Setup.App.RunAsDialog(setup, configuration);

var volumeNew = setup.Mixers.Contents.Select(mixer=>mixer.Volume).ToList();
if (volumeNew == volumeOld)
{
     break;
}
            
ResizeToMixerVolume(setup, volumeOld);

Why do I save the volume in an additional variable instead of just writing the setup into a new variable in the third line? That would be much easier and more elegant. I change this quickly – and the program is broken.

This little comment would have prevented that and everyone would have understood why this way was chosen at the moment.

// We need to copy the volumes, because the original setup is partially mutated by the Setup App.
var volumeOld = setup.Mixers.Contents.Select(mixer=>mixer.Volume).ToList();

If you annotate such prominent places, where a lot of brain work has gone into, you make the code more comprehensible to everyone, including yourself. This way, a reviewer can understand the code without questions and the code becomes more maintainable in the long run.



When laziness broke my code

I was just integrating a new task-graph system for a C# machine control system when my tests started to go red. Note that the tasks I refer to are not the same as the C# Task implementation, but the broader concept. Task-graphs are well known to be DAGs, because otherwise the tasks cannot be finished. The general algorithm to execute a task-graph like this is called topological sorting, and it goes like this:

  1. Find the number of dependencies (incoming edges) for each task
  2. Find the tasks that have zero dependencies and start them
  3. For any finished tasks, decrement the follow-up tasks dependency count by one and start them if they reach zero.

The graph that was failed looked like the one below. Task A was immediately followed by a task B that was followed by a few more tasks.

I quickly figured out that the reason that the tests were failing was that node B was executed twice. Looking at the call-stack for both executions, I could see that the first time B was executed was when A was completed. This is correct as per step 3 in the algorithm. However, the second time it was started was directly from the initial Run method that does the work from step 2: Starting the initial tasks that are not being started recursively. I was definitely not calling Run twice, so how did that happen?

public void Run()
{
    var ready = tasks
        .Where(x => x.DependencyCount == 0);

    StartGroup(ready);
}

Can you see it? It is important to note that many of the tasks in this graph are asynchronous. Their completion is triggered by an IObserver, a C# Task completing or some other event. When the event is processed, StartGroup is used to start all tasks that have no more dependencies. However, A was no such task, it was synchronous, so the StartGroup({B}) call happened while Run was still on the stack.

Now what happened was that when A (instantly!) completed, it set the DependencyCount of B to 0. Since ready in the code snippet is lazily evaluated from within StartGroup, the ‘contents’ actually change while StartGroup is running.

The fix was adding a .ToList after the .Where, a unit test that checked that this specifically would not happen again, and a mental note that lazy evaluation can be deceiving.

Use real(istic) data from early on

When developing software in general and also specifically user interfaces (UIs) one important aspect is often neglected: The form, shape and especially the amount of data.

One very common practice is to fill unknown texts with fragments of the famous Lorem ipsum placeholder text. This may be a good idea if you are designing a software for displaying a certain kind of articles similar in size and structure to your placeholder text. In all other cases I would regard using lorem ipsum as a smell.

My recommendation is to collect as many samples of real or at least realistic data as feasible. Use them to build and test your application. Why do I think it matters? Let me elaborate a bit in the following sections.

Data affects the layout

You can only choose a fitting layout if you have knowledge about the length of certain texts, size of image etc. The width of columns can be chosen more appropriately, you can descide if you need scrollbars, if you want them permantently visible for a more stable and calm layout, how large panels or text areas have to be for optimum readability and so on.

Data affects the choice of UI controls

The data your application has to handle should reflect not only in the layout but also in the type of controls to be used.

For example, the amount of options for the user to make a choice from drastically affects the selection of an adequate UI control. If you have only 2 or 3 options toggle buttons, checkboxes or radio buttons next to each other or layed out in one column may be a good fit. If the count of options is greater, dropdowns may be better. At some point maybe a full-blown list with filters, sorting and search may be necessary.

To make a good decision, you have to know the expected amount and shape of your data.

Data affects algorithms and technical decisions regarding performance

The data your system has to work with and to present to the user also has technical impact. If the datasets are moderate in size, you may be able to transfer them all to the frontend and do presentation, filtering etc. there. That has the advantage of reducing backend stress and putting computational effort in the hands of the clients.

Often this becomes unfeasible when the system and its data pool grows. Then you have to think about backend search and filtering, datacompression and the like.

Also algorithmns and datastructure may change from simple lists and linear search to search trees, indexes and lookup tables.

The better you know the scope of your system and the data therein the better your technical decisions can be. You will also be able to judge if the YAGNI principle applies or not.

Conclusion

To quickly sum-up the essence of the advice above: Get to know the expected amount and shape of data your application has to deal with to be able to design your system and the UI/UX accordingly.

Fun with docker container environment variables

Docker (as one specific container technology product) is a basic ingredient of our development infrastructure that steadily gained ground from the production servers over the build servers on our development machines. And while it is not simple when used for operations, the complexity increases a lot when used for development purposes.

One way to express complexity is by making the moving parts configurable and using different configurations. A common way to make things configurable with containers are environment variables. Running a container might look like a endurance typing contest if used extensibly:

docker run --rm \
-e POSTGRES_USER=myuser \
-e POSTGRES_PASSWORD=mysecretpassword \
-e POSTGRES_DB=mydatabase \
-e PGDATA=/var/lib/postgresql/data/pgdata \
ubuntu:22.04 env

This is where our fun begins.

Using an env-file for extensive configurations

The parameter –env-file reads environment variables from a local text file with a simple key=value format:

docker run --rm --env-file my-vars.env ubuntu:22.04 env

The file my-vars.env contains all the variables line by line:

FIRST=1
SECOND=2

If we run the command above in a directory containing the file, we get the following output:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=46a23b701dc8
FIRST=1
SECOND=2
HOME=/root

The HOSTNAME might vary, but the FIRST and SECOND environment variables are straight from our file.

The only caveat is that the env-file really has to exist, or we get an error:

docker: open my-vars2.env: Das System kann die angegebene Datei nicht finden.

My beloved shell

The env-file can be empty, contain only comments (use # to begin them) or whitspace, but it has to be present.

Please be aware that the env-files are different from the .env-file(s) in docker-compose. A lot of fun is lost by this simple statement, like variable expansion. As far as I’m aware, there is no .env-file mechanism in docker itself.

But we can have some kind of variable substitution, too:

Using multiple env-files for layered configurations

If you don’t want to change all your configuration entries all the time, you can layer them. One layer for the “constants”, one layer for global presets and one layer for local overrides. You can achieve this with multiple –env-files parameters, they are evaluated in your specified order:

docker run -it --rm --env-file first.env --env-file second.env ubuntu:22.04 env

Let’s assume that the content of first.env is:

TEST=1
FIRST=1

And the content of second.env is:

TEST=2
SECOND=2

The results of our container call are (abbreviated):

TEST=2
FIRST=1
SECOND=2

You can see that the second TEST assignment wins. If you switch the order of your parameters, you would read TEST=1.

Now imagine that first.env is named global.env and second.env is named local.env (or default.env and development.env) and you can see how this helps you with modular configurations. If only the files need not to exist all the time, it would even fit well with git and .gitignore.

The best thing about this feature? You can have as many –env-file parameters as you like (or your operating system allows).

Mixing local and configured environment variables

We don’t have explicit variable expansion (like TEST=${FIRST} or something) with –env-files, but we have a funny poor man’s version of it. Assume that the second.env from the example above contains the following entries:

OS
TEST=2
SECOND=2

You’ve seen that right: The first entry has no value (and no equal sign)! This is when the value is substituted from your operating system:

TEST=2
FIRST=1
OS=Windows_NT
SECOND=2

By just declaring, but not assigning an environment variable it is taken from your own environment. This even works if the variable was already assigned in previous –env-files.

If you don’t believe me, this is a documented feature:

If the operator names an environment variable without specifying a value, then the current value of the named variable is propagated into the container’s environment

https://docs.docker.com/engine/reference/run/#env-environment-variables

And even more specific:

When running the command, the Docker CLI client checks the value the variable has in your local environment and passes it to the container. If no = is provided and that variable is not exported in your local environment, the variable won’t be set in the container.

https://docs.docker.com/engine/reference/commandline/run/#set-environment-variables–e—env—env-file

This is a cool feature, albeit a little bit creepy. Sadly, it doesn’t work in all tools that allow to run docker containers. Last time I checked, PyCharm omitted this feature (as one example).

Epilogue

I’ve presented you with three parts that can be used to manage different configurations for docker containers. There are some pain points (non-optional file existence, feature loss in tools, no direct variable expansion), but also a lot of fun.

Do you know additional tricks and features in regard to environment variables and docker? Comment below or link to your article.