Avoid fragmenting your configuration

Nowadays configuration often is done using environment (aka ENV) variables. They work great using docker/containers, in development and production, on all platforms and using all languages. In short I think environment variables are great for configuration of many aspects of an application.

However, I encountered a pattern in several different applications that I really dislike: Several, fragmented ENV variables for one configurable aspect of the application.

Let us have a look at two examples to see what I mean, then I will try to explain where it could come from and why I think it is bad practice. Finally I will show a better alternative – at least in my opinion.

First real world example

In one javascript app a websocket url was made configurable using 4 (!) ENV variables like this:

WS_PREFIX || "wss://";
WS_HOST || "hostname";
WS_PORT || "";
WS_PATH || "/ws";

function ConnectionString(prefix, host, port, path) {
  return {
    attrib: {
      prefix, 
      host,
      port,
      path,
    },
    string: prefix + host + port + path,
  };
}

We immediately see, that the author wrote a function to deal with the complex configuration in the rest of the application. Not only the devops team or administrators need to supply many ENV variables but they have to supply them in a peculiar way:

The port needs to be specified as :8888, using a leading colon (or the host needs a trailing colon…) which is more than unexpected. The alternative would be a better and more sophisticated implementation of ConnectionString…

Another real example

In the following example the code there are again three ENV variables dealing with hosts, urls and websockets. This examples feels quite convoluted, is hard to understand and definitely needs a refactoring.

TANGOGQL_SOCKET=ws://${TANGO_HOST}:5004/socket

const defaultHost = window.TANGOGQL_HOST ?? "localhost:5004";
const defaultSocketUrl = window.TANGOGQL_SOCKET ?? ws://${defaultHost}/socket;

// dealing with config peculiarities somewhere else
const socketUrl = React.useMemo(() =>
        config.host.replace(/.*:\/\//, "ws://") + "/socket"
    , [config.host]);

Discussion

The examples show clearly that something simple like a configuration for an URL can lead to complicated and hard to use solutions. Most likely the authors tried to not repeat themselves and factored the URLs into the smallest sensible components. While this may sound like a good idea it puts burden on both the developers and the devops team configuring the application.

In my opinion it would be much simpler and more usable for both parties to have complete URLs for the different use cases. Of course this could mean repeating protocols, hostnames and ports if they are the same in the different situations. But just having one or two ENV variables like

WS_URL=wss://myhost:8080/ws
HOST_URL=https://myhost:8080

would be straightforward to use in code and to be configured in the runtime environment. At the same time the chance for errors and the complexity in the configuration is reduced.

Even though certain parts of the URLs are duplicated in the configuration I highly prefer this approach over the presented real world solutions.

Running a Micronaut Service in Docker

Micronaut is a state-of-the-art micro web framework targeting the Java Virtual Machine (JVM). I quite like that you can implement your web application oder service using either Java, Groovy or Kotlin – all using micronaut as their foundation.

You can easily tailor it specifically to your own needs and add features as need be. Micronaut Launch provides a great starting point and get you up and running within minutes.

Prepared for Docker

Web services and containers are a perfect match. Containers and their management software eases running and supervising web services a lot. It is feasible to migrate or scale services and run them separately or in some kind of stack.

Micronaut comes prepared for Docker and allows you to build Dockerfiles and Docker images for your application out-of-the-box. The gradle build files of a micronaut application offer handy tasks like dockerBuild or dockerfile to aid you.

A thing that simplifies deploying one service in multiple scenarios is configuration using environment variables (env vars, ENV).

I am a big fan of using environment variables because this basic mechanism is supported almost everywhere: Windows, Linux, docker, systemd, all (?) programming languages and many IDEs.

It is so simple everyone can deal with it and allows for customization by developers and operators alike.

Configuration using environment variables

Usually, micronaut configuration is stored in YAML format in a file called application.yml. This file is packaged in the application’s executable jar file making it ready to run. Most of the configuration will be fixed for all your deployments but things like the database connection settings or URLs of other services may change with each deployment and are most likely different in development.

Gladly, micronaut provides a mechanism for externalizing configuration.

That way you can use environment variables in application.yml while providing defaults for development for example. Note that you need to put values containing : in backticks. See an example of a database configuration below:

datasources:
  default:
    url: ${JDBC_URL:`jdbc:postgresql://localhost:5432/supermaster`}
    driverClassName: org.postgresql.Driver
    username: ${JDBC_USER:db_user}
    password: ${JDBC_PASSWORD:""}
    dialect: POSTGRES

Having prepared your application this way you can run it using the command line (CLI), via gradle run or with docker run/docker-compose providing environment variables as needed.

# Running the app "my-service" using docker run
docker run --name my-service -p 80:8080 -e JDBC_URL=jdbc:postgresql://prod.databasehost.local:5432/servicedb -e JDBC_USER=mndb -e JDBC_PASSWORD="mnrocks!" my-service

# Running the app using java and CLI
java -DJDBC_URL=jdbc:postgresql://prod.databasehost.local:5432/servicedb -DJDBC_USER=mndb -DJDBC_PASSWORD="mnrocks!" -jar application.jar

Our five types of configuration

Over the years, we came up with a strategy to handle configurable aspects of our software applications. Using five different types of configuration, we are able to provide high customizability with modest effort.

settings © vege / fotoliaConfigurable aspects of software are the magical parts with which you can achieve higher customer satisfaction with relatively modest investment if done right. Your application would be perfect if only this particular factor were of the value three instead of two as it is now. No problem – a little tweak in the configuration files and everything is right. No additional development cost, no compile/build cycle. You can add or increase business value with a simple text editor when things are configurable.

The first problem of this approach is the developer’s decision what to make configurable. Every configurable and therefor variable value of a software system requires some sort of indirection and additional infrastructure. It suddenly counts as user input and needs to be validated and sanitized. If your application environment requires an identifier like a key, the developer needs to come up with a good one, consistent to the existing keys and meaningful enough to make sense to an unsuspecting user. In short, making something configurable is additional and hard work that every developer tries to avoid in the face of tight deadlines and long feature lists.

Over-configurability

Our first approach to configurable content of our applications lead to a situation where everything could be configured, even the name and location of the configuration files themselves. You had to jump through so many hoops to get from the code to the actual value that it was a nightmare to maintain. And it provided virtually no business value at all. No customer ever changed the location of only one configuration file. All they did was to change values inside the configuration files once in a while. Usually, the values were adjusted once at first installation and once some time later, when the improvement of the change could be anticipated. The possibility of the second adjustment usually brings the customer satisfaction.

Under-configurability

So we tried to narrow down the path of configurable aspects by asking our customers for constant values. We are fortunate to have direct customer access and to develop a lot of software based on physics and chemistry, science fields with a high rate of constants. But the attempt to embed natural constants directly in the code failed, too. Soon after we installed the first software of this kind, an important constant related to neutron backscattering was changed – just a bit, but enough to make a difference. Putting important domain values in the code just doesn’t cut it, even if they are labeled constants and haven’t changed for decades.

The five types

A good configurable software application finds the sweet spot between being completely configurable and totally rigid. To help you with this balancing, here are the five types of configuration we identified along our way:

Resources

The section containing the resources of the application isn’t meant to be introspected or edited by the user. It contains mostly binary data like images or media formats and static content like translations. Most resources are even bundled into archive files, so they don’t present themselves as files. All resources are overwritten with every new version of the application, so changing for example an icon is possible, but only has a short-term effect unless it is fed back into the code base. If the resources were deleted, the application would probably boot up, but lack all kind of icons, images and media. Most language content would be replaced by internationalization keys. In short, the application would be usable, but ugly.

(Manufacturer) Settings

We call every configurable option that is definitely predefined by the developer a setting. We group these options into a section called settings. Like resources, settings are overwritten with every new version of the application, so changes should be rare and need to be reported back into development. Settings are configurable if the urgent need arises, but are ultimately owned by the developers and not by the users. The most delicate decision for a developer is to distinguish between a setting and an option. Settings are owned by the developer, options are owned by the user. If the settings were deleted, the application would most likely not boot up or use hard-coded defaults that might not be suited for the given use case. We use settings mostly for feature toggles or dynamically loaded content like menu definitions or team credits.

Options

This is the most interesting type of configurable in terms of user centered business value. Every little bit of information in the option section is only deployed once. As soon as it can be edited by the user, it belongs to the user. We deliver nearly every property, config or ini file as an option. We fill them with nondestructive defaults and adjust the values during the initial deployment, but after that, the user is free to change the files as he likes. This has three important implications for the developer:

  • You can’t rely on the presence of any option entry. Each option entry needs to have a hardcoded fallback value that takes over if the entry is missing in the files.
  • Every new option entry needs to be optional (no pun intended). Since we can’t redeploy the option files, any new entry won’t exist in an existing installation and we can’t force the user to add it. If you can’t find a sensible way to make your option optional, you’re going to have a hard time.
  • If you need to make changes to existing option files, you need to automate it because the number of installations might be huge. We’ve developed our own small domain specific language for update scripts that perform these changes while maintaining readability. Update scripts are the most fragile part of an update deployment and should be avoided whenever possible.

The options are what makes each installation unique, so we take every measure to avoid data loss. All options are in one specific directory tree and can be backuped by a simple copy and paste. Our deliverables don’t contain option files, so they can’t be overwritten by manual copy or extract actions. If the options were deleted, the application would boot up and recreate the initial options with our default values, therefor losing its uniqueness.

(Mutable) Data

The data section is filled with mutable information that gets created by the application itself. It’s more of a database implemented in files than real configuration. The user isn’t encouraged to even look into this section, let alone required to edit anything by hand. If this section would be deleted, the application would lose parts of its current state like lists of pending tasks, but not the carefully adjusted configurables. The application would boot up into a pristine state, but with a suitable configuration.

Archive

The last type isn’t really a configuration, but a place for the application to store the documents it produces as part of its user-related functionality. Only the application writes to the archive, and only in a one-time fashion. Existing content is never altered and rarely deleted. The archive is the place to look for results like measurement data or analysis reports. It’s very important to keep the archive free of any kind of mutable data. If the archive would be deleted, all previously produced result documents would be lost, but the application would work just fine.

Summary

As you’ve seen, we differentiate between five types of configurables, but only two types are “real” configuration: The settings belong to the developer while the options belong to the user. We’ve built over a dozen successful applications using this strategy and are praised for their configurability while our required effort for maintainance is rather low.

Let us know if you have a similar or totally different concept for configurables by dropping a comment.