Using Docker Containers in Development with WebStorm: Next Iteration

We are always in pursuit of improving our build and development infrastructure. Who isn’t?

At Softwareschneiderei, we have about five times as many projects as we have developers (without being overworked, by the way), and each of them comes with its own requirements. So it is important to be able to switch between projects as easily as cloning a git repository, avoiding meticulous configuration of your development machines that might break on any change.

This is the main advantage of the development container (DevContainer) approach (with Docker being the major contender at the moment), and last November, I tried to outline my then-current understanding of integrating such an approach with the JetBrains IDEs. E.g. for WebStorm, there is some kind of support for dockerized run configurations, but it does some weird stuff (see below), and JetBrains has not yet cared enough to make that configurable, or at least to communicate the reasoning behind it.

Preparing our Dev Container

In our projects, we usually have at least two Docker build stages:

  • one to prepare the build platform (this will be used for the DevContainer)
  • one to execute the build itself (only this stage copies actual sources)

There might be more (e.g. for running the build in production, or for further dependencies), but the basic distinction above helps us to speed up the development process already. (Further reading: Docker cache management)

For one of our current React projects (in which I chose to try Vite instead of the outdated Create React App, see also here), the Dockerfile might look like this:

# --------------------------------------------
FROM node:18-bullseye AS build-platform

WORKDIR /opt
COPY package.json .
COPY package-lock.json .

# see comment below
RUN npm install -g vite

RUN npm ci --ignore-scripts
WORKDIR /opt/project

# --------------------------------------------
FROM build-platform AS build-stage

RUN mkdir -p /build/result
COPY . .
CMD npm run build && mv dist /build/result/app

The “build platform” stage can then be used as our Dev Container from the command line (assuming this Dockerfile resides inside your project directory, where src/ etc. are also chilling):

docker build -t build-platform-image --target build-platform .
docker run --rm -v ${PWD}:/opt/project <command_for_starting_dev_server>

Some comments:

  • The RUN step to npm install -g vite is required for a Vite project because our chosen base image node:18-bullseye does not know about the vite binaries. One could improve that by adding another step beforehand, only preparing a vite+node base image and taking advantage of Docker caching from then on (see the sketch after this list).
  • We specifically have to use WORKDIR /opt/project because our mission is to integrate the whole thing with WebStorm. If you are not interested in that, the path is yours to choose.
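
As a rough sketch of that caching improvement (the stage name node-vite-base is made up for illustration and not part of our actual setup), the base preparation could be split into its own stage so the global vite install stays cached:

# sketch only: a reusable node+vite base stage, cached independently of the project
FROM node:18-bullseye AS node-vite-base
RUN npm install -g vite

# the build platform then builds on top of that cached base
FROM node-vite-base AS build-platform
WORKDIR /opt
COPY package.json package-lock.json ./
RUN npm ci --ignore-scripts
WORKDIR /opt/project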

Now, if we were not working against any idiosyncrasies of an IDE, the preparation step “npm ci” would simply give us all our node dependencies in a node_modules/ folder inside the current directory. But because this blog post is going somewhere, we already chose to place that node_modules in the parent folder of the actual WORKDIR. This works because, for lack of its own node_modules, Node will find the one a level above (this fact might change with future Node versions, but for now it holds true).
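
To make that layout explicit, the relevant part of the build-platform image roughly looks like this (a sketch, not a literal listing):

/opt
├── package.json
├── package-lock.json
├── node_modules/        <- installed here by "npm ci"
└── project/             <- WORKDIR; the project sources get mounted here at runtime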

The Challenge with JetBrains

Now, the current JetBrains IDEs allow you to run your project with the node interpreter (containerized within the build-platform image) in the “Run/Debug Configurations” window via

“+” ➔ “npm” ➔ Node interpreter “Add…” ➔ “Add Remote” ➔ “Docker”

then choose the right image (e.g. build-platform-image:latest).

Now enters that strange IDE behaviour that is not really documented or changeable anywhere. If you run this configuration, your current project directory is going to be mounted in two places inside the container:

  • /opt/project
  • /tmp/<temporary UUID>

This mounting behaviour explains why we cannot install our node_modules dependencies inside the container in the /opt/project path – mounting external folders always overrides anything that might exist at the corresponding mount point, e.g. any /opt/project/node_modules will be overwritten by force.

As we took care of that by using the /opt parent folder for the node_modules installation, and we set the WORKDIR to /opt/project, one could think that we can now just call the development server (written as <command_for_starting_dev_server> above).

But we couldn’t!

For reasons that made us question our reality way longer than it made us happy, it turned out that the IDE somehow always chose the /tmp/<uuid> path as WORKDIR. We found no way of changing that. JetBrains doesn’t tell us anything about it, and the “docker run -w / --workdir” parameter did not help. We really had to resort to the less-than-optimal hack of modifying the package.json “scripts” options:

 "scripts": {
    "dev": "vite serve",
    "dev-docker": "cd /opt/project && vite serve",
    ...
  },

The “dev” line was there already (if you use Create React App or something else, this line calls that something else accordingly). We added another script with an explicit “cd /opt/project”. One can then select that script in the new Run Configuration from above, and now it really works.

We do not like this approach because it couples a bad IDE behaviour with hard-coded paths inside our source files. But at least we separate it enough from our other code that it doesn’t destroy anything; in principle, you could still run this thing with npm locally (after running “npm install” on your machine etc.).

Side note: Dealing with the “@esbuild/linux-x64” error

The internet has not widely adopted Vite as a scaffolding / build tool for React projects yet, and one of the problems on our way was a nasty error of the likes of

Error: The package "esbuild-linux-64" could not be found, and is needed by esbuild

We found the best solution for that problem was to add the following to the package.json:

"optionalDependencies": {
    "@esbuild/linux-x64": "0.17.6"
}

… using “optionalDependencies” rather than the other dependency entries because this way, we still allow local installation on a Windows machine. If the dependency were not optional, npm install would just throw a wrong-OS error.

(Note that as a rule, we do not like the default usage of SemVer ^ or ~ inside the package.json – we rather pin every dependency, and do our updates specifically when we know we are paying attention. That makes us less vulnerable to sudden npm-hacks or sneaky surprises in general.)

I hope all this information is useful to you. It took us a considerable amount of thought and research to come to this conclusion, so if you have any further tips or insights, I’d be glad to hear from you!

Fun with docker container environment variables

Docker (as one specific container technology product) is a basic ingredient of our development infrastructure that has steadily gained ground, from the production servers over the build servers to our development machines. And while it is not simple when used for operations, the complexity increases a lot when used for development purposes.

One way to handle complexity is by making the moving parts configurable and using different configurations. A common way to make things configurable with containers is environment variables. Running a container might look like an endurance typing contest if they are used extensively:

docker run --rm \
-e POSTGRES_USER=myuser \
-e POSTGRES_PASSWORD=mysecretpassword \
-e POSTGRES_DB=mydatabase \
-e PGDATA=/var/lib/postgresql/data/pgdata \
ubuntu:22.04 env

This is where our fun begins.

Using an env-file for extensive configurations

The parameter --env-file reads environment variables from a local text file with a simple key=value format:

docker run --rm --env-file my-vars.env ubuntu:22.04 env

The file my-vars.env contains all the variables line by line:

FIRST=1
SECOND=2

If we run the command above in a directory containing the file, we get the following output:

PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=46a23b701dc8
FIRST=1
SECOND=2
HOME=/root

The HOSTNAME might vary, but the FIRST and SECOND environment variables are straight from our file.

The only caveat is that the env-file really has to exist, or we get an error:

docker: open my-vars2.env: The system cannot find the file specified.

My beloved shell

The env-file can be empty, contain only comments (use # to begin them) or whitespace, but it has to be present.

Please be aware that the env-files are different from the .env-file(s) in docker-compose. A lot of fun is lost by this simple statement, like variable expansion. As far as I’m aware, there is no .env-file mechanism in docker itself.

But we can have some kind of variable substitution, too:

Using multiple env-files for layered configurations

If you don’t want to change all your configuration entries all the time, you can layer them: one layer for the “constants”, one layer for global presets and one layer for local overrides. You can achieve this with multiple --env-file parameters; they are evaluated in the order you specify them:

docker run -it --rm --env-file first.env --env-file second.env ubuntu:22.04 env

Let’s assume that the content of first.env is:

TEST=1
FIRST=1

And the content of second.env is:

TEST=2
SECOND=2

The results of our container call are (abbreviated):

TEST=2
FIRST=1
SECOND=2

You can see that the second TEST assignment wins. If you switch the order of your parameters, you would read TEST=1.

Now imagine that first.env is named global.env and second.env is named local.env (or default.env and development.env), and you can see how this helps you with modular configurations. If only the files didn’t have to exist all the time, it would even fit well with git and .gitignore.
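
One hedged way to live with that restriction is a tiny wrapper script that creates a (possibly empty) override file before the run; the file names are just the examples from above:

#!/bin/bash
# make sure the optional local override file exists, then run with layered env-files
[ -f local.env ] || touch local.env
docker run --rm --env-file global.env --env-file local.env ubuntu:22.04 env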

The best thing about this feature? You can have as many --env-file parameters as you like (or your operating system allows).

Mixing local and configured environment variables

We don’t have explicit variable expansion (like TEST=${FIRST} or something) with --env-file, but we have a funny poor man’s version of it. Assume that the second.env from the example above contains the following entries:

OS
TEST=2
SECOND=2

You’ve seen that right: The first entry has no value (and no equal sign)! This is when the value is substituted from your operating system:

TEST=2
FIRST=1
OS=Windows_NT
SECOND=2

By just declaring, but not assigning, an environment variable, it is taken from your own environment. This even works if the variable was already assigned in previous --env-file parameters.

If you don’t believe me, this is a documented feature:

If the operator names an environment variable without specifying a value, then the current value of the named variable is propagated into the container’s environment

https://docs.docker.com/engine/reference/run/#env-environment-variables

And even more specific:

When running the command, the Docker CLI client checks the value the variable has in your local environment and passes it to the container. If no = is provided and that variable is not exported in your local environment, the variable won’t be set in the container.

https://docs.docker.com/engine/reference/commandline/run/#set-environment-variables--e---env---env-file

This is a cool feature, albeit a little bit creepy. Sadly, it doesn’t work in all tools that allow you to run docker containers. Last time I checked, PyCharm omitted this feature (as one example).

Epilogue

I’ve presented you with three parts that can be used to manage different configurations for docker containers. There are some pain points (non-optional file existence, feature loss in tools, no direct variable expansion), but also a lot of fun.

Do you know additional tricks and features in regard to environment variables and docker? Comment below or link to your article.

Using custom Docker containers for development with WebStorm & Co.

Docker has become one of the go-to tools of many developers these days. Not because every project should implement as many technological buzzwords as possible, but because containers offer a great deal of flexibility for a comparatively small setup hassle.

For stuff like node-based applications, using a Dev Container is useful because in principle, you do not need to have any of the npm stuff on your actual machine – not only do you avoid having these monstrous node_modules folders, you also avoid accidental dependencies on some specific configuration that might hold true on your device, but not generally.

For some of these reasons probably, JetBrains included Docker Dev Containers as a kind of “remote” development. In a sense, a docker container can be thought of as a remote machine, regardless of the fact that it shares your local hardware and is just a software abstraction.

In my opinion, JetBrains usually does great software, but there is some weird behaviour in their usage of Docker Dev Containers and it took us a while to find a quite general and IDE-independent solution; I’ll just use WebStorm as an example of something that appeared unusually hard to tame. I guess it will become better eventually.

For now, one might think of using the built-in config like:

  1. New Run Configuration -> npm
  2. Node Interpreter: “…”
  3. “+” -> Add Remote… -> “Docker”
  4. Use an image of your choice, either one of the node base images or a custom one (see below) with its corresponding tag

Now for reasons that seem to be completely undocumented and unavoidable (tell me if you know more!), the IDE forces you to then mount your project to /opt/project inside this container, where it gets mirrored during runtime to some /tmp/<temporary uuid>/ path – and in several of our projects (due to our folder structure, which is not even particularly abnormal) this made the option completely unusable.

The way one can work without these strange idiosyncrasies is as follows:

First, create a Dockerfile in which you do all the required setup. Optionally, you might want to switch the user away from “root” to something more restricted like “node” (even though in development, you probably have your eyes on everything anyway). You can do more custom setup here. This can look like

FROM node:16.18.0-bullseye-slim

WORKDIR /your-home-inside-container
RUN chown node .

COPY package.json package-lock.json /your-home-inside-container/

USER node

RUN npm ci --ignore-scripts

# COPY <whatever you might want> <where you want it inside>

EXPOSE 3000

CMD npm start

From that Dockerfile, build a local image in the same folder like:

# you might need -f if the Dockerfile is not named "Dockerfile"
docker build -t your-dev-image .

Then, create a new Run Configuration but choose “Shell script” (not npm)

docker run -it --rm --entrypoint= -v ${PWD}/src:/your-home-inside-container/src -p 0.0.0.0:3000:3000 your-dev-image

You might use a different “-p” port forwarding if you do not want your development server listening on port 3000 (another advantage of Dev Containers: you can easily run multiple instances on different ports).

This is about the whole magic. But there are two further things that could be important here:

Hot Reloading (live updating whenever source files change)

This is done rather easily, however it seems to change once in a while. We figured out that, at least if you are using react-scripts@5.0.1 (which is what “npm start” addresses, unless you do that differently), you just need to set the environment variable “WATCHPACK_POLLING=true”. I.e. put that in your Dockerfile as

ENV WATCHPACK_POLLING true

or pass it into your docker run ... -e WATCHPACK_POLLING=true ... your-dev-image line.

Routing a development proxy to some “local host”

If your software e.g. addresses a backend that is running on your development machine or in another Docker Dev Container, it cannot just access that host from inside the Docker container. Neither is the port forwarding via “-p …:…” of any use, because that covers the opposite direction – i.e. which port of the container is exposed to outside access – while here, we go the other way.

When the software inside the container actually wants to address “localhost”, it needs to be directed at the hostname under which your local machine appears. Docker has a special hostname for that, and it is host.docker.internal

I.e. if your local backend is running on “localhost:8080” on your machine, you need to tell your Dev Container to direct its requests to “host.docker.internal:8080”.

In one of our projects, we needed some specific control over the proxy that the React development server gives you, and here is a way to gain that control – add a “setupProxy.js” inside your src/ folder and put something like this in it:

const { createProxyMiddleware } = require('http-proxy-middleware');

module.exports = function(app) {
    if (process.env.LOCAL_DEVELOPMENT) {
        return;
    }

    let httpProxyMiddleware = createProxyMiddleware({
        target: process.env.REACT_APP_PROXY || 'http://localhost:8080',
        changeOrigin: true,
    });
    app.use('/api', httpProxyMiddleware); // change to your needs accordingly
};

This way, one can always change the address by setting a REACT_APP_PROXY environment variable as in the step above, and one can also disable the whole proxying by setting the LOCAL_DEVELOPMENT env variable to true. Name these as you like, and you can even extend this setupProxy to include web sockets or different proxies for different routes – if you have any questions on that, just comment below 🙂
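
Putting the pieces together, a run of the Dev Container that points the proxy at a backend on the development machine might look roughly like this (a sketch using the image and paths from above; on plain Linux Docker you may additionally need --add-host=host.docker.internal:host-gateway):

docker run -it --rm --entrypoint= \
  -e REACT_APP_PROXY=http://host.docker.internal:8080 \
  -v ${PWD}/src:/your-home-inside-container/src \
  -p 0.0.0.0:3000:3000 your-dev-image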

Running a containerized ActiveDirectory for developers

If you develop software for larger organizations, one big aspect is integrating it with the existing infrastructure. While you may prefer simple deployments of services in docker containers, a customer may want you to deploy to their WildFly infrastructure, for example.

One common piece of infrastructure is an Active Directory (AD) or plain LDAP service used for organization-wide authentication and authorization. As a small company, we do not have such an infrastructure ourselves, and it would not be a great idea to use it for development anyway.

So how do you develop and test your authentication module without an AD being available for you?

Fortunately, nowadays this is relatively easy using tools like Docker and Samba. Let us see how to put such a development infrastructure up and where the pitfalls are.

Running Samba in a Container

Samba can not only serve Windows shares or act as a domain controller for Microsoft Windows based networks, it also includes a full AD implementation with proper LDAP support. It takes a small amount of work beyond installing Samba in a container to set it up, so we have two small shell scripts for setup and launch in a container. I think most of the Dockerfile and scripts should be self-explanatory and straightforward:

Dockerfile:

FROM ubuntu:20.04

RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install samba krb5-config winbind smbclient 
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install iproute2
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install openssl
RUN DEBIAN_FRONTEND=noninteractive apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install vim

RUN rm /etc/krb5.conf
RUN mkdir -p /opt/ad-scripts

WORKDIR /opt/ad-scripts

CMD chmod +x *.sh && ./samba-ad-setup.sh && ./samba-ad-run.sh

samba-ad-setup.sh:

#!/bin/bash

set -e

info () {
    echo "[INFO] $@"
}

info "Running setup"

# Check if samba is setup
[ -f /var/lib/samba/.setup ] && info "Already setup..." && exit 0

info "Provisioning domain controller..."

info "Given admin password: ${SMB_ADMIN_PASSWORD}"

rm /etc/samba/smb.conf

samba-tool domain provision\
 --server-role=dc\
 --use-rfc2307\
 --dns-backend=SAMBA_INTERNAL\
 --realm=`hostname`\
 --domain=DEV-AD\
 --adminpass=${SMB_ADMIN_PASSWORD}

mv /etc/samba/smb.conf /var/lib/samba/private/smb.conf

touch /var/lib/samba/.setup

Using samba-ad-run.sh, we start samba directly instead of running it as a service, which is what you would do outside a container:

#!/bin/bash

set -e

[ -f /var/lib/samba/.setup ] || {
    >&2 echo "[ERROR] Samba is not setup yet, which should happen automatically. Look for errors!"
    exit 127
}

samba -i -s /var/lib/samba/private/smb.conf

With the scripts and the Dockerfile in place you can simply build the container image using a command like

docker build -t dev-ad -f Dockerfile .

We then run it as follows, using local mounts to preserve the data of the AD we will be using for testing and toying around:

 docker run --name dev-ad --hostname ldap.schneide.dev --privileged -p 636:636 -e SMB_ADMIN_PASSWORD=admin123! -v $PWD/:/opt/ad-scripts -v $PWD/samba-data:/var/lib/samba dev-ad

To have everything running seamlessly, you should add the specified hostname – ldap.schneide.dev in our example – to /etc/hosts so that all tools work as expected, as if it were a real AD host somewhere.
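
On a typical Linux development machine, the corresponding /etc/hosts entry might simply look like this (assuming the container ports are published on localhost):

127.0.0.1   ldap.schneide.dev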

Testing our setup

Now of course you may want to check if your development AD works as expected and maybe add some groups and users which you need for your implementation to work.
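
If you prefer scripted checks and test data, the samba-tool command inside the running container works as well; the user, group and password below are made up for illustration:

docker exec dev-ad samba-tool user create testuser 'Devpassword1!'
docker exec dev-ad samba-tool group add developers
docker exec dev-ad samba-tool group addmembers developers testuser
docker exec dev-ad samba-tool user list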

While there are a bunch of tools for working with an AD/LDAP, I found the old and sturdy LdapAdmin the easiest and most straightforward to use. It comes as one self-contained executable file (downloadable from Sourceforge), ready to use without installation or other hassles.

After getting the container and LdapAdmin up and running and logging in, you should see something like this:

LdapAdmin Window showing our Samba AD

Then you can browse and edit your Active Directory to fit your needs, allowing you to develop your authentication and authorization module based on LDAP.

I hope you found the above useful for your development setup.

The spell that reveals your onboarding decade

Every one of us has started somewhere. By telling you what my first computer was, I also convey a lot about the place and time my journey in IT started. For many of my fellows, it was a Commodore C64 or an Atari 500. But even if I don’t tell you about my first machine, there is a simple “magic spell” that you can cast to at least get a hint about the decade in which my first working days started, 15 years after my first contact with computers.

The spell is just one word: “container”. What a container is and how to use it is bound to the decades. Let me guide you through some typical answers.

Pre-2010 answer

If you entered the industry around the year 2000, a container was a big chunk of software that you preferably installed on an even bigger machine, the infamous “application server”. The container, or “servlet container”, “application container”, or, if you were with the right folks, “enterprise bean container” (in short: EJB-Container) was the central hub to host all of your web applications. If you deployed your application into the container, it handled the rest, like unpacking the web archive, providing resources and publishing to the internet. Typical names of containers were Tomcat, Jetty, JBoss or WildFly. You can probably see them around even today, because the concept itself is appealing. Some aspects of it inevitably lead to problems, though. Resource management was a big topic. Your application wasn’t expected to care for a database connection, a logging context or, sometimes, even security features, because the container provided those things to it. As you can probably imagine, that left your application crippled and unable to function outside a container.

Pre-2010 containers

So if you onboarded more than ten years ago, your first thoughts reacting to the word “container” will be “big machine”, “slow startup” and “logging framework”. There cannot reasonably be more than one container per machine. Maintaining a cluster of containers would be the work of luminaries. Being asked to start a container on your developer machine is a dreadful endeavour. “Booting the container” is a reason to visit the coffee machine.

Post-2010 answer

But if you started your career less than ten years ago, your reaction to the word “container” will be different. Starting in 2013, a technology named “Docker” reinvented an old practice to isolate processes and package them into a transport format. Simplified enough, a container is just the RAM-based projection of an application image. You boot a container by loading the image into RAM. That’s one of the fastest things you can do on a computer (not really, but it fits the story better). Even better, because each container ideally contains just one small application or part of it, you don’t boot one container per machine, you can run dozens at the same time. Each container brings everything it needs with it and only relies on three common external resources being provided: networking, persistent storage and a facility to dump logging output.

Post-2010 containers

It is good practice to partition your application into several containers of the post-2010 kind. It is good practice to have them talk to each other over network, either real or simulated. The lines between actual computers get blurry real fast with this kind of containering.

As a youngster, your first thoughts reacting to the word “container” will be “just one?”, “scale up” and “log output management”. You see an opportunity to maintain a cluster of containers. Being asked to start a container on your developer machine is a no-brainer. “Booting the container” is a reason to automate your container infrastructure.

The reactions to the word “container” are very different, based on socialization period. In the old days, pre-2010 containers were boss fight adversaries. Nowadays, post-2010 containers are helpful spirits that just need to be controlled.

Post-2020 answer?

What better way to control the helpful spirits than to deploy them to an environment that handles unpacking, wiring, providing resources and publishing to the internet? Your application isn’t expected to care for topics like scalability, cluster robustness or load balancing. The environment, your container cluster platform, handles those things for you. There can only be one cluster platform per cloud. Being asked to start a cluster platform on your developer machine – well, that’s just not possible, sorry. Best we can do is a minified version of it. Our applications tend to function poorly outside a cluster platform.

As you hopefully can see, developers of all decades crave a thing they tend to call “container” that they can throw their software into to have it perform well without all the hassle of operations. But as soon as they give away responsibility for the environment, they also give away the possibility of comfortable “developer machine” operations. The goal is the same, just the technicality of what exactly a “container” happens to be changes over time.

What is your “spell” that reveals a lot about the responder?

Docker runtime breaking your container

Docker (or container technology in general) is a great tool to clearly separate the concerns of developers and operations. We use it to simplify various tasks like building projects, packaging them for different platforms and deploying our software onto target machines like staging and production servers. All the specifics of the projects are contained and version controlled using the Dockerfiles and compose files.

Our operations team only needs to provide some infrastructure able to build container images and run them. This works great most of the time and removes a lot of the friction between developers and operations, where in the past snowflaky servers needed to be set up and maintained. Developers often had to ask for specific setups and environments because each project had its own needs. That is all gone with this great container technology. Brave new world. Except when it suddenly does not work anymore.

Help, my deployment container stopped working!

As mentioned above, we use docker to deploy our software to the target machines. These machines are often part of a corporate network protected by firewalls and only accessible using VPN. I already talked about how to use openvpn in a docker container for deployment. So the other day, I was making a release of one of my long-running projects and pressed the deploy button for that project on our jenkins continuous integration server.

But instead of me just leaning back, relaxing and watching the magic work, the deployment failed and the red light lit up! A look into the job output showed that the connection to the target machine was refused. A quick check from the developer machine showed no problem on the receiving side. VPN, target machine and everything else was up and running as usual.

After a quick manual deployment performed with care and with my administrator hat on, I went on an investigation journey…

What was going on?

The deployment job had not changed for several months, the container image had not changed and the rest of the infrastructure was working as expected. After more digging, debugging and narrowing down the problem, I found out that openvpn did not work in the container anymore because of a strange permission-denied error:

Tue May 19 15:24:14 2020 /sbin/ip addr add dev tap0 1xx.xxx.xxx.xxx/22 broadcast 1xx.xxx.xxx.xxx
Tue May 19 15:24:14 2020 /sbin/ip -6 addr add 2axx:1xxx:4:5xxx:9xx:5xxx:5xxx:4xxx/64 dev tap0
RTNETLINK answers: Permission denied
Tue May 19 15:24:14 2020 Linux ip -6 addr add failed: external program exited with error status: 2
Tue May 19 15:24:14 2020 Exiting due to fatal error

This hot trace made it easy to google and revealed the following issue on github: https://github.com/dperson/openvpn-client/issues/75. The cause of all the trouble was changed behaviour of the docker runtime. Our automatic updates had run over the weekend and installed a new package version of the docker runtime (see this excerpt from the apt history log):

containerd.io:amd64 (1.2.13-1, 1.2.13-2)

This subtle change broke my container! After some sacrifices to the whale gods, I went on to implement the fix. Fortunately, there is an easy way to get it working like before. You just have to pass the following command line switch to docker run and everything works as expected:

--sysctl net.ipv6.conf.all.disable_ipv6=0
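
For context, here is a sketch of how that switch fits into a typical openvpn client container invocation (the image name and volume path are placeholders, not our actual deployment setup):

docker run --rm --cap-add=NET_ADMIN --device /dev/net/tun \
  --sysctl net.ipv6.conf.all.disable_ipv6=0 \
  -v ${PWD}/vpn:/vpn some-openvpn-client-image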

As nice as containers are for abstracting away hardware, operating systems and other environment details, sometimes the container runtime shines through. It is just a shame that such things happen on minor releases or package release upgrades…

Containers allot responsibilities anew

Earlier this year, we experienced a strange bug with our invoices. We often add time tables of our work to the invoices and generate them from our time tracking tool. Suddenly, from one invoice to the other, the dates were wrong. Instead of Monday, the entry was listed as Sunday. Every day was shifted one day “to the left”. But we hadn’t released a new version of any of the participating tools in quite some time.

What we did since the last invoice generation though was to dockerize the invoice generation tool. We deployed the same version of the tool into a docker container instead of its own virtual machine. This reduced the footprint of the tool and lowered our machine count, which is a strategic goal of our administrators.

By dockerizing the tool, we also unknowingly decoupled the timezone setting of the container and tool from the timezone setting of the host machine. The host machine is set to the correct timezone, but the docker container was set to UTC, being one hour behind the local timezone. This meant that the time table generation tool didn’t land at midnight of the correct day, but at 23 o’clock of the day before. Side note: If the granularity of your domain data is “days”, it is not advisable to use 00:00 o’clock as the reference time for your technical data. Use something like 12:00 o’clock or adjust your technical data to match the domain and remove the time aspect from your dates.

We needed to adjust the timezone of the docker container by installing the tzdata package and editing some configuration files. This was no big deal once we knew where the bug originated from. But it shows perfectly that docker (as a representative of the container technology) rearranges the responsibilities of developers and operators/administrators and partitions them in a clear-cut way. Before the dockerization, the timezone information was provided by the host and maintained by the administrator. Afterwards, it is provided by the container and therefore maintained by the developers. If containers are immutable service units, their creators need to accommodate all the operation parameters that were part of the “environment” beforehand. And the environment is provided by the operators.
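
For illustration, a minimal sketch of such a timezone fix in a Dockerfile (the base image and timezone are examples, not the actual setup of our invoice tool):

FROM ubuntu:22.04
ENV TZ=Europe/Berlin
RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get -y install tzdata \
 && ln -snf /usr/share/zoneinfo/$TZ /etc/localtime && echo $TZ > /etc/timezone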

So we see one thing clearly: Docker and container technology per se partitions the responsibilities between developers and operators in a new way, but with a clear distinction: Everything is developer responsibility as long as the operators provide ports and volumes (network and persistent storage). Volume backup remains the responsibility of operations, but formatting and upgrading the volume’s content is a developer task all of a sudden. In a containerized world, the operators don’t know you are using a NoSQL database and they really don’t care anymore. It’s just one container more in the zoo.

I like this new partitioning of responsibilities. It assigns them for technical reasons, so you don’t have to find an answer in each organization anew. It hides a lot of detail from the operators who can concentrate on their core responsibilities. Developers don’t need to ask lots of questions about their target environment, they can define and deliver their target environment themselves. This reduces friction between the two parties, even if developers are now burdened with more decisions.

In my example from the beginning, the classic way of communication would have been that the developers ask the administrator/operator to fix the timezone on the production system because they have it right on all their developer machines. The new way of communication is that the timezone settings are developer responsibility and now the operator asks the developers to fix it in their container creation process. And, by the way, every developer could have seen the bug during development because the developer environment matches the production environment by definition.

This new partition reduces the gray area between the two responsibility zones of developers and operators and makes communication and coordination between them easier. And that is the most positive aspect of container technology in my eyes.

Using parameterized docker builds

Docker is a great addition to your DevOps toolbox. Sometimes you may want to build several similar images using the same Dockerfile. That’s where parameterized docker builds come in:

They offer the ability to provide configuration values at image build time. Do not confuse this with environment variables passed when running the container! We used parameterized builds, for example, to build images for creating distribution-specific packages of native libraries and executables.

Our use case

We needed to package some proprietary native programs for several linux distribution versions, in our case openSUSE Leap. Build ARGs allow us to use a single Dockerfile, but build several images from it and run them to build the packages for each distribution version. This can easily be achieved using so-called multi-configuration jobs in the jenkins continuous integration server. But let us take a look at the Dockerfile first:

ARG LEAP_VERSION=15.1
FROM opensuse/leap:$LEAP_VERSION
ARG LEAP_VERSION=15.1

# add our target repository
RUN zypper ar http://our-private-rpm-repository.company.org/repo/leap-$LEAP_VERSION/ COMPANY_REPO

# install some pre-requisites
RUN zypper -n --no-gpg-checks refresh && zypper -n install rpm-build gcc-c++

WORKDIR /buildroot

CMD rpmbuild --define "_topdir `pwd`" -bb packaging/project.spec

Notice that the ARG instruction defines a parameter name and a default value. That allows us to configure the image at build time using the --build-arg command line flag. Now we can build a docker image for Leap 15.0 using a command like:

docker build -t project-build --build-arg LEAP_VERSION=15.0 -f docker/Dockerfile .

In our multi-configuration jobs, we call docker build with the variable from the axis definition to build several images in one job using the same Dockerfile, roughly as sketched below.
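
A sketch of such a shell build step, assuming LEAP_VERSION is the name of the configuration axis (not our literal job definition):

docker build -t project-build-$LEAP_VERSION --build-arg LEAP_VERSION=$LEAP_VERSION -f docker/Dockerfile .
docker run --rm -v $PWD:/buildroot project-build-$LEAP_VERSION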

A gotcha

As you may have noticed, we put the same ARG instruction twice into the Dockerfile: once before the FROM instruction and another time after FROM. This is because build args are cleared after each FROM instruction. So beware of this in multi-stage builds, too. For more information see the docker documentation and this discussion. This had cost us quite some time, as it was not as clearly documented at the time.
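
To illustrate the gotcha with a contrived sketch (not part of our build): without repeating the ARG after FROM, the variable is simply empty in later instructions.

ARG LEAP_VERSION=15.1
FROM opensuse/leap:$LEAP_VERSION
# without a second "ARG LEAP_VERSION" here, the next line prints an empty value
RUN echo "Building for Leap '$LEAP_VERSION'"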

Conclusion

Parameterized builds allow for easy configuration of your Docker images at image build time. This increases flexibility and reduces duplication like maintaining several almost identical Dockerfiles. For runtime container configuration, provide environment variables to the docker run command.

Ansible in Jenkins

Ansible is a powerful tool for automating your IT infrastructure. In contrast to chef or puppet, it does not need much infrastructure like a server and client (“agent”) programs on your target machines. We like to use it for keeping our servers and desktop machines up-to-date and provisioned in a defined, repeatable and self-documented way.

As of late, ansible has begun to replace our various custom-made – but already automated – deployment processes, which we implemented using different tools like ant scripts run by jenkins jobs. The natural way of using ansible for deployment in our current infrastructure would be to run it from jenkins with the jenkins ansible plugin.

Even though the plugin supports the “Global Tool Configuration” mechanism and automatic management of several ansible installations, it did not work out of the box for us:

At first, the executable path was not set correctly. We managed to fix that, but then the next problem arose: our standard build slaves had no jinja2 (the python templating library) installed. Sure, those are problems you can easily fix if you decide to.

For us, it was too much tinkering and snowflaking of our build slaves to be feasible, and we took another route that you may want to consider: running ansible from a docker image.

We already have a host for running docker containers attached to jenkins, so our current state of deployment with ansible roughly consists of a Dockerfile and a Jenkins job to run the container.

The Dockerfile is as simple as


FROM ubuntu:14.04
RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get -y dist-upgrade && apt-get -y install software-properties-common
RUN DEBIAN_FRONTEND=noninteractive apt-add-repository ppa:ansible/ansible-2.4
RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get -y install ansible

# Setup work dir
WORKDIR /project/provisioning

# Copy project directory into container
COPY . /project

# Deploy the project
CMD ansible-playbook -i inventory deploy-project.yml

And the jenkins build step to actually run the deployment looks like


docker build -t project-deploy .
docker run project-deploy

That way, we can tailor our deployment machine to conveniently run our ansible playbooks for the specific project without modifying our normal build slave setups or adding complexity on their side. All the tinkering with the jenkins ansible plugin is unnecessary when going this way and relying on docker and what the container provides for running ansible.

Recap of the Schneide Dev Brunch 2015-08-09

If you couldn’t attend the Schneide Dev Brunch at 9th of August 2015, here is a summary of the main topics.

Two weeks ago, we held another Schneide Dev Brunch, a regular brunch on the second Sunday of every other (even) month, only that all attendees want to talk about software development and various other topics. So if you bring a software-related topic along with your food, everyone has something to share. The brunch was well attended this time, with enough stuff to talk about. As usual, a lot of topics and chatter were exchanged. This recapitulation tries to highlight the main topics of the brunch, but cannot reiterate everything that was spoken. If you were there, you will probably find this list inconclusive:

News on Docker

Docker is the hottest topic among developers and operators in 2015. No wonder we started chatting about it the minute we sat down. There are currently two interesting platform projects that provide runtime services for docker: Tutum (commercial) and Rancher (open source). We all noted the names and will check them out. The next interesting fact was that Docker is programmed in the Go language. The team probably one day decided to give it a go.

Air Conditioning

We all experienced the hot spell this summer and observed that work in the traditional sense is impossible beyond 30° Celsius. Why there are still so few air-conditioned offices in our region is beyond our grasp. Especially since it’s possible to power the air conditioning system with green electricity and let solar power deal with the problem that, well, the sun brought us. In 2015 alone, at least two work weeks were lost to the heat. The productivity gain from cooling should outweigh the costs.

License Management

We talked about how different organisations deal with the challenge of software license management. Nearly every big company has a tool that does essentially the same license management but has its own cool name. Other than that, bad license management is such a great productivity killer that even air conditioning wouldn’t offset it.

Windows 10

Even if we are largely operating system agnostic, the release of Windows 10 is hot news. A few of our participants already tried it and concluded that “it’s another Windows”. A rather confusing aspect is the split system settings. And you have to abdicate the Cortana assistant if you want to avoid the data gathering.

Patch Management

A rather depressing topic was the discussion about security patches. I’ll just repeat two highlights: a substantial number of servers on the internet are still vulnerable to the Heartbleed attack. And if a car manufacturer starts a big recall campaign with cost-free replacements, less than 10 percent of the entitled cars are actually fixed on average. This explicitly includes safety-critical issues. That shouldn’t excuse us as an industry for our own shortcomings, and it’s not reassuring to see that other industries face the same problems.

Self-Driving Cars

We digressed into the future hype topic of self-driving cars. I can’t reiterate the complete discussion, but we agreed that those cars will hit the streets within the next ten years. The first use case will be freight transport, because the cargo doesn’t mind if the driver is absent and efficiency matters a lot in logistics. Plus, machines don’t need breaks. Ok, those were enough puns on the topic. Sorry.

Tests on Interfaces

An interesting question was how to build tests that can ensure a class or interface contract. Much like regression tests for recently broken functionality, compatibility tests should deal with backward compatibility issues in the interface. It turns out the Eclipse foundation gave the topic some thought and came up with an exhaustive list of aspects to check. There are even some tools that might come in handy if you want to compare two versions of an API.

API Design

When the topic of API Design came up, some veterans of the Schneide Events immediately mentioned the API Design Fest we held in November 2013 to get our noses bloody on API design. Well, bleed we did. The most important take-away from the Fest was that if you plan to publish an API that can endure some years in production while being enhanced and improved, you just shouldn’t do it. Really, don’t do it, it’s probably a bad idea and you lack the required skill without even knowing it. If you want to know, participate or even host an API Design Fest.

And if you happen to design a web-based API, you might abandon backward compatibility by offering several distinct “versions” of APIs of a service. The version is included in the API URL, and acts more like a name than a version. This will ease your burden a bit. A nice reference resource might also be the PayPal API style guide.

Let’s just agree that API design is really hard and should not be done until it’s clear you don’t suffer from Dunning-Kruger effect symptoms too much.

Performance Tests

We talked about the most effective setup of performance tests. There were a lot of ideas, and we anchored the topic around these cornerstones:

  • There was a nearly heroic effort from the Eclipse development team to measure their IDE performance, especially to compare different versions of the IDE. The Eclipse Test & Performance Tools Platform (TPTP) was (as in: discontinued) a toolkit of interesting approaches to the topic. The IDE itself was measured by performance fingerprints like this example from 2011. As far as we know, all those things ceased to exist.
  • At the last Java Forum Stuttgart, there was a talk about performance testing from an experienced tester that loved to give specific advice. The slides can be viewed online in german language (well, not really, but the talk was).
  • The book Release It! has a lot of insights to this topic. It’s one of the bigger books on the pragmatic bookshelf.
  • The engineers at NetFlix actually did a lot of thinking about the topic. They came up with Hystrix, a resilience library, aimed to make it easier to prevent complete system blackouts. They also came up with Chaos Monkey, a service that makes it easier to have a complete system blackout. If we can say anything about NetFlix, it is that they definitely approach their problems from the right angle.

Company Culture

Leaking over from the previous topic about effective performance-related measures, we talked about different company cultures, especially in regard to centralized human resources departments and works councils (German: Betriebsrat). We agreed that it is very difficult to maintain a certain culture under continued growth. We also agreed that culture trickles down from top management.

OpenGL

The last topic on this Dev Brunch was about the rendering of text or single characters in OpenGL. By using signed distance fields, you can render text more crisply and still only use cheap computation instructions. There is a paper from Valve on the topic that highlights the benefits and gives a list of additional reading. It’s always cool to learn about something simple that actually improves things.

Epilogue

As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The number of attendees makes for a unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei. And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.