Ansible for deployment with docker

We are using Jenkins not only for our continuous integration needs but also for running deployments at the push of a button. Back in the dark times™ this often meant using Apache Ant in combination with JSch to copy the project's artifacts to some target machine and execute some remote commands over ssh.
After gathering some experience with Ansible and docker we took a new shot at the task and ended up with one-shot docker containers executing some ansible scripts. The general process stayed the same but the new solution is more robust in general and easier to maintain.
In addition to the project- and deployment-process-specific simplifications we can now use any jenkins slave with a docker installation and network access. No need for an ant installation and a java runtime anymore.
However, there are some quirks you have to overcome to get the whole thing running. Let us see how we can accomplish non-interactive (read: fully automated) deployments using the tools mentioned above.

The general process

The deployment job in Jenkins has to perform the following steps:

  1. Fetch the artifacts to deploy from somewhere, e.g. a release job
  2. Build the docker image provisioned with ansible
  3. Run a container of the image providing credentials and artifacts through the environment and/or volumes

The first step can easily be implemented using the Copy Artifact Plugin.

The Dockerfile

You can use almost any Linux base image for your deployment needs as long as the distribution comes with ansible. I chose Ubuntu 18.04 LTS because it is sufficiently up-to-date and we are quite familiar with it. In addition to ssh and ansible we only need the command for running the playbook. In our case the playbook and ssh-credentials reside in the deployment folder of our projects, of course encrypted in an ansible vault. To be able to connect to the target machine we need the vault password to decrypt the ssh password or private key. Therefore we inject the vault password as an environment variable when running the container.

A simple Dockerfile may look as follows:

FROM ubuntu:18.04
RUN DEBIAN_FRONTEND=noninteractive apt-get update
RUN DEBIAN_FRONTEND=noninteractive apt-get -y dist-upgrade
RUN DEBIAN_FRONTEND=noninteractive apt-get -y install software-properties-common
RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get -y install ansible ssh

SHELL ["/bin/bash", "-c"]
# A place for our vault password file
RUN mkdir /ansible/

# Setup work dir
WORKDIR /deployment

# Copy all host-specific deployment data; ${TARGET_HOST} selects the right subfolder at runtime
COPY deployment/ .

# Deploy the project: decrypt the credentials, run the playbook, re-encrypt and clean up
CMD echo -e "${VAULT_PASSWORD}" > /ansible/vault.access && \
ansible-vault decrypt --vault-password-file=/ansible/vault.access ${TARGET_HOST}/credentials/deployer && \
ansible-playbook -i ${TARGET_HOST}, -u root --key-file ${TARGET_HOST}/credentials/deployer ansible/deploy.yml && \
ansible-vault encrypt --vault-password-file=/ansible/vault.access ${TARGET_HOST}/credentials/deployer && \
rm -rf /ansible

Note that we create a temporary vault password file using the ${VAULT_PASSWORD} environment variable for ansible vault decryption. A second noteworthy detail is the ad-hoc inventory using the ${TARGET_HOST} environment variable terminated with a comma. Using this variable we also reference the correct credential files and other host-specific data files. That way the image will not contain any plain text credentials.
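
Just to illustrate what the comma-terminated ad-hoc inventory means, a manual invocation against a hypothetical host (the host name here is made up) would look like this:

ansible-playbook -i deploy.example.com, -u root --key-file credentials/deployer ansible/deploy.yml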

The Ansible playbook

The playbook can be quite simple depending on your needs. In our case we deploy a WAR-archive to a tomcat container and make a backup copy of the old application first:

---
- hosts: all
  vars:
    artifact_name: app.war
    webapp_directory: /var/lib/tomcat8/webapps
  become: true

  tasks:
  - name: copy files to target machine
    copy: src={{ item }} dest=/tmp
    with_fileglob:
    - "/artifact/{{ artifact_name }}"
    - "../{{ inventory_hostname }}/context.xml"

  - name: stop our servlet container
    systemd:
      name: tomcat8
      state: stopped

  - name: make backup of old artifact
    command: mv "{{ webapp_directory }}/{{ artifact_name }}" "{{ webapp_directory }}/{{ artifact_name }}.bak"

  - name: delete exploded webapp
    file:
      path: "{{ webapp_directory }}/app"
      state: absent

  - name: mv artifact to webapp directory
    copy: src="/tmp/{{ artifact_name }}" dest="{{ webapp_directory }}" remote_src=yes
    notify: restart app-server

  - name: mv context file to container context
    copy: src=/tmp/context.xml dest=/etc/tomcat8/Catalina/localhost/app.xml remote_src=yes
    notify: restart app-server

  handlers:
  - name: restart app-server
    systemd:
      name: tomcat8
      state: started

The declarative syntax and the use of handlers make the process quite straightforward. Of note here is the - hosts: all line that is used with our command line ad-hoc inventory. That way we do not need an inventory file in our project repository or somewhere else. The host can be provided by the deployment job. That brings us to the Jenkins job that puts everything together.

The Jenkins job

As you probably already noticed in the parts above there are several inputs required by the container and the ansible script:

  1. A volume with the artifacts to deploy
  2. A volume with host specific credentials and data files
  3. Vault credentials
  4. The target host(s)

We inject passwords using the EnvInject Plugin which stores them encrypted and masks them in all the logs. I like to prefix such environment variables in Jenkins with JOB_ to denote their origin. After the copying build step which puts the artifacts into the artifact directory we can use a small shell script to build the image and run the container with all the needed stuff wired up. The script may look like the following:

docker build -t app-deploy -f deployment/docker/Dockerfile .
docker run --rm \
-e VAULT_PASSWORD=${JOB_VAULT_PASSWORD} \
-e TARGET_HOST=${JOB_TARGET_HOST} \
-e ANSIBLE_HOST_KEY_CHECKING=False \
-v `pwd`/artifact:/artifact \
app-deploy

DOCKER_RUN_RESULT=$?
exit $DOCKER_RUN_RESULT

We run the container with the --rm flag to delete it right after execution leaving nothing behind. We also disable host key checking for ansible to keep the build non-interactive. The last two lines let the jenkins job report the result of the container execution.

Wrapping it up

Essentially we have a deployment consisting of three parts: a docker image definition of the required environment, an ansible playbook performing the real work and a jenkins job with a small shell script wiring it all together so that it can be executed at the push of a button. We do not persistently store any clear text credentials in any of these places, so we are reasonably secure when running the deployment.

Using Ansible vault for sensitive data

We like using ansible for our automation because it has minimal requirements for the target machines and the surrounding infrastructure. You need nothing more than ssh and python with some libraries. In contrast to alternatives like puppet and chef you do not need special server and client programs running all the time and communicating with each other.

The problem

When setting up remote machines and deploying software systems for your customers you will often have to use sensitive data like private keys, passwords and maybe machine or account names. On the one hand you want to put your automation scripts and their data under version control and use them from your continuous integration infrastructure. On the other hand you do not want to spread the secrets of your customers all around your infrastructure and definitely never ever in your source code repository.

The solution

Ansible supports encrypting sensitive data and using it in playbooks with the concept of vaults and the accompanying commands. Setting it up requires some work but then usage is straightforward and works seamlessly.

The high-level conversion process is the following:

  1. create a directory for the data to substitute on a host or group basis
  2. extract all sensitive variables into vars.yml
  3. copy vars.yml to vault.yml
  4. prefix variables in vault.yml with vault_
  5. use vault variables in vars.yml

Then you can encrypt vault.yml using the ansible-vault command providing a password.

All you have to do subsequently is to provide the vault password along with your usual playbook commands. Decryption for playbook execution is done transparently on-the-fly for you, so you do not need to care about decryption and encryption of your vault unless you need to update the data in there.
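
Only when the secrets themselves need updating do you touch the vault again, e.g. using the edit subcommand which decrypts the file, opens your editor and re-encrypts it on save:

ansible-vault edit vault.yml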

The step-by-step guide

Suppose we want to work on a target machine run by your customer who provides you access via ssh. You do not want to store your ssh user name and password in your repository but want to be able to run the automation scripts unattended, e.g. from a jenkins job. Let us call the target machine ceres.

So first you set up the directory structure by creating a directory for the target machine called $ansible_script_root$/host_vars/ceres.
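
Assuming you are in the root directory of your ansible scripts this is a one-liner:

mkdir -p host_vars/ceres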

To log into the machine we need two sensitive variables: ansible_user and ansible_ssh_pass. We put them into a file called $ansible_script_root$/host_vars/ceres/vars.yml:

ansible_user: our_customer_ssh_account
ansible_ssh_pass: our_target_machine_pwd

Then we copy vars.yml to vault.yml and prefix the variables with vault_ resulting in $ansible_script_root$/host_vars/ceres/vault.yml with content of:

vault_ansible_user: our_customer_ssh_account
vault_ansible_ssh_pass: our_target_machine_pwd

Now we use these new variables in our vars.yml like this:

ansible_user: "{{ vault_ansible_user }}"
ansible_ssh_pass: "{{ vault_ansible_ssh_pass }}"

Now it is time to encrypt the vault using the following command, which will prompt you for the vault password:

ansible-vault encrypt host_vars/ceres/vault.yml

resulting in an encrypted vault that can be put into source control. It looks something like this:

$ANSIBLE_VAULT;1.1;AES256
35323233613539343135363737353931636263653063666535643766326566623461636166343963
3834323363633837373437626532366166366338653963320a663732633361323264316339356435
33633861316565653461666230386663323536616535363639383666613431663765643639383666
3739356261353566650a383035656266303135656233343437373835313639613865636436343865
63353631313766633535646263613564333965343163343434343530626361663430613264336130
63383862316361363237373039663131363231616338646365316236336362376566376236323339
30376166623739643261306363643962353534376232663631663033323163386135326463656530
33316561376363303339383365333235353931623837356362393961356433313739653232326638
3036

Using your playbook looks similar to before, you just need to provide the vault password using one of several options like specifying a password file, pointing the ANSIBLE_VAULT_PASSWORD_FILE environment variable to such a file or interactive input. In our example we simply let ansible ask for it:

ansible-playbook --ask-vault-pass -i inventory work-on-customer-machines.yml

After setting up your environment appropriately with a password file and the ANSIBLE_VAULT_PASSWORD_FILE environment variable your playbook commands are exactly the same as without using a vault.
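
A minimal sketch of such a setup, assuming you keep the password in a file readable only by the executing user (the file name is chosen just for illustration):

echo "ourpwd" > ~/.ansible-vault.pass
chmod 600 ~/.ansible-vault.pass
export ANSIBLE_VAULT_PASSWORD_FILE=~/.ansible-vault.pass
ansible-playbook -i inventory work-on-customer-machines.yml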

Conclusion

The ansible vault feature allows you to safely store and use sensitive data in your infrastructure without changing much about how you use your automation scripts.

Ansible in Jenkins

Ansible is a powerful tool for automation of your IT infrastructure. In contrast to chef or puppet it does not need much infrastructure like a server and client (“agent”) programs on your target machines. We like to use it for keeping our servers and desktop machines up-to-date and provisioned in a defined, repeatable and self-documented way.

As of late ansible has begun to replace our different, custom-made – but already automated – deployment processes we implemented using different tools like ant scripts run by jenkins jobs. The natural way of using ansible for deployment in our current infrastructure would be using it from jenkins with the jenkins ansible plugin.

Even though the plugin supports the “Global Tool Configuration” mechanism and automatic management of several ansible installations it did not work out of the box for us:

At first, the executable path was not set correctly. We managed to fix that but then the next problem arose: Our standard build slaves had no jinja2 (python templating library) installed. Sure, these are problems you can easily fix if you decide to.

For us it was too much tinkering and snowflaking of our build slaves to be feasible, so we took another route that you can consider too: running ansible from a docker image.

We already have a host for running docker containers attached to jenkins so our current state of deployment with ansible roughly consists of a Dockerfile and a Jenkins job to run the container.

The Dockerfile is as simple as


FROM ubuntu:14.04
RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get -y dist-upgrade && apt-get -y install software-properties-common
RUN DEBIAN_FRONTEND=noninteractive apt-add-repository ppa:ansible/ansible-2.4
RUN DEBIAN_FRONTEND=noninteractive apt-get update && apt-get -y install ansible

# Setup work dir
WORKDIR /project/provisioning

# Copy project directory into container
COPY . /project

# Deploy the project
CMD ansible-playbook -i inventory deploy-project.yml

And the jenkins build step to actually run the deployment looks like


docker build -t project-deploy .
docker run project-deploy

That way we can tailor our deployment machine to conveniently run our ansible playbooks for the specific project without modifying our normal build slave setups and adding complexity on their side. All the tinkering with the jenkins ansible plugin becomes unnecessary this way; we simply rely on docker and what the container provides for running ansible.

Gradle projects as Debian packages

Gradle is a great tool for setting up and building your Java projects. If you want to deliver them for Ubuntu or other debian-based distributions you should consider building .deb packages. Because of the quite steep learning curve of debian packaging I want to show you a step-by-step guide to get you up to speed.

Prerequisites

You have a project that can be built by gradle using the gradle wrapper. In addition you have a debian-based system where you can install and use the packaging utilities used to create the package metadata and the final packages.

To prepare the debian system you have to install some packages:

sudo apt install dh-make debhelper javahelper

Generating packaging infrastructure

First we have to generate all the files necessary to build full fledged debian packages. Fortunately, there is a tool for that called dh_make. To correctly prefill the maintainer name and e-mail address we have to set two environment variables. Of course, you could change them later…

export DEBFULLNAME="John Doe"
export DEBEMAIL="john.doe@company.net"
cd $project_root
dh_make --native -p ${project_name}_${version}

Choose “indep binary” (“i”) as the type of package because Java is architecture independent. This will generate the debian directory containing all the files for creating .deb packages. You can safely ignore all of the files ending with .ex as they are examples for features like manpage-generation, additional scripts for pre- and post-installation and many other aspects.
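
If you prefer a tidy debian directory you can simply delete the unused example files:

rm debian/*.ex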

We will concentrate on only two files that will allow us to build a nice basic package of our software:

  1. control
  2. rules

Adding metadata for our Java project

In the control file fill in all the properties relevant for your project. They will help your users understand what the package contains and whom to contact in case of problems. You should add the JRE to Depends, e.g.:

Depends: openjdk-8-jre, ${misc:Depends}

If you have other dependencies that can be resolved by packages of the distribution add them there, too.

Define the rules for building our Java project

The most important file is the rules makefile which defines how our project is built and what the resulting package contents consist of. For this to work with gradle we use the javahelper dh_make extension and override some targets to tune the results. Key in all this is that the directory debian/$project_name/ contains a directory structure with all our files we want to install on the target machine. In our example we will put everything into the directory /opt/my_project.

#!/usr/bin/make -f
# -*- makefile -*-

# Uncomment this to turn on verbose mode.
#export DH_VERBOSE=1

%:
	dh $@ --with javahelper # use the javahelper extension

override_dh_auto_build:
	export GRADLE_USER_HOME="`pwd`/gradle"; \
	export GRADLE_OPTS="-Dorg.gradle.daemon=false -Xmx512m"; \
	./gradlew assemble; \
	./gradlew test

override_dh_auto_install:
	dh_auto_install
# here we can install additional files like an upstart configuration
	export UPSTART_TARGET_DIR=debian/my_project/etc/init/; \
	mkdir -p $${UPSTART_TARGET_DIR}; \
	install -m 644 debian/my_project.conf $${UPSTART_TARGET_DIR};

# additional install target of javahelper
override_jh_installlibs:
	LIB_DIR="debian/my_project/opt/my_project/lib"; \
	mkdir -p $${LIB_DIR}; \
	install lib/*.jar $${LIB_DIR}; \
	install build/libs/*.jar $${LIB_DIR};
	BIN_DIR="debian/my_project/opt/my_project/bin"; \
	mkdir -p $${BIN_DIR}; \
	install build/scripts/my_project_start_script.sh $${BIN_DIR};

Most of the above should be self-explanatory. Here are some things that cost me some time and that I found noteworthy:

  • Newer Gradle versions use a lot of memory and try to start a daemon, which does not help you on your build slaves (if using a continuous integration system)
  • The rules file is in GNU make syntax and executes each command separately. So you have to make sure everything is on “one line” if you want to access environment variables for example. This is achieved by \ as continuation character.
  • You have to escape the $ to use shell variables.
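
With control and rules prepared, the package itself can then be built from the project root with the standard debian tooling, for example as an unsigned binary package:

dpkg-buildpackage -us -uc -b

The resulting .deb file is placed in the parent directory and can be installed with dpkg -i or via apt.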

Summary

Debian packaging can be daunting at first but using and understanding the tools you can build new packages of your projects in a few minutes. I hope this guide helps you to find a starting point for your gradle-based projects.

Debian packaging against the rules

In a former post I talked about packaging your own software in the most convenient and natural way for the target audience. Think of an MSI or .exe installer for Microsoft Windows, distribution specific packages for Linux (maybe even by providing your own repositories) or smartphone apps via the standard app stores. In the case of Debian packages there are quite strict rules about filesystem layout, licensing and signatures. This is all fine if you want to get your software upstream into official repositories.

If you are developing commercial software for specific clients things may be different! I suggest doing what serves the client's user experience (UX) best, even in regard to packaging for debian or linux.

Packaging for your users

Packaging for Linux means you need to make sure that your dependencies and versioning are well defined. If you miss out here problems will arise in updating your software. Other things you may consider even if they are against the rules:

  • Putting your whole application with executables, libraries, configuration and resources under the same prefix, e.g. /opt/${my_project} or /usr/local/${my_project}. That way the user finds everything in one place instead of scattered around in the file system.
    • On debian this has some implications like the need to use the conffiles-feature for your configuration
  • Package together what belongs together. Oftentimes it has no real benefit to split headers, libraries, executables etc. into different packages. Fewer packages make it easier for the clients to handle.
  • Provide integration with operating system facilities like systemd or the desktop. Such a seamless integration eases use and administration of your software as no “new tricks” have to be learned.
    • A simple way for systemd is a unit file that calls an executable with an environment file for configuration
  • Adjust the user's PATH or put links to your executables in well known directories like /usr/bin. Running your software from the command line should be easy and come with sensible defaults. Show sample usages to the user so they can apply “monkey see – monkey do”.
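
For the last point a symlink is often all it takes; a minimal sketch, assuming your executable lives under the prefix suggested above:

ln -s /opt/my_project/bin/my-server /usr/bin/my-server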

Example of a unit file:

[Unit]
Description=My Server

[Service]
EnvironmentFile=/opt/my_project/my-server.env
ExecStart=/usr/bin/my-server

[Install]
WantedBy=multi-user.target

In the environment file you can point to other configuration files like XML configs or the like if need be. Environment variables in general are a quite powerful way to customize the behaviour of a program on a per-process basis, so make sure your start scripts or executables support them for manual experimentation, too.
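
A minimal environment file matching the unit above could look like this (the variable names are made up for illustration):

# /opt/my_project/my-server.env
MY_SERVER_PORT=8080
MY_SERVER_CONFIG=/opt/my_project/etc/my-server.xml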

Possible additional preparations

If you plan to deliver your packages without providing your own repository and want to enable your clients to install them easily themselves you can further aid them.

If the target machines are few and can easily be prepared by you, install tools like gdebi that allow installation using double click and a graphical interface.

If the target machines are numerous implement automation with tools like ansible and ensure unattended installation/update procedures.

Point your clients to easy tools they are feeling comfortable with. That could of course be a command line utility like aptitude, too.

What to keep in mind

There is seldom a one-size-fits-all in custom software. Do what fits the project and your target audience best. Do not fear to break some rules if it improves the overall UX of your service.

Packaging Python projects for Debian/Ubuntu

Deployment of software using built-in software management tools is very convenient and provides a nice user experience (UX) for the users. For debian-based linux distributions like Ubuntu packaging software in .deb-packages is the way to go. So how can we prepare our python projects for packaging as a deb-package? The good news is that python is supported out-of-the-box in the debian package build system.

Alternatively, you can use the distutils-extension stdeb if you do not need complete flexibility in creating the packages.
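
For completeness, a quick sketch of the stdeb route (not used in the rest of this post): once the extension is installed it can build a .deb directly from setup.py:

pip install stdeb
python setup.py --command-packages=stdeb.command bdist_deb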

Basic python deb-package

If you are using setuptools/distutils for your python project, debian packaging consists of editing the package metadata and adding --with python to the rules file. For a nice headstart we can generate templates of the debian metadata files using two simple commands (dh_make is provided by the dh-make package):

# create a tarball with the current project sources
python setup.py sdist
# generate the debian package metadata files 
dh_make -p ${project_name}_${version} -f dist/${project_name}-${version}.tar.gz 

You have to edit at least the control-file, the changelog and the rules-file to build the python package. In the rules-file the make-target % is the crucial point and should include the flag to build a python project:

# main packaging script based on dh7 syntax
%:
	dh $@ --with python

After that you can build the package by issuing dpkg-buildpackage.

The caveats

The debian packaging system is great at complaining about non-conformant aspects of your package. It demands digital signatures, correct file and directory names including version strings etc. Unfortunately it is not very helpful when you make packaging mistakes resulting in empty, incomplete or broken packages.

Issues with setup.py

The setup.py build script has to reside on the same level as the debian-directory containing the package metadata. The packaging tools will not tell you if they cannot find the setup script. In addition they will always run setup.py using python 2, even if you specified --with python3 in the rules-file.

Packaging for specific python versions

If you want better control over the target python versions for the package you should use Pybuild. You can do this by a little change to the rules-file, e.g. a python3-only build using Pybuild:

# main packaging script based on dh7 syntax
%:
	dh $@ --with python3 --buildsystem=pybuild

For pybuild to work it is crucial to add the needed python interpreter(s) besides the mandatory build dependency dh-python to the Build-Depends of the control-file, for python3-only it could look like this:

Build-Depends: debhelper (>=9), dh-python, python3-all
...
Depends: ${python3:Depends}

Without the dh-python build dependency pybuild will silently do nothing. Getting the build dependencies wrong will create incomplete or broken packages. Take extra care of getting this right!
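
Because such failures are silent it pays off to inspect the freshly built package and check that your modules actually ended up inside (the file name below is hypothetical and depends on your package name and version):

dpkg -c ../myproject_0.1-1_all.deb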

Conclusion

Debian packaging looks quite intimidating at first because there are so many ways to build a package. Many different tools can ease package creation but also add confusion. Packaging python software is done easily if you know the quirks. The python examples from the Guide for Debian Maintainers are certainly worth a look!

Self-contained projects in python

An important concept for us is the notion of self-containment. For a project in development this means you find everything you need to develop and run the software directly in the one repository you check out/clone. For practical reasons we usually omit the IDE and the basic runtime like the Java JDK or the Python interpreter. If you have these installed you are good to go in seconds.

What does this mean in general?

Usually this means putting all your dependencies either in source or object form (dll, jar etc.) directly in a directory of your project repository. This mostly rules out dependency managers like maven. Another not as obvious point is to have hardware dependencies mocked out in some way so your software runs without potentially unavailable hardware attached. The same is true for software services somewhere on the net that may be unavailable, like a payment service for example.

How to do it for Python

For Python projects this means not simply installing your dependencies using the linux package manager, system-wide pip or other dependency management tools but using a virtual environment. Virtual environments are isolated Python environments using an available, but defined Python interpreter on the system. They can be created by the tool virtualenv or, since Python 3.3, the included tool venv. You can install your dependencies into this environment e.g. using pip, which itself is part of the virtualenv. Preparing a virtual env for your project can be done using a simple shell script like this:

python2.7 ~/my_project/vendor/virtualenv-15.1.0/virtualenv.py ~/my_project_env
source ~/my_project_env/bin/activate
pip install ~/my_project/vendor/setuptools_scm-1.15.0.tar.gz
pip install ~/my_project/vendor/six-1.10.0.tar.gz
...

Your dependencies including virtualenv (for Python installations < 3.3) are stored in the project's source code repository. We usually call the directory vendor or similar.
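
Updating the vendor directory is manual but simple: on a machine with internet access you can let pip fetch the archives for you, e.g. for the example packages used above (requires a pip version that supports the download command):

pip download --dest ~/my_project/vendor setuptools_scm==1.15.0 six==1.10.0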

As a side note, working with such a virtual env, even remotely, works like a charm in the PyCharm IDE by selecting the Python interpreter of the virtual env. It correctly shows all installed dependencies and all the IDE support for code completion and imports works as expected:

(Screenshot: PyCharm Python interpreter settings)

What you get

With such a setup you gain some advantages missing in many other approaches:

  • No problems if the target machine has no internet access. This would be problematic for classical pip/maven/etc. approaches.
  • Mostly hassle free development and deployment. No more “downloading the internet” feeling or driver/hardware installation issues for the developer. A deployment is in the most simple cases as easy as a copy/rsync.
  • Only minimal requirements to the base installation of developer, build, deployment or other target machines.
  • Perfectly reproducible builds and tests in isolation. Your continuous integration (CI) machine is just another target machine.

What it costs

There are costs of this approach of course but in our experience the benefits outweigh them by a great extent. Nevertheless I want to mention some downsides:

  • Less tool support for managing the dependencies, especially if you are used to maven and friends and happen to like them. Pip can work with local archives just fine but updating is a bit of manual work.
  • Storing (binary) dependencies in your repository increases the checkout size. Nowadays disk space and local network speeds make this mostly irrelevant, especially in combination with git. Shallow clones can further mitigate the problem.
  • You may need to put in some effort for implementing mocks for your hardware or third-party software services and a mechanism for switching between simulation and the real stuff.

Conclusion

We have been using self-containment to great success in varying environments. Usually, both developers and clients are impressed by the ease of development and/or installation using this approach regardless if the project is in Java, C++, Python or something else.