What is it with Software Development and all the clues to manage things?

As someone who started programming a long time ago (roughly 20 years, now that I think about it), but only in recent years entered the world of real software development, the mastery of day-to-day-challenges happens to consist of two main topics: First, inour rapidly evolving field we never run out of new technologies to learn, and then, there’s a certain engineering aspect underlying, how to do things in a certain manner, with lots of input every year.

So after I recently shared some of such ideas with my friends — I indeed still have a few ones of those), I wondered: How is it, that in the modern software development world, most of the information about managing things actually comes from the field itself, and rather feeding back its ideas of project management, quality, etc. into the non-software-subspaces of the world? (Ideas like the Agile movement, Software Craftmanship, the calls of doing things Lean and Clean, nowadays prospered so much that you see their application or modification in several other industries. Like advertisement, just as an example.)

I see a certain kind of brain food in this question. What tells software development apart from other current fields, so that there is a broad discussion and considerable input at its base level? After all, if you plan on becoming someone who builds houses, makes cars, or manages cities, you wouldn‘t engage in such a vivid culture of „how“ to do things, rather focussing on the „what“.

Of course, I might be mistaken in this view. But, by asking: what actually tells software development apart from these other fields of producing something, I see a certain kind of brain food, helpful for approaching every day tasks and valuing better tips over worse ones.

So, what can that be?

1. Quite peculiar is the low entry threshold in being able to call yourself a “programmer”. With the lots of resources you get at relatively little cost (assumption: you have a computer with a working internet connection), you have a lot of channels by which you can learn the „what“ of software development first, and saving the „how“ for later. If you plan on building a house, there‘s not a bazillion of books, tutorials, and videos, after all.

2. Similarly, there‘s the rather low cost of failure when drafting a quick hobby project. Not always will a piece of code that you write in your free time tell yourself „hey man, you ever thought about some better kind of architecture?“ – which is, why bad habits can stick and even feel right. If you choose the „wrong“ mindset, you don‘t always lose heaps of money, and neither do you, if you switch your strategy once in a while, you also don‘t automatically. (you probably will, though, if you are too careless in this process).

3. Furthermore, there‘s the dynamic extension of how your project is going to be used („Scope Creep“). One would build a skyscraper in a different way than a bungalow (I‘m not an expert, though), but with software, it often feels like adding a simple feature here, extending the scope there, unless you hit a point where all its interdependencies are in a complex state of conflict…

4. Then, it‘s a matter of transparency: If you sit in a badly designed car, it becomes rather obvious when it always exhausts clouds of black smoke. Or your house always smells like scents of fresh toilet. Of course, a well designed piece of software will come with a great user experience, but as you can see in many commercial products, there also is quite some presence of low-than-average-but-still-somehow-doing-what-it-should software. Probably users are more tolerant with software than with cars?

5. Also, as in most technical fields, it is not the case that „pure consultants“ are widely received in a positive light. For most nerds, you don‘t get a lot of credibility if you talk about best practices without having got your hands dirty over a longer period of time. Ergo, it needs some experienced software programmers in order to advise less experienced software programmers… but surely, it‘s questionable whether this is a good thing.

6. After all, the requirements for someone who develops a project might be very different in each field. From my academical past in computational Physics, I know that there is quite some demand for „quick & dirty“ solutions. Need to add some Dark Matter in your model here? Well, plug this formula in and check the results. Not every user has the budget or liberty in creating a solid structure of your program. If you want to have a new laboratory building, of course, you very well want it it do be designed as good as it can get.

All in all, these observations somehow boil down to the question, whether software development is to be seen more like a set of various engineering skills, rather like a handcraft, an art, or a complex program of study. It is the question, whether the “crack” in this field is the one who does complex arithmetics in its head, or the one who just gets what the customer wants. I like thinking about such peculiar modes of thought, as they help me in understanding what kinds of things I should learn next.

Or is there something completely else to it?

Compiling Agda 2.6.2 on Fedora 32

Agda is a dependently typed functional programming language that I like very much. Its latest versions have some special features which are not supported by any more well known languages, for example higher inductive types. Since I want to use all the special features of Agda, I regularly compile the latest version. This is a procedure which comes with a few surprises more often than not, so this post is about saving you the time it took me to figure out what to do.

I like to compile Agda using the Haskell tool Stack, which can be installed with

curl -sSL https://get.haskellstack.org/ | sh

The sources of Agda have to be checked out with all submodules – otherwise there will be some weird “cabal” (another Haskell tool, more basic than Stack) errors which I was not able to understand. This can be done with:

git clone --recurse-submodules https://github.com/agda/agda.git

Now if you go to the Agda folder

cd agda

There should be files called “stack-8.8.4.yaml” and similar. Those files can can tell Stack how to build Agda. The number is the version of the Haskell compiler which is used to build Agda. I usually use the latest, you do not have to figure out which (if any) is installed on your system, since Stack will just download the appropriate compiler for you.

However, just running stack failed for my on Fedora 32 due to some linking problem in the end. It turned out, that “libtinfo.so” was not found by the linker. “libtinfo.so.6” is available on Fedora 32, so adding a link to it fixed the problem:

sudo ln -s /usr/lib64/libtinfo.so.6 /usr/lib64/libtinfo.so

Now, you can tell Stack to get all necessary things and compile and install Agda with:

stack install --stack-yaml stack-8.8.4.yaml

This also installs a binary into “~/.local/bin” which is in PATH on Fedora 32 by default, so you should be able to call agda from the command line. Also, you can use “agda-mode” to configure emacs for agda.

Be precise, round twice

Recently after implementing a new feature in a software that outputs lots of floating point numbers, I realized that the last digits were off by one for about one in a hundred numbers. As you might suspect at this point, the culprit was floating point arithmetic. This post is about a solution, that turned out to surprisingly easy.

The code I was working on loads a couple of thousands numbers from a database, stores all the numbers as doubles, does some calculations with them and outputs some results rounded half-up to two decimal places. The new feature I had to implement involved adding constants to those numbers. For one value, 0.315, the constant in one of my test cases was 0.80. The original output was “0.32” and I expected to see “1.12” as the new rounded result, but what I saw instead was “1.11”.

What happened?

After the fact, nothing too surprising – I just hit decimals which do not have a finite representation as a binary floating point number. Let me explain, if you are not familiar with this phenomenon: 1/3 happens to be a fraction which does not have a finte representation as a decimal:

1/3=0.333333333333…

If a fraction has a finite representation or not, depends not only on the fraction, but also on the base of your numbersystem. And so it happens, that some innocent looking decimal like 0.8=4/5 has the following representation with base 2:

4/5=0.1100110011001100… (base 2)

So if you represent 4/5 as a double, it will turn out to be slightly less. In my example, both numbers, 0.315 and 0.8 do not have a finite binary representation and with those errors, their sum turns out to be slightly less than 1.115 which yields “1.11” after rounding. On a very rough count, in my case, this problem appeared for about one in a hundred numbers in the output.

What now?

The customer decided that the problem should be fixed, if it appears too often and it does not take to much time to fix it. When I started to think about some automated way to count the mistakes, I began to realize, that I actually have all the information I need to compute the correct output – I just had to round twice. Once say, at the fourth decimal place and a second time to the required second decimal place:

(new BigDecimal(0.8d+0.315d))
    .setScale(4, RoundingMode.HALF_UP)
    .setScale(2, RoundingMode.HALF_UP)

Which produces the desired result “1.12”.

If doubles are used, the errors explained above can only make a difference of about 10^{-15}, so as long as we just add a double to a number with a short decimal representation while staying in the same order of magnitude, we can reproduce the precise numbers from doubles by setting the scale (which amounts to rounding) of our double as a BigDecimal.

But of course, this can go wrong, if we use numbers, that do not have a short neat decimal representation like 0.315. In my case, I was lucky. First, I knew that all the input numbers have a precision of three decimal places. There are some calculations to be done with those numbers. But: All numbers are roughly in the same order of magnitude and there is only comparing, sorting, filtering and the only honest calculation is taking arithmetic means. And the latter only means I had to increase the scale from 4 to 8 to never see any error again.

So, this solution might look a bit sketchy, but in the end it solves the problem with the limited time budget, since the only change happens in the output function. And it can also be a valid first step of a migration to numbers with managed precision.

Math development practices

As a mathematician that recently switched to almost full-time software developing, I often compare the two fields. During the last years of my mathematics career I was in the rather unique position of doing both at once – developing software of some sort and research in pure mathematics. This is due to a quite new mathematical discipline called Homotopy Type Theory, which uses a different foundation as the mathematics you might have learned at a university. While it has been a possibility for quite some time to check formalized mathematics using computers, the usual way to do this entails a crazy amount of work if you want to use it for recent mathematics. By some lucky coincidences this was different for my area of work and I was able to write down my math research notes in the functional language Agda and have them checked for correctness.

As a disclaimer, I should mention that what I mean with “math” in this post, is very far from applied mathematics and very little of the kind of math I talk about is implement in computer algebra systems. So this post is about looking at pure, abstract math, as if it were a software project. Of course, this comparison is a bit off from the start, since there is no compiler for the math written in articles, but it is a common believe, that it should always be possible to translate correct math to a common foundation like the Zermelo Fraenkel Set Theory and that’s at least something we can type check with software (e.g. isabelle).

Refactoring is not well supported in math

In mathematics, you want to refactor what you write from time to time pretty much the same way and for the same reasons as you would while developing software. The problem is, you do not have tools which tell you immediately if your change introduces bugs, like automated tests and compilers checking your types.

Most of the time, this does not cause problems, judging from my experiences with refactoring software, most of the time a refactoring breaks something detected by a test or the compiler, it is just about adjusting some details. And, in fact, I would conjecture that almost all math articles have exactly those kind of errors – which is no problem at all, since the mathematicians reading those articles can fix them or won’t even notice.

As with refactoring in software development, what does matter are the rare cases where it is crucial that some easy to overlook details need to match exactly. And this is a real issue in math – sometimes a statements gets reformulated while proving it and the changes are so subtle that you do not even realize you have to check if what you prove still matches your original problem. The lack of tools that help you to catch those bugs is something that could really help math – but it has to be formalized to have tools like that and that’s not feasible so far for most math.

Retrospectively, being able to refactor my math research was the biggest advantage of having fully formal research notes in Agda. There is no powerful IDE like they are used in mainstream software development, just a good emacs-mode. But being able to make a change and check afterwards if things still compile, was already enough to enable me to do things I would not have done in pen and paper math.

Not being able to refactor might also be the root cause of other problems in math. One wich would be really horrible for a software project, is that sometimes important articles do not contain working versions of the theorems used in some field of study and you essentially need to find some expert in the field to tell you things like that. So in software project, that would mean, you have to find someone who allegedly made the code base run some time in the past by applying lots of patches which are not in the repository and wich he hopefully is still able to find.

The point I wanted to make so far is: In some respects, this comparison looks pretty bad for math and it becomes surprising that it works in spite of these deficiencies. So the remainder of this post is about the things on the upside, that make math check out almost all the time.

Math spent person-centuries on designing its datatypes

This might be exaggerated, but it is probably not that far off. When I started studying math, one of my lecturers said “inventing good definitions is not less important as proving new results”. Today, I could not agree more, immense work went into the definitions in pure math and they allowed me to solve problems I would be too dumb to even think about otherwise. One analogue in programming is finding the ‘correct’ datatypes, which, if achieved can make your algorithms a lot easier. Another analogue is using good libraries.

Math certainly reaps a great benefit from its well-thought-through definitions, but I must also admit, that the comparison is pretty unfair, since pure mathematicians usually take the freedom to chose nice things to reason about. But this is a point to consider when analysing why math still works, even if some of its practices should doom a software project.

I chose to speak about ‘datatypes’ instead of, say ‘interfaces’, since I think that mathematics does not make that much use of polymorhpism like I learned it in school around 2000. Instead, I think, in this respect mathematical practice is more in line with a data-oriented approach (as we saw last week here on this blog), in math, if you want your X to behave like a Y, you usually give a map, that turns your X into a Y, and then you use Y.

All code is reviewed

Obviously like everywhere in science, there is a peer-review processs if you want to publish an article. But there are actually more instances of things that can be called a review of your math research. Possibly surprising to outsiders, mathematicians talk a lot about their ideas to each other and these kind of talks can be even closer to code reviews than the actual peer reviews. This might also be comparable to pair programming. Also, these review processes are used to determine success in math. Or, more to the point, your math only counts if you managed to communicate your ideas successfully and convince your audience that they work.

So having the same processes in software development would mean that you have to explain your code to your customer, which would be a software developer as yourself, and he would pay you for every convincing implementaion idea. While there is a lot of nonsense in that thought, please note that in a world like that, you cannot get payed for a working 300-line block code function that nobody understands. On the other hand, you could get payed for understanding the problem your software is supposed to solve even if your code fails to compile. And in total, the interesting things here for me is, that this shift in incentives and emphasis on practices that force you to understand your code by communicating it to others can save a very large project with some quite bad circumstances.

Getting started with exact arithmetic and F#

In this blog post, I claimed that some exact arithmetic beyond rational numbers can be implemented on a computer. Today I want to show you how that might be done by showing you the beginning of my implementation. I chose F# for the task, since I have been waiting for an opportunity to check it out anyway. So this post is a more practical (first) follow up on the more theoretic one linked above with some of my F# developing experiences on the side.

F# turned out to be mostly pleasant to use, the only annoying thing that happened to me along the way was some weirdness of F# or of the otherwise very helpful IDE Rider: F# seems to need a compilation order of the source code files and I only found out by acts of desperation that this order is supposed to be controlled by drag & drop:

The code I want to (partially) explain is available on github:

https://github.com/felixwellen/ExactArithmetic

I will link to the current commit, when I discuss specifc sections below.

Prerequesite: Rational numbers and Polynomials

As explained in the ‘theory post’, polynomials will be the basic ingredient to cook more exact numbers from the rationals. The rationals themselves can be built from ‘BigInteger’s (source). The basic arithmetic operations follow the rules commonly tought in schools (here is addition):

static member (+) (l: Rational, r: Rational) =
    Rational(l.up * r.down + r.up * l.down,
             l.down * r.down)

‘up’ and ‘down’ are ‘BigInteger’s representing the nominator and denominator of the rational number. ‘-‘, ‘*’ and ‘/’ are defined in the same style and extended to polynomials with rational coefficients (source).

There are two things important for this post, that polynomials have and rationals do not have: Degrees and remainders. The degree of a polynomial is just the number of its coefficients minus one, unless it is constant zero. The zero-polynomial has degree -1 in my code, but that specific value is not too important – it just needs to be smaller than all the other degrees.

Remainders are a bit more work to calculate. For two polynomials P and Q where Q is not zero, there is always a unique polynomial R that has a smaller degree such that:

P = Q * D + R

For some polynomial D (the algorithm is here).

Numberfields and examples

The ingredients are put together in the type ‘NumberField’ which is the name used in algebra, so it is precisely what is described here. Yet it is far from obvious that this is the ‘same’ things as in my example code.

One source of confusion of this approach to exact arithmetic is that we do not know which solution of a polynomial equation we are using. In the example with the square root, the solutions only differ in the sign, but things can get more complicated. This ambiguity is also the reason that you will not find a function in my code, that approximates the elements of a numberfield by a decimal number. In order to do that, we would have to choose a particular solution first.

Now, in the form of unit tests unit tests, we can look at a very basic example of a number field: The one from the theory-post containing a solution of the equation X²=2:

let TwoAsPolynomial = Polynomial([|Rational(2,1)|])
let ModulusForSquareRootOfTwo = 
     Polynomial.Power(Polynomial.X,2) - TwoAsPolynomial
let E = NumberField(ModulusForSquareRootOfTwo)   
let TwoAsNumberFieldElement = NumberFieldElement(E, TwoAsPolynomial)

[<Fact>]
let ``the abstract solution is a solution of the given equation``() =
    let e = E.Solution in  (* e is a solution of the equation 'X^2-2=0' *)
    Assert.Equal(E.Zero, e * e - TwoAsNumberFieldElement)

There are applications of these numbers which have no obvious relation to square roots. For example, there are numberfields containing roots of unity, which would allow us to calculate with rotations in the plane by rational fraction of a full rotation. This might be the topic of a follow up post…

Some strings are more equal before your Oracle database

When working with customer code based on ADO.net, I was surprised by the following error message:

The german message just tells us that some UpdateCommand had an effect on “0” instead of the expected “1” rows of a DataTable. This happened on writing some changes to a table using an OracleDataAdapter. What really surprised me at this point was that there certainly was no other thread writing to the database during my update attempt. Even more confusing was, that my method of changing DataTables and using the OracleDataAdapter to write changes had worked pretty well so far.

In this case, the title “DBConcurrencyExceptionturned out to be quite misleading. The text message was absolutely correct, though.

The explanation

The UpdateCommand is a prepared statement generated by the OracleDataAdapter. It may be used to write the changes a DataTable keeps track of to a database. To update a row, the UpdateCommand identifies the row with a WHERE-clause that matches all original values of the row and writes the updates to the row. So if we have a table with two rows, a primary id and a number, the update statement would essentially look like this:

UPDATE EXAMPLE_TABLE
  SET ROW_ID =:current_ROW_ID, 
      NUMBER_COLUMN =:current_NUMBER_COLUMN
WHERE
      ROW_ID =:old_ROW_ID 
  AND NUMBER_COLUMN =:old_NUMBER_COLUMN

In my case, the problem turned out to be caused by string-valued columns and was due to some oracle-weirdness that was already discussed on this blog (https://schneide.blog/2010/07/12/an-oracle-story-null-empty-or-what/): On writing, empty strings (more precisely: empty VARCHAR2s) are transformed to a DBNull. Note however, that the following are not equivalent:

WHERE TEXT_COLUMN = ''
WHERE TEXT_COLUMN is null

The first will just never match… (at least with Oracle 11g). So saying that null and empty strings are the same would not be an accurate description.

The WHERE-clause of the generated UpdateCommands look more complicated for (nullable) columns of type VARCHAR2. But instead of trying to understand the generated code, I just guessed that the problem was a bug or inconsistency in the OracleDataAdapter that caused the exception. And in fact, it turned out that the problem occured whenever I tried to write an empty string to a column that was DBNull before. Which would explain the message of the DBConcurrencyException, since the DataTable thinks there is a difference between empty strings and DBNulls but due to the conversion there will be no difference when the corrensponding row is updated. So once understood, the problem was easily fixed by transforming all empty strings to null prior to invoking the UpdateCommand.

Dragging DataGridRows in WPF

The Windows Presentation Foundation (WPF) is a framework for graphical user interfaces. It has a powerful component called DataGrid, which is pretty useful for letting the user interact with data loaded from a database:

The design is flexible and the cells of this grid can be filled with other elements like ComboBoxes, Buttons or Images in a straight forward way. Here is an example picture with a ComboBox:

This post is about handling Drag & Drop for the rows of a DataGrid with occasional ComboBox-columns. The approach presented here will not work if all the columns are ComboBoxes, since the rows will be draggable only in all non-ComboBox-columns.

The problem I had with the drag and drop solutions I found on the web was that they prevented the ComboBoxes from reacting correctly to mouse events. So the limited but quite simple solution I settled for lets the cells handle the mouse events.

Accourding to the documentation the first thing you have to do is:

Identify the start of a drag event

As mentioned above among the various choices which event of which element should be used to decide that a drag started, the MouseMove of the text-cells of my DataGrid turned out to be the best choice by far:

In the so called code-behind, the obvious move is to say, that if the user has the left button pressed while moving the mouse, she wants to start a drag:

And the DataRowView that is being dragged may be found with the following method:

Dropping

The “Drop” part is about as straight forward as I expected:

And the handler can extract the row we gave above from the event arguments:

Compiling and using Agda through the Windows Linux Subsystem

The Windows Subsystem for Linux (WSL) more or less runs a linux kernel on Windows 10. In this post, I will describe how to use WSL to compile and run Agda, a dependently typed functional programming language. Compiling agda yourself makes sense if you want to use the latest features, of which there are quite nice ones. The approach presented here is just my preferred way of compiling and using Agda on a Linux system with some minor adjustments.

Prerequisits for compiling

First, you need the “ubuntu app”, you can install it following this guide. Essentially, you just have to activate WSL and install the app through the Microsoft Store, but following the guide step by step allowed me to do it without creating a Microsoft account.
After installing your ubuntu app will ask you to create a new account and it will probably need some updating, which you can do by running:

sudo apt-get update

A usability hint: You can copy-paste to the ubuntu app by copying with CTRL-C and right-clicking into the ubuntu-window. You have to make the ubuntu-window active before the right-click. You can copy stuff in the ubuntu-window by marking and pressing CTRL-C.

There are two tools that can get the dependencies and compile Agda for you, “cabal” and “stack”. I prefer to use stack:

https://docs.haskellstack.org/en/stable/README/

After installing, stack asked me to append something to my PATH, which I did only for the session:

export PATH=$PATH:/home/USER/.local/bin

Getting the sources and compiling

Git is preinstalled, so you can just get the agda sources with:

git clone https://github.com/agda/agda.git

Go into the agda folder. It will contain a couple of files with names like

stack-8.8.1.yaml

These are configuration files for stack. The numbers indicate the version of ghc (the haskell compiler) that will be used. I take always the newest version (if everything works with it…) and make a copy

cp stack-8.8.1.yaml stack.yaml

– since stack will just use the “stack.yaml” for configuration when run in a folder. Now:

stack setup

will download ghc binaries and install them somewhere below “HOME/.stack/”. Before building, we have to install a dependency (otherwise there will be a linker error with the current ubuntu app):

sudo apt install libtinfo-dev

Then tell stack to build and hope for the best (that took around 5.2GB of RAM and half an hour on my system…):

stack build

On success, it should look like this:
agda_folders

If you are not confident with finding the locations from the last lines again, you should secure the path from the last lines. We will need “agda” and “agda-mode”, which are in the same folder.

Using Agda

Of course, you can use agda from the command line, but it is a lot more fun to use from emacs (or, possibly atom, which I have not checked out). Since the ubuntu app does not come with a window system and on the other hand our freshly built agda cannot be invoked easily from windows programs, I found it most convenient to run emacs in the ubuntu app, but use an x-server in windows.

For the latter, you can install Xming and start it. Then install emacs in the ubuntu app:

sudo apt install emacs

Before starting emacs, we should install the “agda-mode” for emacs. This can be done by

./stack/install/[LONG PATH FROM ABOVE]/bin/agda-mode setup

Now run emacs with the variables “DISPLAY” set to something which connects it to Xming and “PATH” appended by the long thing from above, so emacs can find agda (and agda-mode):

PATH=$PATH:~/agda/.stack-work/[LONG PATH FROM ABOVE]/bin/ DISPLAY=:0 emacs

Then everything should work and you can test the agda-mode, for example with a new file containing the following:

agda-mode-test

CTRL-C, CTRL-L tells agda-mode to check and highlight the contents of the file.  Here is more on using the agda-mode. Have fun!

Sources:

Resources should follow responsibilities

In the german military forces, there is a new idea coming into effect: Give your commanders the ability to spend free expenses (linked article is in german language). Like, if your battailon lacks sunglasses, you don’t have to wait for bureaucracy to procure them for you (and it probably takes longer than the sun is your immediate problem), you can go out and just buy them.

This is not a new idea, and not a bad one. It implements a simple principle: Resources follow responsibilities. If you have goals to reach, decisions to make and people to manage, you need proper resources. And by resources, I don’t only mean money. Some leniency in procedures, maneuvering space (both real and figuratively) and time are resources that can’t be bought with money, but are essential sometimes.

At our company, we installed this principle over a decade ago. The “creativity budget” is a budget of free expenses for each employee to improve their particular working situation. This might mean a new computer mouse, a conference visit or a specific software. You, the employee, are at the frontline of your work and probably knows best what’s needed. Our creativity budget is the means to obtain it, no questions asked.

And this shows the underlying core principle: Responsibilities follow trust. If I trust you to reach the goals, to make the right decisions and to manage your team, it would be inconsequential to not give you the proper responsibilities. And, transitively, to provide you with adequate resources. As it seems, responsibilities are the middle man between trust and resources.

At our company, you don’t need to invest your free expenses for basic work attire. We are software engineers, so “work attire” means a high-class computer (with several monitors, currently our default is three), a powerful notebook, a decent smartphone and all the non-technical stuff that will determine your long-term work output, like a fitting desk and your personal, comfortable work chair.

For me, it was always consequential that great results can only come from the combination of a great developer and great equipment. I cannot understand how it is expected from developers to produce top-notch software on mediocre or even subpar computers and tools. In my opinion, these things strongly relate with each other. Give your developers good equipment and good results will follow. Putting it in a simplistic formula: the ability of the developer multiplied with the power of the equipment makes the quality of the software result.

So, if I trust your ability as a developer, provide you with premium equipment, give you room to maneuver and resources to cover your individual requirements, there should be nothing in the way to hold you back. And that places another responsibility onto your shoulder: You are responsible for your work results. And you deserve all the praise for the better ones. Because without you, the able developer, all the prerequisites listed above would still yield to nothing.

Gradle projects as Debian packages

Gradle is a great tool for setting up and building your Java projects. If you want to deliver them for Ubuntu or other debian-based distributions you should consider building .deb packages. Because of the quite steep learning curve of debian packaging I want to show you a step-by-step guide to get you up to speed.

Prerequisites

You have a project that can be built by gradle using gradle wrapper. In addition you have a debian-based system where you can install and use the packaging utilities used to create the package metadata and the final packages.

To prepare the debian system you have to install some packages:

sudo apt install dh-make debhelper javahelper

Generating packaging infrastructure

First we have to generate all the files necessary to build full fledged debian packages. Fortunately, there is a tool for that called dh_make. To correctly prefill the maintainer name and e-mail address we have to set 2 environment variables. Of course, you could change them later…

export DEBFULLNAME="John Doe"
export DEBEMAIL="john.doe@company.net"
cd $project_root
dh_make --native -p $project_name-$version

Choose “indep binary” (“i”) as type of package because Java is architecture indendepent. This will generate the debian directory containing all the files for creating .deb packages. You can safely ignore all of the files ending with .ex as they are examples features like manpage-generation, additional scripts pre- and post-installation and many other aspects.

We will concentrate on only two files that will allow us to build a nice basic package of our software:

  1. control
  2. rules

Adding metadata for our Java project

In the control file fill all the properties if relevant for your project. They will help your users understand what the package contains and whom to contact in case of problems. You should add the JRE to depends, e.g.:

Depends: openjdk-8-jre, ${misc:Depends}

If you have other dependencies that can be resolved by packages of the distribution add them there, too.

Define the rules for building our Java project

The most important file is the rules makefile which defines how our project is built and what the resulting package contents consist of. For this to work with gradle we use the javahelper dh_make extension and override some targets to tune the results. Key in all this is that the directory debian/$project_name/ contains a directory structure with all our files we want to install on the target machine. In our example we will put everything into the directory /opt/my_project.

#!/usr/bin/make -f
# -*- makefile -*-

# Uncomment this to turn on verbose mode.
#export DH_VERBOSE=1

%:
	dh $@ --with javahelper # use the javahelper extension

override_dh_auto_build:
	export GRADLE_USER_HOME="`pwd`/gradle"; \
	export GRADLE_OPTS="-Dorg.gradle.daemon=false -Xmx512m"; \
	./gradlew assemble; \
	./gradlew test

override_dh_auto_install:
	dh_auto_install
# here we can install additional files like an upstart configuration
	export UPSTART_TARGET_DIR=debian/my_project/etc/init/; \
	mkdir -p $${UPSTART_TARGET_DIR}; \
	install -m 644 debian/my_project.conf $${UPSTART_TARGET_DIR};

# additional install target of javahelper
override_jh_installlibs:
	LIB_DIR="debian/my_project/opt/my_project/lib"; \
	mkdir -p $${LIB_DIR}; \
	install lib/*.jar $${LIB_DIR}; \
	install build/libs/*.jar $${LIB_DIR};
	BIN_DIR="debian/my_project/opt/my_project/bin"; \
	mkdir -p $${BIN_DIR}; \
	install build/scripts/my_project_start_script.sh $${BIN_DIR}; \

Most of the above should be self-explanatory. Here some things that cost me some time and I found noteworthy:

  • Newer Gradle version use a lot memory and try to start a daemon which does not help you on your build slaves (if using a continous integration system)
  • The rules file is in GNU make syntax and executes each command separately. So you have to make sure everything is on “one line” if you want to access environment variables for example. This is achieved by \ as continuation character.
  • You have to escape the $ to use shell variables.

Summary

Debian packaging can be daunting at first but using and understanding the tools you can build new packages of your projects in a few minutes. I hope this guide helps you to find a starting point for your gradle-based projects.