Zero Interaction Tools

Some time ago, a customer called us for a delicate task: to develop a little tool on a very tight budget that aggregates pictures in a specific way. The pictures were from the medical domain and came in sets of thousands of pictures that needed to be combined into one large picture with the help of some mathematics. Developing the algorithm was not a problem, and neither were the huge data sizes, but the budget was a challenge. We could program everything and test it on some sample picture sets (that arrived on several Blu-ray discs) within the budget, but an elaborate graphical user interface (GUI) would be out of scope. On the other hand, the anticipated users of the tool weren’t computer-savvy enough to handle a command line interface (CLI). The tool needed to be simple, fast and cheap. And it only needed to do one thing.

Traditional usage of software tools

In the traditional way, a software tool comes with an installer that entrenches the tool onto the target computer and provides a start menu entry, a desktop icon and perhaps even a boot launcher with a fancy tray icon and some balloon tooltips that inform the user from time to time that the tool is still installed and wants some attention. If you click on the tool’s icon, a graphical user interface appears, perhaps in the form of an application window or just a tray pop-up menu. You then need to interact with the user interface. Let’s say you want to combine several thousand pictures into one: you need to specify directories or collections of files through some file dialogs with “browse” buttons and complex “ingredients” lists. You’ve probably seen this type of user interface while burning a Blu-ray disc or uploading files into the cloud. After you’ve selected all your input pictures, you have to say where to write to (another file dialog) and what name your target file should have. After that, you’ll stare at a progress bar and wait for it to reach the right-hand side of the widget. And then, the tool will beep and proudly present a little message box that informs you that everything has worked out just fine and you can find your result right where you wanted it. Afterwards, the tool will sit there on your screen in anticipation of your next move. What will you do? Do it all again because you love the progress bars and beeps? Combine another several thousand pictures? Shut down the tool? Oh, come on! Are you sure you want to quit now?

None of this could be developed within the budget our customer gave us. The tool didn’t need the self-marketing aspects like a tray icon or launcher, because the customer would only use it internally. But even the rest of the user interface was too much work: the future users would not get a traditional software tool.

Zero interaction tool

So we thought about the minimal user interface that the picture aggregation tool needed to have, and came to the conclusion that no user interface was needed, because it really only needed to do one thing. At least, if certain assumptions hold true:

  • The tool is fast enough to produce no significant delay
  • The input directory holds all pictures that should be aggregated
  • The input directory can be the output directory as well
  • The name of the resulting picture file can contain a timestamp to distinguish between several tool runs

We consulted our customer and checked that the latter three assumptions were valid. So, given we can make the first assumption a reality, the tool could work without any form of user interaction by being copied into the picture directory and then started.

Yes, you’ve read this right. The tool would not be installed, but copied for every usage. This is a highly unusual usage scenario for a program, because it means that every picture directory that should be aggregated holds an identical copy of the program. But if we can make some more assumptions valid, it is a viable way to empower the users:

  • The tool must run on all target machines without additional preparation
  • The tool must only consist of one executable file, no DLLs or configuration files
  • The tool must be small in size, like one megabyte at most

We confirmed with a quick and dirty spike (an embarrassingly inchoate prototype) that we could produce a program that conforms to all three new assumptions/requirements. The only remaining problem was the very first assumption: no hard drive was fast enough to provide the pixel data of thousands of pictures in less than a second. Even if we could aggregate the pixels fast enough (given enough cores, this would be possible), we couldn’t get hold of them fast enough. We needed some kind of progress bar.

Use your information channels

We thought about the information channels our tool would have towards the user. Let’s repeat the scenario: The user navigates to the directory containing the pictures that should be aggregated, copies the executable program into it and double-clicks to start the tool. There are many possibilities to inform the user about progress:

  • Audio (Sound): We can play a little tune or some sound that changes frequency to indicate progress. This is highly unusual and we can’t be sure that the speakers aren’t muted (usage on a notebook was part of the domain analysis results). No sounds, that is.
  • Animation (Graphics): In the most boring case, this would be a little window with a progress bar that runs from left to right and disappears when the work is done. Possible, but boring. Perhaps we can think of something more in tune with the rest of the usage scenario.
  • Text: Well, this was the idea. We produce a result file and give it a name. We don’t need to keep the name static as long as we are working and things change inside the file anyway. We just update the file name to reflect our progress.

So our tool creates a new result file in the picture directory that is named result_0_percent or something and runs up to result_100_percent and then gets renamed to result_timestamp with the current timestamp. You can just watch your file explorer to keep up with the tool’s completion. This is a bit unusual at first, but all pilot users grasped the concept immediately and were pleased with it.
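The renaming scheme can be sketched in a few lines of shell. This is only an illustration under assumptions, not the original tool: the temporary directory, the 25-percent steps and the timestamp format are all made up for the example, and the actual picture aggregation is left out.

```shell
# Hypothetical sketch: report progress by renaming the result file.
workdir=$(mktemp -d)                 # stands in for the picture directory
current="$workdir/result_0_percent"
: > "$current"                       # create the (still empty) result file

for percent in 25 50 75 100; do
    # ... a real tool would aggregate the next batch of pictures here ...
    next="$workdir/result_${percent}_percent"
    mv "$current" "$next"            # the new file name is the progress report
    current="$next"
done

# Final rename: the percentage gives way to a timestamp.
final="$workdir/result_$(date +%Y%m%d_%H%M%S)"
mv "$current" "$final"
ls "$workdir"
```

Watching the directory in a file explorer while such a script runs would show a single file changing its name until the timestamp appears.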

The result

And this is the story of how we developed a highly specialized tool within a very small budget without any graphical or otherwise traditional user interface. The user brings the tool to the data (by copying it into the same directory) and lets it perform its work by simply starting it. The tool reports its progress back via the result file name. As soon as the result file name contains a timestamp (and the notebook’s fans cease to go berserk), the user can pass the file to the next tool in the tool chain, probably a picture viewer or a printer driver. The users loved the tool for its speed and simplicity.

One funny side note remains to be told: thousands of pictures aggregated into one produce a picture with a lot of detail. The result file was not too big (about 20-30 megabytes), but it could take out any printer for several hours if printed. The tool got informally renamed to “printer-reaper.exe”.

My favorite Unix tool

Awk is a little language designed for the processing of lines of text. It is available on every Unix (since Version 7) or Linux system. The name is formed from the initials of its creators: Aho, Weinberger and Kernighan.

Ever since I spent a couple of minutes learning awk, I have found it quite useful in my daily work. It is my favorite tool in the base set of Unix tools due to its simplicity and versatility.

Typical use cases for awk scripts are log file analysis and the processing of character-separated value (CSV) formats. Awk allows you to easily filter, transform and aggregate lines of text.

The idea of awk is very simple. An awk script consists of a number of patterns, each associated with a block of code that gets executed for an input line if the pattern matches:

pattern_1 {
    # code to execute if pattern matches line
}

pattern_2 {
    # code to execute if pattern matches line
}

# ...

pattern_n {
    # code to execute if pattern matches line
}

Patterns and blocks

The patterns are usually regular expressions:

/error|warning/ {
    # executed for each line, which contains
    # the word "error" or "warning"
}

/^Exception/ {
    # executed for each line starting
    # with "Exception"
}

There are some special patterns, namely the empty pattern, which matches every line …

{
    # executed for every line
}

… and the BEGIN and END patterns. Their blocks are executed before and after the processing of the input, respectively:

BEGIN {
    # executed before any input is processed,
    # often used to initialize variables
}

END {
    # executed after all input has been processed,
    # often used to output an aggregation of
    # collected values or a summary
}
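As a rough illustration of how these block types work together, the following sketch counts the lines of a small log that mention a problem. The log lines and the variable names are invented for this example.

```shell
# Hypothetical log lines, used only for this illustration.
log='info: service started
error: disk full
warning: low memory
info: service stopped'

summary=$(printf '%s\n' "$log" | awk '
BEGIN           { count = 0 }                       # runs once, before any input
/error|warning/ { count++ }                         # runs for each matching line
END             { print "problem lines: " count }   # runs once, after all input
')
echo "$summary"
```

With the four sample lines above, this prints "problem lines: 2".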

Output and variables

The most common operation within a block is the print statement. The following awk script outputs each line containing the string “error”:

/error/ { print }

This is basically the functionality of the Unix grep command, which is filtering. It gets more interesting with variables. Awk provides a couple of useful built-in variables. Here are some of them:

  • $0 represents the entire current line
  • $1 … $n represent the first through n-th fields of the current line
  • NF holds the number of fields in the current line
  • NR holds the number of the current line (“record”)

By default awk interprets whitespace sequences (spaces and tabs) as field separators. However, this can be changed by setting the FS variable (“field separator”).

The following script outputs the second field for each line:

{ print $2 }

Input:

John 32 male
Jane 45 female
Richard 73 male

Output:

32
45
73

And this script calculates the sum and the average of the second fields:

{
    sum += $2
}

END {
    print "sum: " sum ", average: " sum/NR
}

Output:

sum: 150, average: 50
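The built-in variables and the FS setting described above can be tried out with a comma-separated variant of the same data. This is a minimal sketch; the CSV content and the shell variable names are assumptions made for the example.

```shell
# The sample data from above, rewritten with commas as separators.
csv='John,32,male
Jane,45,female
Richard,73,male'

fields=$(printf '%s\n' "$csv" | awk '
BEGIN { FS = "," }                        # split fields on commas instead of whitespace
{ print NR ": " $1 " (" NF " fields)" }   # record number, first field, field count
')
echo "$fields"
```

Output:

1: John (3 fields)
2: Jane (3 fields)
3: Richard (3 fields)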

The language

The language that can be used within a block of code is based on C syntax without types and is very similar to JavaScript. All the familiar control structures like if/else, for, while, do and operators like =, ==, >, &&, ||, ++, +=, … are there.

Semicolons at the end of statements are optional, like in JavaScript. Comments start with a #, not with //.

Variables do not have to be declared before usage (no ‘var’ or type). You can simply assign a value to a variable and it comes into existence.

String concatenation does not have an explicit operator like “+”. Strings and variables are concatenated by placing them next to each other:

"Hello " name ", how are you?"
# This is wrong: "Hello" + name + ", how are you?"

print is a statement, not a function. Parentheses around its parameter list are optional.
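The points above can be seen together in one small script. It is only a sketch: the age limit of 40 and the row of stars are arbitrary choices for the illustration.

```shell
people='John 32 male
Jane 45 female
Richard 73 male'

report=$(printf '%s\n' "$people" | awk '
{
    # if/else works as in C; group comes into existence on first assignment
    if ($2 >= 40) {
        group = "40 or older"
    } else {
        group = "under 40"
    }
    # placing strings and variables next to each other concatenates them
    print $1 " is " group
}
END {
    # a C-style for loop that builds up a string
    for (i = 1; i <= 3; i++) {
        stars = stars "*"
    }
    print(stars)    # the parentheses around the print arguments are optional
}
')
echo "$report"
```

Output:

John is under 40
Jane is 40 or older
Richard is 40 or older
***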

Functions

Awk provides a small set of built-in functions. Some of them are:

length(string), substr(string, index, count), index(string, substring), tolower(string), toupper(string), match(string, regexp).
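A quick way to get a feeling for these functions is to call them from a BEGIN block, which needs no input at all. The sample string is arbitrary.

```shell
builtins=$(awk '
BEGIN {
    s = "hello world"
    print length(s)            # number of characters
    print substr(s, 7, 5)      # 5 characters starting at position 7 (1-based)
    print index(s, "world")    # position of the substring, 0 if absent
    print toupper(s)           # upper-case copy of the string
    print match(s, /wor/)      # position where the regexp matches, 0 if none
}
')
echo "$builtins"
```

Output:

11
world
7
HELLO WORLD
7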

User-defined functions look like JavaScript functions:

function min(number1, number2) {
    if (number1 < number2) {
        return number1
    }
    return number2
}

In fact, JavaScript adopted the function keyword from awk. User-defined functions are defined at the top level of the script, outside of pattern blocks.
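As a sketch of how such a function might be used, the following script applies min to the sample data from earlier to find the lowest age. The variable name lowest and the output text are made up for the example.

```shell
people='John 32 male
Jane 45 female
Richard 73 male'

youngest=$(printf '%s\n' "$people" | awk '
# The function sits at the top level, next to the pattern blocks.
function min(number1, number2) {
    if (number1 < number2) {
        return number1
    }
    return number2
}

{
    if (NR == 1) {
        lowest = $2                 # start with the age on the first line
    } else {
        lowest = min(lowest, $2)    # keep the smaller of the two ages
    }
}

END { print "youngest age: " lowest }
')
echo "$youngest"
```

For the three sample lines, this prints "youngest age: 32".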

Command-line invocation

An awk script can either be read from a script file with the -f option:

$ awk -f myscript.awk data.txt

… or it can be supplied in-line within single quotes:

$ awk '{sum+=$2} END {print "sum: " sum " avg: " sum/NR}' data.txt

Conclusion

I hope this short introduction helped you add awk to your toolbox if you weren’t familiar with awk yet. Awk is a neat alternative to full-blown scripting languages like Python and Perl for simple text processing tasks.