A minimal set of skills for software development contractors

Not sure whether your developer is professional enough? Here are seven topics you can ask him about to find out. Together they form the minimal skill set a modern developer should have.

“Our company specializes in providing professional software development for our customers”. That’s a nice statement to inspire your customers with. The only problem with it is: every contractor claims to be professional. You wouldn’t even get a project if you admitted to being “unprofessional”. But how can a customer, mostly unaware of the subtleties in the field of software development, decide if his contractor really works professionally? A lot of money currently spent on projects doomed from the beginning could be saved if the answer was that easy. But there’s a lower limit of skills that have to be present to pass the most minimal litmus test of developer professionalism. This blog article gives you an overview of the things you should ask of your next software development contractor.

First a disclaimer: I’ve compiled this list of skills with the best intentions. It is definitely possible to develop software without some or even any of these skills. The development can even be performed in a very professional manner. So the absence of a skill doesn’t reveal an unprofessional contractor without fail. And on the other side, the clear presence of all skills doesn’t lead to glorious projects. The list is a rule of thumb to distinguish the “better” contractor from the “worse”. It’s a starting ground for the inexperienced customer to ask the right questions and get hopefully insightful answers.

Let’s assume you are a customer on the lookout for a suitable software development contractor, maybe a freelancer or a company. You might take this list and just ask your potential developer about every item on it. Listen to their answers and let them show you their implementation of the skill. In my opinion, the last point is the most crucial one: Don’t just talk about it, let them demonstrate their abilities. You won’t be able to differentiate the best from the most trivial implementation at first, but that’s part of the learning process. The thing is: if the developer can readily demonstrate something, chances are he really knows what he is talking about.

The minimal skills

The skills are sorted by their direct impact on the overall development quality. This includes the quality perceived by you (the customer), the end user and the next developer who inherits the source code once the original developer bails out. This doesn’t mean that the topics mentioned later are “optional” in the long run.

Source code management system

This tool has many different names: source code management (SCM), revision control system (RCS) and version control system (VCS) are just a few of them. It is used to track the changes in the code over time. With this tool, the developer is able to tell you exactly which change happened when, for what version and by whom. It is even possible to undo the change later on. If your developer mentions specific tool names like Git, Subversion, Perforce or Mercurial, you are mostly settled here. Let him show you a typical sync-edit-commit cycle and try to comprehend what he’s telling you. Most developers love to brag about their sophisticated use of version control abilities.

Issue tracking

An issue or bug tracker is a tool that stores all inquiries, bug reports, wishes and complaints you make. You can compare it to a helpdesk “trouble ticket” system. The issue tracker provides a todo list for the developer and acts as an impartial documentation of your communication with the developer. If you can’t get direct access to the issue tracker on their website, let them demonstrate the usage by playing through a typical scenario like a bug report. At least, the developer should provide you with a list of “resolved” issues for each new version of your software.

Continuous integration

This is a relatively new type of tool, but a very powerful one. It can also be named a “build server” or (less powerful) a “nightly build”. The baseline is that your project will be built by an automated process, as often as possible. In the case of continuous integration, the build happens after each commit to the source code management system (refer to the first entry of this list). Let your developer show you what happens automatically after a commit to the source code management system. Ask him about the “build time” of your project (or other projects). This is the time needed to produce a new version you can try out. If the build time is reasonably low (like a few minutes), ask for a small change to your project and wait for the resulting software.

There is a fair chance that your developer not only talks about “continuous integration”, but also “continuous delivery”. This includes words like “staging”, “build queue”, “test installation”, etc. Great! Let them explain and demonstrate their implementation of “continuous delivery”. You’ll probably be impressed and the developer gets another chance to brag.

Verification (a.k.a. Testing)

This is a delicate question: “Will the source code contain automated tests?”. Our industry’s expected value for any kind of automated tests in a project is still dangerously near absolute zero. If you get blank stares on that question, that’s not a good sign. It doesn’t really matter too much if the answer contains the words “unit test”, “integration test” or even “acceptance test”. Most important again: Let your developer show you their implementation of automated tests in your (or a similar) project. Make sure the continuous integration server (refer to entry number three) is aware of the tests and runs them on every build. This way, everything that’s secured by tests cannot break without being noticed immediately. You probably won’t have to deal with bugs that reappear in every other version, a symptom known as “regression”.

Your developer might be really enthusiastic about testing. While every developer hour costs your precious money, this is money well spent. Think of it as an insurance against unpredictable behaviour of your software in the future. Over the course of development, you won’t notice these tests directly, as they are used internally for development. Talk to your developer about some form of reporting on the tests. Perhaps a “test coverage” report that accompanies the issue list (refer to the second entry)? Just don’t go overboard here. A low test coverage percentage is still better than no tests.

If your developer states that he is “test driven”, that’s not a psychological condition, but a modern attempt to test really thoroughly. Let him demonstrate the advantages of this approach by playing through an implementation cycle of a small change to your project. It may foster your confidence in the insurance’s power.

Project documentation

Every software project above the trivial level contains so many details that no human brain is able to remember them all after some time. Your developer needs some place to store vital information about the project other than “in the code” and “in the issue tracker”. A popular choice to implement this requirement is providing a Wiki. You probably already know a Wiki from Wikipedia. Think about a web-based text editing tool with structuring possibilities. If you can’t access the documentation tool yourself, let your developer demonstrate it. Ask about an excerpt of your project documentation, perhaps as a PDF or HTML document. Don’t be too picky about the aesthetics, the main use case is quick and easy information retrieval. Even handwritten project documentation may pass your test, as long as it is stored in one central place.

Source code conventions

Nearly all source code is readable by a machine. But some source code is totally illegible to fellow developers or even the original author. Ask your developer about their code formatting rules. Hopefully, he can provide you with some written rules that are really applied to the code. For most programming languages, there are tools that can check the formatting against certain rules. These programs are called “code inspection tools” and fit like hand in glove with the continuous integration server (refer to the third entry). Some aspects of source code readability cannot be checked by algorithms, like naming or clarity of concepts. Good developers perform regular code reviews where fellow developers discuss the code critically and suggest improvements. The best customers explicitly ask for code reviews, even if they won’t participate in them. You will feel the difference in the produced software in the long run.

Community awareness

Software development is a rapidly advancing profession, with game-changing discoveries every other year. One single developer cannot track all the new tools, concepts and possibilities in his field. He has to rely on a community of like-minded and well-meaning experts that share their knowledge. Ask your developer about his community. What (technical) books did he read recently? What books are known by the whole development team? As a customer, you probably can’t tell right away if the books are worth their paper, but that’s not the main point of the question. Just like with tests, the number of books read by the average programmer won’t make a very long list. If your development team is consistent enough to share a common literature ground, that’s already worth a lot.

But it’s not just books. Even books are too slow for the advancement! Ask about participation in local technical events, like user groups of the programming language of your project. What about sharing? Does the developer share his experiences and insights? The cheapest way to do that is a weblog (you’re reading one right now). Let him show you his blog. How many articles are published in a reasonable timespan, what’s the feedback? Perhaps he writes articles for a technical magazine or even a book? Now you can ask other developers for their opinion on the published work. You’ve probably found a really professional developer, congratulations.

There is more, much more

This list is in no way exhaustive in regard to what a capable developer uses in concepts, skills and tools. This is meant as the minimal set, with a lot of room for improvement. There are compilations of skills like the Clean Code Developer that go way beyond this list. Ask your developer about his personal field of interest. Hopefully, after he finished bragging and techno-babbling for some time, you’re convinced that your developer is a professional one. You shouldn’t settle for less.

Grails and the query cache

The principle of least astonishment can be violated in unusual places, such as when using the query cache on a Grails domain class.

Look at the following code:

class Node {
  Node parent
  String name
  Tree tree
}

Tree tree = new Tree()
Node root = new Node(name: 'Root', tree: tree)
root.save()
new Node(name: 'Child', parent: root, tree: tree).save()

What happens when I query all nodes by tree?

List allNodesOfTree = Node.findAllByTree(tree, [cache: true])

Of course you get 2 nodes, but what is the result of:

allNodesOfTree.contains(Node.get(rootId))

It should be true, but it isn’t all the time. If you didn’t implement equals and hashCode, you get the default equals from Object, which compares references (like == in Java).
Hibernate guarantees that you get the same instance out of a session for the same domain object (Node.get(rootId) == Node.get(rootId)).

But the query cache plays a crucial role here: it saves only the ids of the result and calls Node.load(id) for each of them. There is an important difference between Node.get and Node.load. Node.get always returns an instance of Node which is a real node, not a proxy. For this it queries the session context and hits the database when necessary. Node.load, on the other hand, never hits the database. It returns a proxy, and only when the session already contains the domain object does it return the real domain object.

So allNodesOfTree contains

  • two proxies when no element is in the session
  • a proxy and a real object when you call Node.get(childId) beforehand
  • two real objects when you call get on both elements first

Deactivating the query cache, either globally or for this query only, returns two real objects.
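
To illustrate why contains depends on equals, here is a minimal sketch in plain Java. It involves no Hibernate or Grails at all; PlainNode, IdNode and ContainsDemo are hypothetical names made up for this illustration. Two distinct instances with the same id stand in for "a proxy and a real object".

```java
import java.util.List;

// Hypothetical stand-in for a domain class without equals/hashCode.
class PlainNode {
    private final long id;
    PlainNode(long id) { this.id = id; }
    long getId() { return id; }
    // Inherits Object.equals, which compares references only.
}

// Hypothetical stand-in for a domain class with id-based equality.
class IdNode {
    private final long id;
    IdNode(long id) { this.id = id; }
    long getId() { return id; }

    @Override
    public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof IdNode)) return false;
        return id == ((IdNode) other).getId();
    }

    @Override
    public int hashCode() { return Long.hashCode(id); }
}

class ContainsDemo {
    // Two different instances for the same id: contains finds no match.
    static boolean plainContains() {
        List<PlainNode> result = List.of(new PlainNode(1));
        return result.contains(new PlainNode(1));
    }

    // Same situation, but equality is defined by id: contains matches.
    static boolean idContains() {
        List<IdNode> result = List.of(new IdNode(1));
        return result.contains(new IdNode(1));
    }

    public static void main(String[] args) {
        System.out.println(plainContains()); // false
        System.out.println(idContains());    // true
    }
}
```

A real equals for Hibernate-managed classes needs more care than this sketch shows, so treat it as an illustration of the mechanism only.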

Testing C programs using GLib

Writing programs in good old C can be quite refreshing if you use some modern utility library like GLib. It offers a comprehensive set of tools you expect from a modern programming environment like collections, logging, plugin support, thread abstractions, string and date utilities, different parsers, i18n and a lot more. One essential part, especially for agile teams, is onboard too: the unit test framework gtest.

Because of the statically compiled nature of C, testing involves a bit more work than in Java or modern scripting environments. Usually you have to perform these steps:

  1. Write a main program for running the tests. Here you initialize the framework, register the test functions and execute the tests. You may want to build different test programs for larger projects.
  2. Add the test executable to your build system, so that you can compile, link and run it automatically.
  3. Execute the gtester test runner to generate the test results and eventually an XML file to use in your continuous integration (CI) infrastructure. You may need to convert the XML output if you are using Jenkins, for example.

A basic test looks quite simple, see the code below:

#include <glib.h>
#include "computations.h"

void computationTest(void)
{
    g_assert_cmpint(1234, ==, compute(1, 1));
}

int main(int argc, char** argv)
{
    g_test_init(&argc, &argv, NULL);
    g_test_add_func("/package_name/unit", computationTest);
    return g_test_run();
}

To run the test and produce the xml-output you simply execute the test runner gtester like so:

gtester build_dir/computation_tests --keep-going -o=testresults.xml

GTester unfortunately produces a result file which is incompatible with Jenkins’ test result reporting. Fortunately R. Tyler Croy has put together an XSL script that you can use to convert the results using

xsltproc -o junit-testresults.xml tools/gtester.xsl testresults.xml

That way you get relatively easy to use unit tests working on your code and some nice CI integration for your modern C language projects.

Update:

Recent versions of gtester run the test binary multiple times if there are failing tests. To get a report of all (passing and failing) tests you may want to use my modified gtester.xsl script.

Testing antipatterns

Some testing antipatterns found in everyday code.

Catch all

try {
  callFailingMethod()
  fail()
} catch (Exception e) {
}

Problems:
When you look at the test code you cannot see which type of exception is thrown. First, it is better for clarity to document which type is thrown, and second, any bugs in the called code that throw unintended exceptions are swallowed here.

Better:

try {
  callFailingMethod()
  fail()
} catch (ParseException e) {
}

Problems:
If it fails, you don’t see why, so always pass a message to fail.

Better:

try {
  callFailingMethod()
  fail('ParseException expected')
} catch (ParseException e) {
}

Problems:
If an exception is thrown, you don’t assert that it is the expected exception, so test for the exception message.

Solution:

try {
  callFailingMethod()
  fail('ParseException expected')
} catch (ParseException e) {
  assertEquals("Invalid character at line 2", e.getMessage())
}
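
The try/fail/catch pattern above can also be wrapped in a small helper so that each test states the expected type once and gets the exception back for further assertions. The following is a sketch in plain Java without any test framework; expectThrows, ThrowingAction and this version of callFailingMethod are hypothetical names made up for the illustration.

```java
import java.text.ParseException;

class ExpectedExceptionDemo {
    // Hypothetical functional interface for code that may throw.
    interface ThrowingAction {
        void run() throws Exception;
    }

    // Runs the action and returns the exception if it has the expected type.
    // Fails with a descriptive message otherwise.
    static <T extends Exception> T expectThrows(Class<T> expectedType, ThrowingAction action) {
        try {
            action.run();
        } catch (Exception e) {
            if (expectedType.isInstance(e)) {
                return expectedType.cast(e);
            }
            throw new AssertionError(
                "Expected " + expectedType.getSimpleName() + " but got " + e, e);
        }
        throw new AssertionError(
            "Expected " + expectedType.getSimpleName() + " but nothing was thrown");
    }

    // Hypothetical method under test, assumed to fail like this.
    static void callFailingMethod() throws ParseException {
        throw new ParseException("Invalid character at line 2", 0);
    }

    public static void main(String[] args) {
        ParseException e =
            expectThrows(ParseException.class, ExpectedExceptionDemo::callFailingMethod);
        if (!"Invalid character at line 2".equals(e.getMessage())) {
            throw new AssertionError("Unexpected message: " + e.getMessage());
        }
        System.out.println("ok");
    }
}
```

Newer JUnit versions ship an assertThrows method that serves the same purpose, so a hand-written helper like this is only needed where that is not available.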

Using assert

assert isOdd(3)

Problems:
If you do not enable assertions on the JVM (by passing -ea) this line does nothing and the test passes fine every time.

Better:

assertTrue(isOdd(3))

Problems:
If assertTrue or assertFalse fails, you just get a generic error message. Better use a message which communicates the error.

Solution:

assertTrue("3 should be odd", isOdd(3))

AssertTrue instead of assertEquals

  assertTrue('Expected: 1+2 = 3', sum(1, 2) == 3)

Problems:
You don’t see the actual value here, you could include it in the message, but there is an assertion for that: assertEquals

Solution:

  assertEquals(3, sum(1, 2))

Conditional logic in tests

if (isOdd(value)) {
  assertEquals(5, calculate(value)) 
} else {
  assertEquals(6, calculate(value)) 
}

Problems:
Can you look at the test source code and tell me which branch is used? If only one is used all the time, erase the other. If both are used, first make the test deterministic and use two tests, one for each branch.
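
A sketch of the split into two deterministic tests could look like this. It is plain Java without a test framework; the calculate implementation and the tiny assertEquals helper are assumptions made up for the illustration, chosen so that odd inputs yield 5 and even inputs yield 6 as in the snippet above.

```java
class ConditionalLogicDemo {
    // Hypothetical production code, assumed for this illustration.
    static int calculate(int value) {
        return (value % 2 != 0) ? 5 : 6;
    }

    // Minimal stand-in for a test framework's assertEquals.
    static void assertEquals(int expected, int actual) {
        if (expected != actual) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // One test per branch, each with a fixed input.
    static void oddValueYieldsFive() {
        assertEquals(5, calculate(3));
    }

    static void evenValueYieldsSix() {
        assertEquals(6, calculate(4));
    }

    public static void main(String[] args) {
        oddValueYieldsFive();
        evenValueYieldsSix();
        System.out.println("both tests passed");
    }
}
```

Each test now names the case it covers, and a failure points directly at the branch that broke.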

Building Windows C++ Projects with CMake and Jenkins

A short and easy way to build Windows C++ software with CMake in Jenkins (with the restriction to support Visual Studio 8).

The C++ programming environment where I feel most comfortable is GCC/Linux (lately with some clang here and there). In terms of build systems I use cmake whenever possible. This environment also makes it easy to use Jenkins as CI server and RPM for deployment and distribution tasks.

So when presented with the task to set up a C++ Windows project in Jenkins, I tried to do it the same way as much as possible.

The Goal:

A Jenkins job should be set up that builds a Windows C++ project on a Windows 7 build slave. For reasons that I will not get into here, compatibility with Visual Studio 8 is required.

The first step was to download and install the correct Windows SDK. This provides all that is needed to build C++ stuff under Windows.

Then, after installation of cmake, the first naive try looked like this (in an “execute Windows Batch file” build step)

cmake . -DCMAKE_BUILD_TYPE=Release

This cannot work of course, because cmake will not find compilers and stuff.

Problem: Build Environment

When I do cmake builds manually, i.e. not in Jenkins, I open the Visual Studio 2005 Command Prompt, which is a normal Windows command shell with all environment variables set. So I tried to do that in Jenkins, too:

call "c:\Program Files\Microsoft SDKs\Windows\v6.0\Bin\SetEnv.Cmd" /Release /x86

cmake . -DCMAKE_BUILD_TYPE=Release

This also did not work and even worse, produced strange (to me, at least) error messages like:

‘Cmd’ is not recognized as an internal or external command, operable program or batch file.

The system cannot find the batch label specified – Set_x86

After some digging, I found the solution: a feature of Windows batch programming called delayed expansion, which has to be enabled for SetEnv.Cmd to work correctly.

Solution: SetEnv.cmd and delayed expansion

setlocal enabledelayedexpansion

call "c:\Program Files\Microsoft SDKs\Windows\v6.0\Bin\SetEnv.Cmd" /Release /x86

cmake . -DCMAKE_BUILD_TYPE=Release

nmake

Yes! With this little trick it worked perfectly. And it feels almost like GCC/CMake under Linux: nice, short and easy.

Better diagnostics in TDD

Automated tests gain more and more popularity in our field and our company. Avoiding being slowed down by tests becomes crucial. Steve Freeman has a nice talk on infoq.com with a lot of advice for maintaining the benefits of automated testing without producing too much drag. One seldom discussed topic, test diagnostics, immediately caught our attention. In short, your aim is to produce messages that are as meaningful as possible for failing tests. This leads to the extended TDD cycle depicted below.

There are several techniques to improve the diagnostics of failing tests. Here is a short list of the most important ones:

  • Using assertion messages to make clear what exactly failed
  • Using “named objects” where you essentially just override the toString()-method of some type in your tests to provide meaning for the checked value
    Date startDate = new Date(1000L) {
        @Override
        public String toString() {
            return "startDate";
        }
    };
    
    
  • Using “tracer objects” by giving names to mocks/collaborators in the test, e.g. in Mockito-syntax:
    EventManager em1 = mock(EventManager.class, "Gavin");
    EventManager em2 = mock(EventManager.class, "Frank");
    // do something with them
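
To see what a named object buys you, the following sketch builds the kind of message an assertion failure would show, once with named dates and once with plain ones. It is plain Java without a test framework; failureMessage and named are hypothetical helpers made up for this illustration, and the exact wording of a real framework's message may differ.

```java
import java.util.Date;

class NamedObjectDemo {
    // Mimics the shape of a typical assertion failure message.
    static String failureMessage(Object expected, Object actual) {
        return "expected:<" + expected + "> but was:<" + actual + ">";
    }

    // Creates a Date whose toString() returns a meaningful name.
    static Date named(long millis, String name) {
        return new Date(millis) {
            @Override
            public String toString() {
                return name;
            }
        };
    }

    public static void main(String[] args) {
        Date startDate = named(1000L, "startDate");
        Date endDate = named(2000L, "endDate");

        // With named objects the message explains itself:
        // expected:<startDate> but was:<endDate>
        System.out.println(failureMessage(startDate, endDate));

        // Without names you get two raw timestamps that differ by one second
        // (the exact text depends on the time zone).
        System.out.println(failureMessage(new Date(1000L), new Date(2000L)));
    }
}
```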
    

Conclusion

By applying the extended TDD-cycle you can drastically reduce guessing of what went wrong and find regressions much faster without using debug messages or the debugger itself.

Why do (different) programming languages matter?

One common saying in software development is: use the best tool for the job. But what is the best tool? I think the best tool is determined by two things: how it fits the problem domain and how it fits your mental model.

Why your mental model? “Just use the best language available!” you might think. But as humans we think in languages, and even inside these languages everybody has a typical way of expressing himself. People even coin their own words, and if those become common we have a name for the result: a dialect. But is that all you should consider when choosing a programming language? Certainly there are the tools of the trade: the IDE, debugger, profiler, etc. Here it comes down to personal preferences, and most of the shortcomings in this field are short term: better tool support is on the way.
There’s another more important aspect though: the community and therefore the mindset which is brought along. The communities form how the languages are used, where the most libraries and frameworks are developed, which problem domains are tackled and what the values are. Values can be testing, elegance, simplicity, robustness, …
Since communities consist of individuals, individuals form what the values are. But I think the language designer lays a foundation here: take Ruby for example. Ruby was designed with the intention to make programming fun. This is one of the things that appeals to many developers and the whole community which uses Ruby. Ruby is fun.
These environments spawn amazing things like Rails or more recently RubyMotion. These fruits exist because of the mindset of the community and the foundation inside the language. Last but not least, another reason to choose a language is your familiarity with it. You might choose an inferior tool or language because you know it inside out.

Basic Image Processing Tasks with OpenCV

2D detectors and scientific CCD cameras produce many megabytes of image data. The open source library OpenCV is highly recommended as your work horse for all kinds of image processing tasks.

For one of our customers in the scientific domain we do a lot of integration of pieces of hardware into the existing measurement- and control network. A good part of these are 2D detectors and scientific CCD cameras, which have all sorts of interfaces like ethernet, firewire and frame grabber cards. Our task is then to write some glue software that makes the camera available and controllable for the scientists.

One standard requirement for us is to do some basic image processing and analytics. Typically, this entails flipping the image horizontally and/or vertically, rotating the image by some multiple of 90 degrees, and calculating some statistics like standard deviation.

The starting point there is always some image data in memory that has been acquired from the camera. Most of the time the image data is either gray values (8, or 16 bit), or RGB(A).

As we are generally not falling victim to the NIH syndrome, we use open source image processing libraries. The first one we tried was CImg, which is a header-only (!) C++ library for image processing. The header-only part is very cool and handy, since you just have to #include <CImg.h> and you are done. No further dependencies. The immediate downside, of course, is long compile times. We are talking about > 40000 lines of C++ template code!

The bigger issue we had with CImg was that for multi-channel images the memory layout is planar, like this: R1R2R3R4…..G1G2G3G4….B1B2B3B4. And since the images from the camera usually come interleaved like R1G1B1R2G2B2…, we always had to do tricks to use CImg on these images correctly. These tricks killed us eventually in terms of performance, since some of these 2D detectors produce lots of megabytes of image data that have to be processed in real time.
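
To make the two layouts concrete, here is a small sketch of a conversion from interleaved to planar sample order. It is written in plain Java for illustration only and is not CImg's API; ChannelLayout and interleavedToPlanar are hypothetical names. The point is that every single sample has to be copied to a new position, which adds up quickly for large images.

```java
import java.util.Arrays;

class ChannelLayout {
    // Converts interleaved samples (R1 G1 B1 R2 G2 B2 ...) into planar
    // order (R1 R2 ... G1 G2 ... B1 B2 ...).
    static byte[] interleavedToPlanar(byte[] interleaved, int channels) {
        if (channels <= 0 || interleaved.length % channels != 0) {
            throw new IllegalArgumentException("length must be a multiple of the channel count");
        }
        int pixels = interleaved.length / channels;
        byte[] planar = new byte[interleaved.length];
        for (int p = 0; p < pixels; p++) {
            for (int c = 0; c < channels; c++) {
                // sample c of pixel p moves into plane c at position p
                planar[c * pixels + p] = interleaved[p * channels + c];
            }
        }
        return planar;
    }

    public static void main(String[] args) {
        // Two RGB pixels: (R1=1, G1=2, B1=3) and (R2=4, G2=5, B2=6)
        byte[] interleaved = {1, 2, 3, 4, 5, 6};
        System.out.println(Arrays.toString(interleavedToPlanar(interleaved, 3)));
        // prints [1, 4, 2, 5, 3, 6]
    }
}
```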

So OpenCV. Their headline was already very promising:

OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision.

Especially the words “real time” look good in there. But let’s see.

Image data in OpenCV is represented by instances of class cv::Mat, which is, of course, short for Matrix. From the documentation:

The class Mat represents an n-dimensional dense numerical single-channel or multi-channel array. It can be used to store real or complex-valued vectors and matrices, grayscale or color images, voxel volumes, vector fields, point clouds, tensors, histograms.

Our standard requirements stated above can then be implemented like this (gray scale, 8 bit image):

#include <cmath>
#include <cstdint>
#include <opencv2/opencv.hpp>

void processGrayScale8bitImage(uint16_t width, uint16_t height,
                               const double& rotationAngle,
                               uint8_t* pixelData)
{
  // create cv::Mat instance
  // pixel data is not copied!
  cv::Mat img(height, width, CV_8UC1, pixelData);

  // flip vertically
  // third parameter of cv::flip is the so-called flip-code
  // flip-code == 0 means vertical flipping
  cv::Mat verticallyFlippedImg(height, width, CV_8UC1);
  cv::flip(img, verticallyFlippedImg, 0);

  // flip horizontally
  // flip-code > 0 means horizontal flipping
  cv::Mat horizontallyFlippedImg(height, width, CV_8UC1);
  cv::flip(img, horizontallyFlippedImg, 1);

  // rotation (a bit trickier)
  // 1. calculate center point
  cv::Point2f center(img.cols/2.0F, img.rows/2.0F);
  // 2. create rotation matrix
  cv::Mat rotationMatrix =
    cv::getRotationMatrix2D(center, rotationAngle, 1.0);
  // 3. determine the size of the rotated image.
  // The % operator is not defined for double, so round to an
  // integral number of quarter turns first.
  const long quarterTurns = std::lround(rotationAngle / 90.0);
  cv::Size rotatedSize = img.size();
  if (quarterTurns % 2 != 0) {
    // switch width and height for rotations like 90, 270 degrees
    rotatedSize = cv::Size(img.rows, img.cols);
  }
  // shift the result so that it is centered in the new frame
  // (only matters for non-square images with switched dimensions)
  rotationMatrix.at<double>(0, 2) += (rotatedSize.width - img.cols) / 2.0;
  rotationMatrix.at<double>(1, 2) += (rotatedSize.height - img.rows) / 2.0;
  // 4. actual rotation
  cv::Mat rotatedImg;
  cv::warpAffine(img, rotatedImg, rotationMatrix, rotatedSize);

  // save the rotated image into a TIFF file
  cv::imwrite("myimage.tiff", rotatedImg);
}

The cool thing is that almost the same code can be used for our other image types, too. The only difference is the image type for the cv::Mat constructor:


8-bit gray scale:  CV_8UC1
16-bit gray scale: CV_16UC1
RGB:               CV_8UC3
RGBA:              CV_8UC4

Additionally, the whole thing is blazingly fast! All performance problems gone. Yay!

Getting basic statistical values is also a breeze:

void calculateStatistics(const cv::Mat& img)
{
  // minimum, maximum, sum
  double min = 0.0;
  double max = 0.0;
  cv::minMaxLoc(img, &min, &max);
  double sum = cv::sum(img)[0];

  // mean and standard deviation
  cv::Scalar cvMean;
  cv::Scalar cvStddev;
  cv::meanStdDev(img, cvMean, cvStddev);
}
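
For cross-checking such results on a handful of pixel values, the same quantities can be computed by hand. The following plain Java sketch does that; ImageStatistics and summarize are hypothetical names, and it assumes that the standard deviation is the population variant (dividing by N, not N - 1), which is how I read the OpenCV documentation of cv::meanStdDev.

```java
import java.util.Arrays;

class ImageStatistics {
    // Returns {min, max, sum, mean, stddev} of the given pixel values.
    static double[] summarize(int[] pixels) {
        if (pixels.length == 0) {
            throw new IllegalArgumentException("no pixels");
        }
        double min = pixels[0];
        double max = pixels[0];
        double sum = 0.0;
        for (int p : pixels) {
            min = Math.min(min, p);
            max = Math.max(max, p);
            sum += p;
        }
        double mean = sum / pixels.length;

        double squaredDeviations = 0.0;
        for (int p : pixels) {
            double d = p - mean;
            squaredDeviations += d * d;
        }
        // population standard deviation: divide by N (assumption, see above)
        double stddev = Math.sqrt(squaredDeviations / pixels.length);

        return new double[] {min, max, sum, mean, stddev};
    }

    public static void main(String[] args) {
        int[] pixels = {2, 4, 4, 4, 5, 5, 7, 9};
        System.out.println(Arrays.toString(summarize(pixels)));
        // prints [2.0, 9.0, 40.0, 5.0, 2.0]
    }
}
```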

All in all, the OpenCV experience was very positive, so far. They even support CMake. Highly recommended!

Your own CI-based RPM build farm, part 3

In my previous post we learned how to build RPM packages of your software for multiple versions of your target distribution(s). Now I want to present a way of automating the build process and building packages on/for all target platforms. You should have a look at the openSUSE build service to see if it already fits your needs. Then you can stop reading here :-).

We needed better control over the platforms and the process, so we set up a build farm based on the Jenkins continuous integration (CI) server ourselves. The big picture consists of the following components:

  • build slaves allowing a jenkins user to do unattended builds of the packages
  • Jenkins continuous integration server using matrix builds with build slaves for each target platform
  • build script orchestrating the build of all our self-maintained packages
  • jenkins job to deploy the packages to our RPM repository

Preparing the build slaves

Standard installations of openSUSE need some minor tweaks so they can be used as Jenkins build slaves doing unattended RPM package builds. Here are the changes we needed to make it work properly:

  1. Add a user account for the builds, e.g. useradd -m -d /home/jenkins jenkins and set up a password with passwd jenkins.
  2. Change sshd configuration to allow password authentication and restart sshd.
  3. We will link the SOURCES and SPECS directories of /usr/src/packages to the working copy of our repository, so we need to delete the existing directories: rm -r /usr/src/packages/SPECS /usr/src/packages/SOURCES /usr/src/packages/RPMS /usr/src/packages/SRPMS.
  4. Allow non-privileged users to work with /usr/src/packages with chmod -R o+rwx /usr/src/packages.
  5. Copy the ssh key for our git repository to the build account as ~/.ssh/id_rsa
  6. Test ssh access on the slave as our build user with ssh -v git@repository. With this step we confirm the host authenticity one time so that future public key ssh interactions work unattended!
  7. Configure git identity on the slave with git config --global user.name "jenkins@build###-$$"; git config --global user.email "jenkins@buildfarm.myorg.net".
  8. Add privileges for the build user needed for our build process in /etc/sudoers: jenkins ALL = (root) NOPASSWD:/usr/bin/zypper,/bin/rpm

Configuring the build slaves

Linux build slaves over ssh are quite easily configured using Jenkins’ web interface. We add labels denoting the distribution release and architecture so that we can easily set up our matrix builds. Then we set up our matrix build as a new job with the usual parameters for source code management (in our case git) etc.

Our configuration matrix has the two axes Architecture and OpenSuseRelease and uses the labels of the build slaves. Our only build step here is calling the script orchestrating the build of our rpm packages.

Putting together the build script

Our build script essentially sets up a clean environment and builds package after package, installing build prerequisites if needed. We use small utility functions (functions.sh) for building a package, installing packages from the repository, installing freshly built packages and removing installed RPMs. The script contains roughly the following phases:

  1. Figure out some quirks about the environment, e.g. openSUSE release number or architecture to build.
  2. Clean the environment by removing previously installed self-built packages.
  3. Set up the build environment, e.g. by linking folders from /usr/src/packages to our working copy or installing compilers, headers and the like.
  4. Build the packages and install them locally if they are a dependency of packages yet to be built.

Here is a shortened example of our build script:

#!/bin/bash

RPM_BUILD_ROOT=/usr/src/packages
if [ "i686" = "$(uname -m)" ]
then
  ARCH=i586
else
  ARCH=$(uname -m)
fi
# take the VERSION line of /etc/SuSE-release and strip blanks and dots,
# e.g. "VERSION = 12.1" becomes 121
SUSE_RELEASE=$(sed -n 's/^VERSION *= *//p' /etc/SuSE-release | tr -d '[:blank:].')

source functions.sh

# setup build environment
ensureDirectoryLinks
# force a repository refresh without checking the signature
sudo zypper -n --no-gpg-checks refresh -f OUR_REPO
# remove previously built and installed packages
removeRPM libomniORB4.1
removeRPM omniNotify2
# install needed tools
installFromRepo c++-compiler
if [ $SUSE_RELEASE -lt 121 ]
then
  installFromRepo java-1_6_0-sun-devel
else
  installFromRepo jdk
fi
installFromRepo log4j
buildRPM omniORB
installRPM $ARCH/libomniORB4.1
installRPM $ARCH/omniORB-devel
installRPM $ARCH/omniORB-servers
buildAndInstallRPM omniNotify2 $ARCH

Deploying our packages via Jenkins

We set up a second Jenkins job to deploy successfully built RPM packages to our internal repository. We use the Copy Artifacts plugin to fetch the RPMs from our build job and put them into a directory like all_rpms. Then we add a build step to execute a script like this:

for i in suse-12.1 suse-11.4 suse-11.3
do
  rm -rf $i
  mkdir -p $i
  versionlabel=`echo $i | sed 's/[-\.]//g'`
  cp -r "all_rpms/Architecture=32bit,OpenSuseRelease=$versionlabel/RPMS" $i
  cp -r "all_rpms/Architecture=64bit,OpenSuseRelease=$versionlabel/RPMS" $i
  cp -r "all_rpms/Architecture=64bit,OpenSuseRelease=$versionlabel/SRPMS" $i
  rsync -e "ssh" -avz $i/* root@rpmrepository.intranet:/srv/www/htdocs/OUR_REPO/$i/
  ssh root@rpmrepository.intranet "createrepo /srv/www/htdocs/OUR_REPO/$i/RPMS"
done

Summary

With a setup like this we can perform an automatic build of all our RPM packages on several target platforms every time we update one of the packages. After a successful build we can deploy our new packages to our RPM repository, making them available to our whole organisation. There is an initial amount of work to be done, but the rewards are easy, unattended package updates with deployment just one button click away.

Game of Life: TDD style in Java

I have always had problems finding the right track with test driven development (TDD), and going down the wrong track can get you stuck.
So here I document my experience with tdd-ing Conway’s Game of Life in Java.

Since the rules are simple, the most important part of a Game of Life implementation is the data structure that stores the living cells.
So using TDD we should start with it.
One feature of our cells should be that they are equal if their coordinates are equal:

@Test
public void positionsShouldBeEqualByValue() {
  assertEquals(at(0, 1), at(0, 1));
}

The JDK features a class holding two coordinates: java.awt.Point, so we can use it here:

public class Board {
  public static Point at(int x, int y) {
    return new Point(x, y);
  }
}

You could create your own Position or Cell class and implement equals/hashCode accordingly, but I want to keep things simple so we stick with Point.
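For comparison, here is a minimal sketch of what such a hand-written class might look like. The name Position and its fields are purely illustrative and are not used anywhere else in this article; the point is only that equals and hashCode must both be derived from the coordinates so that the class works inside a HashSet.

```java
import java.util.Objects;

// Hypothetical alternative to java.awt.Point: an immutable value class
// whose equality is defined purely by its coordinates.
final class Position {
  final int x;
  final int y;

  Position(int x, int y) {
    this.x = x;
    this.y = y;
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof Position)) return false;
    Position other = (Position) o;
    return x == other.x && y == other.y;
  }

  // Must agree with equals: equal positions produce equal hash codes.
  @Override
  public int hashCode() {
    return Objects.hash(x, y);
  }

  @Override
  public String toString() {
    return "(" + x + ", " + y + ")";
  }
}
```

Unlike Point, this variant is immutable, which avoids surprises when a cell is changed after it has been put into a hash-based collection.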
A board should hold the living cells, and we need to compare two boards according to their living cells:

@Test
public void boardShouldBeEqualByCells() {
  assertEquals(new Board(at(0, 1)), new Board(at(0, 1)));
}

Since we are only interested in living cells (all other cells are considered dead) we store only the living cells inside the board:

public class Board {
  private final Set<Point> alives;

  public Board(Point... points) {
    alives = new HashSet<Point>(Arrays.asList(points));
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;

    Board board = (Board) o;

    if (alives != null ? !alives.equals(board.alives) : board.alives != null) return false;

    return true;
  }

  @Override
  public int hashCode() {
    return alives != null ? alives.hashCode() : 0;
  }
}

If you take a look at the rules you see that you need to have a way to count the neighbours of a cell:

@Test
public void neighbourCountShouldBeZeroWithoutNeighbours() {
  assertEquals(0, new Board(at(0, 1)).neighbours(at(0, 1)));
}

Easy:

public int neighbours(Point p) {
  return 0;
}

Neighbours can be vertically adjacent:

@Test
public void neighbourCountShouldCountVerticalOnes() {
  assertEquals(1, new Board(at(0, 0), at(0, 1)).neighbours(at(0, 1)));
}

public int neighbours(Point p) {
  int count = 0;
  for (int yDelta = -1; yDelta <= 1; yDelta++) {
    if (alives.contains(at(p.x, p.y + yDelta))) {
      count++;
    }
  }
  return count;
}

Hmm, now both neighbour tests break. Oh, we forgot to exclude the cell itself from the count.
First the test…

@Test
public void neighbourCountShouldNotCountItself() {
  assertEquals(0, new Board(at(0, 0)).neighbours(at(0, 0)));
}

Then the fix:

public int neighbours(Point p) {
  int count = 0;
  for (int yDelta = -1; yDelta <= 1; yDelta++) {
    if (!(yDelta == 0) && alives.contains(at(p.x, p.y + yDelta))) {
      count++;
    }
  }
  return count;
}

And the horizontally adjacent ones:

@Test
public void neighbourCountShouldCountHorizontalOnes() {
  assertEquals(1, new Board(at(0, 1), at(1, 1)).neighbours(at(0, 1)));
}

public int neighbours(Point p) {
  int count = 0;
  for (int yDelta = -1; yDelta <= 1; yDelta++) {
    for (int xDelta = -1; xDelta <= 1; xDelta++) {
      if (!(xDelta == 0 && yDelta == 0) && alives.contains(at(p.x + xDelta, p.y + yDelta))) {
        count++;
      }
    }
  }
  return count;
}

And the diagonal ones are also included in our implementation:

@Test
public void neighbourCountShouldCountDiagonalOnes() {
  assertEquals(2, new Board(at(-1, 1), at(1, 0), at(0, 1)).neighbours(at(0, 1)));
}

So we set the stage for the rules. Rule 1: Cells with one neighbour should die:

@Test
public void cellWithOnlyOneNeighbourShouldDie() {
  assertEquals(new Board(), new Board(at(0, 0), at(0, 1)).next());
}

A simple implementation looks like this:

public Board next() {
  return new Board();
}

OK, on to Rule 2: A living cell with 2 neighbours should stay alive:

@Test
public void livingCellWithTwoNeighboursShouldStayAlive() {
  assertEquals(new Board(at(0, 0)), new Board(at(-1, -1), at(0, 0), at(1, 1)).next());
}

Now we need to iterate over each living cell and count its neighbours:

public class Board {
  public Board(Point... points) {
    this(new HashSet<Point>(Arrays.asList(points)));
  }

  private Board(Set<Point> points) {
    alives = points;
  }

  public Board next() {
    Set<Point> aliveInNext = new HashSet<Point>();
    for (Point cell : alives) {
      if (neighbours(cell) == 2) {
        aliveInNext.add(cell);
      }
    }
    return new Board(aliveInNext);
  }
}

In this step we added a convenience constructor that takes a set instead of individual cells.
The last rule: a cell with 3 neighbours should be born or stay alive. We test it with the pattern known as the blinker and name the test after it:

@Test
public void blinker() {
  assertEquals(new Board(at(-1, 1), at(0, 1), at(1, 1)), new Board(at(0, 0), at(0, 1), at(0, 2)).next());
}

For this we need to look at all the neighbours of the living cells:

public Board next() {
  Set<Point> aliveInNext = new HashSet<Point>();
  for (Point cell : alives) {
    for (int yDelta = -1; yDelta <= 1; yDelta++) {
      for (int xDelta = -1; xDelta <= 1; xDelta++) {
        Point testingCell = at(cell.x + xDelta, cell.y + yDelta);
        if (neighbours(testingCell) == 2 || neighbours(testingCell) == 3) {
          aliveInNext.add(testingCell);
        }
      }
    }
  }
  return new Board(aliveInNext);
}

Now our previous test breaks. Why? Well, the second rule says: a *living* cell with 2 neighbours should stay alive. A dead cell with 2 neighbours must stay dead:

public Board next() {
  Set<Point> aliveInNext = new HashSet<Point>();
  for (Point cell : alives) {
    for (int yDelta = -1; yDelta <= 1; yDelta++) {
      for (int xDelta = -1; xDelta <= 1; xDelta++) {
        Point testingCell = at(cell.x + xDelta, cell.y + yDelta);
        if ((alives.contains(testingCell) && neighbours(testingCell) == 2) || neighbours(testingCell) == 3) {
          aliveInNext.add(testingCell);
        }
      }
    }
  }
  return new Board(aliveInNext);
}
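
The condition inside the loop now encodes the complete rule set. As a small sketch, it could be isolated into a pure function that is easy to test on its own. The class name Rules and the method name aliveInNextGeneration are made up for this illustration and are not part of the code above.

```java
// Hypothetical helper that isolates the Game of Life rules from the board.
final class Rules {
  private Rules() {
    // utility class, not meant to be instantiated
  }

  // A cell is alive in the next generation if it has exactly 3 living
  // neighbours, or if it is alive now and has exactly 2 living neighbours.
  static boolean aliveInNextGeneration(boolean alive, int neighbours) {
    return neighbours == 3 || (alive && neighbours == 2);
  }
}
```

With such a function the loop body would only have to ask whether the tested cell survives or is born, given alives.contains(testingCell) and neighbours(testingCell).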

Done!
Now we can refactor and make the code cleaner, for example by removing the duplicated logic for iterating over the neighbours or by adding methods like toString for output and better failure messages in tests.
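
One possible result of such a refactoring is sketched below. It is only an illustration, not the definitive version: the helper neighbourhood and the output format of toString are my own assumptions, and the class is kept package-private so the sketch fits into a single file.

```java
import java.awt.Point;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class Board {
  private final Set<Point> alives;

  Board(Point... points) {
    this(new HashSet<Point>(Arrays.asList(points)));
  }

  private Board(Set<Point> points) {
    alives = points;
  }

  static Point at(int x, int y) {
    return new Point(x, y);
  }

  // Hypothetical helper: the eight cells around p, excluding p itself.
  // Both neighbours() and next() use it, so the nested loops exist only once.
  private static List<Point> neighbourhood(Point p) {
    List<Point> result = new ArrayList<Point>();
    for (int yDelta = -1; yDelta <= 1; yDelta++) {
      for (int xDelta = -1; xDelta <= 1; xDelta++) {
        if (xDelta != 0 || yDelta != 0) {
          result.add(at(p.x + xDelta, p.y + yDelta));
        }
      }
    }
    return result;
  }

  int neighbours(Point p) {
    int count = 0;
    for (Point candidate : neighbourhood(p)) {
      if (alives.contains(candidate)) count++;
    }
    return count;
  }

  Board next() {
    // Only living cells and their direct neighbours can be alive next.
    Set<Point> candidates = new HashSet<Point>(alives);
    for (Point cell : alives) candidates.addAll(neighbourhood(cell));

    Set<Point> aliveInNext = new HashSet<Point>();
    for (Point cell : candidates) {
      int count = neighbours(cell);
      if (count == 3 || (count == 2 && alives.contains(cell))) {
        aliveInNext.add(cell);
      }
    }
    return new Board(aliveInNext);
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (o == null || getClass() != o.getClass()) return false;
    return alives.equals(((Board) o).alives);
  }

  @Override
  public int hashCode() {
    return alives.hashCode();
  }

  // Cells are sorted as strings so the output does not depend on the
  // iteration order of the HashSet.
  @Override
  public String toString() {
    List<String> cells = new ArrayList<String>();
    for (Point p : alives) cells.add("(" + p.x + "," + p.y + ")");
    Collections.sort(cells);
    return "Board" + cells;
  }
}
```

With a toString like this, a failing assertEquals shows the expected and the actual cells instead of two object identities, which makes a broken rule much easier to spot.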