Software distribution using your own RPM-packages and repositories, part 1

Distributing and deploying your software in a Linux environment should be done through the packaging system of the distribution(s) in use. That way your users can use a uniform way of installing, updating and removing all the software on their machine. But where and how do you start?

Some of our clients use the RPM-based openSUSE distribution, so I want to cover our approach to packaging and providing RPMs.

An RPM-package is built from

  • an archive containing the buildable, vanilla sources
  • a SPEC-file describing the packaged software, the steps required to build the software and a changelog
  • optional patches needed to build and package the software for the target platform

The heart of the build process is the SPEC-file, which rpmbuild uses to actually build and package the software. In addition to the metadata and dependencies, it structures the process into several steps:

  1. preparing the source by unpacking and patching
  2. building the project, e.g. using configure and make
  3. installing the software into a defined directory
  4. packaging the installed files into the RPM
  5. cleanup after packaging

After creating the SPEC-file (see my template.spec), the package can be built with the rpmbuild tool. If everything goes right, you will have your binary RPM-package after issuing the rpmbuild -bb SPECS/my-specfile.spec command. This RPM-package can already be used for distribution and installation on systems with the same distribution release as the build system. Extra care may be needed to make the package (or even the SPEC-file) work on different releases or distributions.

You will need an RPM-repository to distribute the packages so that standard system tools like yast2 or zypper can use and manage them, including updates and dependency resolution. There are three types of RPM-repositories:

  1. plain cache sources
  2. repomd/rpm-md/YUM sources
  3. YaST sources

As option 2 “YUM sources” gives you the most bang for the buck, we will briefly explain how to set up such a repository. Effectively, it only consists of the same directory structure as /usr/src/packages/RPMS, served by a webserver (like apache), plus an index file. To create and update the repository, we simply perform the following steps:

  1. create the repository root directory on the webserver, e.g. mkdir -p /srv/www/htdocs/our_repo/RPMS
  2. copy our RPMS folder to the webserver using rsync or scp: scp -r /usr/src/packages/RPMS/* root@webserver:/srv/www/htdocs/our_repo/RPMS/
  3. create the repository index file using the createrepo-tool: ssh root@webserver "createrepo /srv/www/htdocs/our_repo/RPMS"

Now you can add the repository to your system using the URL http://webserver/our_repo/RPMS and use the familiar tools for installing and managing the software on your system.

In the next part I want to give additional advice and cover some pitfalls I encountered setting the whole thing up and packaging different software packages using different build systems.

In part 3 we will set up a Jenkins build farm for building packages for different openSUSE releases on build slaves.

Different view on Apache Maven

Many people see Apache Maven as a build and dependency management tool. I see its strengths in other areas. Recently we had an in-house discussion about Maven and I want to present my views here:

Pros

  • Maven standardizes your project layout and thus lowers the entry barrier for other developers.
  • Maven provides an IDE/tool-agnostic way of describing a project and the infrastructure to work with it. You get things like build and launch targets for free, depending on the archetype.
  • Archetypes (templates) for new projects make getting up to speed faster and easier.
  • Integration in many tools like continuous integration servers or IDEs is very good, so not much configuration work has to be done to get your project under test and supervision of analysis tools.
  • Ready-to-use plugins for many tasks.
  • Usable software distribution model helping in distributed environments.

Cons

  • Big, ugly XML specification of the project (Maven 2; I still need to check out the Groovy and Scala DSLs for POMs).
  • Lacking documentation in some areas, e.g. certain plugins and best practices.
  • Once in a while you need to cope with the “downloading the internet” effect and auto-magic.
  • Does not really solve dependency problems the way many people expect it to.

So while you can certainly implement all wanted features of Maven with other build and scripting systems and set up nice self-contained projects, using Maven can help you depending on your scenario. You have to know the strengths and weaknesses of your tools and decide accordingly. My experience is that you can get a basic project up and running in a healthy state very fast with Maven. As the project grows, the complexity will too and may outweigh the initial benefits. All tools require that you understand and use them well or they will stand in your way more and more. In particular, using Maven only makes sense if you adopt its style and conventions. If you strongly disagree with them, you will be happier with a solution like ant, cmake, gradle, ivy, make, sbt or the like, which provides more freedom by leaving more decisions up to you.

We are using different build and project description tools depending on the environment, involved technologies and project size and scope. Often this decision will not or cannot be changed later, so try to make a sensible choice considering all information at hand.

Upgrading your app to Grails 2.0.0? Better wait for 2.0.1

Grails 2.0.0 is a major step forward for this popular and productive, JVM-based web framework. It has many great new features that make you want to migrate existing projects to this new version.

So I branched our project and started the migration process. Everything went smoothly and I only had to fix some minor compilation problems to get our application running again. Soon the first runtime errors occurred and approximately 30 out of over 70 acceptance tests failed. Some analysis showed three major issue categories causing the failures:

  1. Saving domain objects with belongsTo associations may fail with the message NULL not allowed for column "AUTHOR_ID"; SQL statement: insert into book (id, version, author_id, name) values (null, ?, ?, ?) [90006-147] due to Grails issue GRAILS-8337. Setting the other direction of the association manually can act as a workaround:
    book.author.book = book
  2. When using the MarkupBuilder with the img tag in your TagLibs, your images may disappear. This is due to a new img closure defined in ApplicationTagLib. The correct fix is using
    delegate.img

    in your MarkupBuilder closures. See GRAILS-8660 for more information.

  3. Handling of null and the Groovy NullObject seems to be broken in some places. We got org.codehaus.groovy.runtime.typehandling.GroovyCastException: Cannot cast object 'null' with class 'org.codehaus.groovy.runtime.NullObject' to class 'Note' when using Groovy collections’ find() and casting the result with as:
     Note myNote = notes?.find {it.title == aTitle} as Note

    Removing the type information and the cast may act as a workaround. Unfortunately, we were not able to reproduce this issue in plain Groovy and did not have time to extract a small Grails example exhibiting the problem.

These bugs and some other changes may make you reconsider the migration of a bigger project at this point in time. Some of them are resolved already, so 2.0.1 may be the release to wait for if you are planning a migration. We will keep an eye on the next releases and try to switch to 2.0.x when our biggest show stoppers are resolved.

Even though I would advise against migrating bigger existing applications to Grails 2.0.0 I would start new projects on this – otherwise great – new platform release.

Deployment with the Play! framework

Play! is a great framework for Java-based development of modern web applications. Unfortunately, the documentation on deployment options lacks some details. I want to describe a way to automatically build a self-contained zip archive without the source code. The documentation does state that using the standalone web server is preferred, so we will use that option.

Our goal is:

  • an artifact with the executable application
  • no sources in the artifact
  • startup scripts for different platforms and environments
  • CI integration with execution of the tests

Fortunately, the play framework makes most of this quite easy if you know some small tricks.

The first very important step towards our goal is embedding the whole Play! framework somewhere in your project directory. I like to put it into lib/play-x.y.z (x.y.z being the framework version). That way you can perform all necessary calls to play scripts using relative paths and provide a self-contained artifact which developers or clients may download and execute on their machine. You can also be sure everyone is using the correct (read “same”) framework version.

The next important thing is to write some small start scripts so you can demo the software easily on any machine with Java installed. Your clients may try it out themselves if the project policy is open enough. Here are small examples for Linux

#!/bin/sh
python lib/play-1.2.3/play run --%demo -Dprecompiled=true

and Windows

REM start our app in the "demo" environment
lib\play-1.2.3\play run --%%demo -Dprecompiled=true

The last ingredient for a great deployment and demoing experience is the build script which builds, tests and packages the software. We do not want to include the sources in the artifact, so there is a bit of work to do. We perform the following steps in the script:

  1. delete old artifacts to ensure a clean build
  2. call play to precompile our application
  3. call play to execute all our automatic tests
  4. copy all needed files into our distribution directory ready to be packed together
  5. pack the artifacts into a zip archive

Our sample build script is for the Linux shell but you can easily translate it to the scripting environment of your choice, be it Apache Ant, Gradle or Windows batch, depending on your needs and preference:

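#!/bin/sh
# abort on the first command that fails
set -e

# 1. delete old artifacts to ensure a clean build (-f: no error if they are missing)
rm -rf dist test-result precompiled
# 2. precompile the application
python lib/play-1.2.3/play precompile
# 3. execute all automatic tests
python lib/play-1.2.3/play auto-test
# 4. copy all needed files into the distribution directory
TARGET=dist/my_project
mkdir -p "$TARGET/app"
cp -r app/views "$TARGET/app"
cp -r conf lib modules precompiled public "$TARGET"
cp programs/my_project* "$TARGET"
# 5. pack the artifacts into a zip archive
cd dist && zip -r my_project.zip my_project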

Now we can hook the project into a continuous integration server like Jenkins and let it archive the build artifact containing an executable installation of our web application. You could grant your client direct access to the artifact, use it for demos and further deployment steps like triggered upload to a staging server or the like.

Inconsistent usage of type definitions kills portability in C/C++

C/C++ are nice low-level, high-performance languages if you need to be “close to the metal” due to performance or memory constraints. Nevertheless, C/C++ are portable languages because they provide datatypes that abstract from the underlying hardware, and compilers exist for virtually every hardware platform. It is very easy to kill portability, though, because the specification allows platform-dependent sizes for the built-in datatypes. An int can be 16 bits or 32 bits, a long 32 bits or 64 bits and so on. Many programmers mitigate the issue by defining their own datatypes like WORD or uint32. If you are using libraries, it is very likely that several such type definitions are available and often interchangeable.

It is absolutely crucial to be consistent when using the type definitions, and it is usually good advice not to use the built-in types like int or long because their size can differ between platforms.

In one of our projects we are working on Tango device servers written in C++ and use the YAT (Yet Another Toolbox) library. There are at least 3 possible ways in this small project to define a (most of the time) unsigned 32 bit word:

  • yat::uint32
  • unsigned long
  • tango::DevUlong

To make matters a bit more interesting the tango::DevUlong is defined by the CORBA C++ mapping. Code using all of these definitions may work on a certain platform so you will not notice the problem right away, but we had several compilation problems and even program crashes when compiling or running on Linux/x86, Linux/AMD64, Windows 7 32bit and Windows 7 64bit. Using types that guarantee their size on all platforms and consistent usage of them will make your code compile and run on many platforms flawlessly.

Tests may remember the spec better than the customer or yourself

We have an application in maintenance mode for some years now. One part of the app displays messages in a certain format. They contained %-characters which have a special meaning. Both we and our customer thought they were about encoding line endings or some such. One day our customer reported missing parts within these messages. We dove down into the issue, analysed the raw messages containing a few %-signs and noticed some weird looking code:

public String parse(String message) {
    StringChunker tok = new StringChunker(message, Text.PERCENT);
    DirectChunkBuffer result = new DirectChunkBuffer(Text.NEWLINE);
    if (tok.hasMoreChunks()) {
        result.add(tok.getNextChunk());
    }
    return result.toString();
}

The if-statement feels unusual here as most would expect a while loop essentially splitting the original message by % and putting it together again with newlines in between. Almost immediately we suspected a bug triggered by malformed raw messages that had never occurred in production until now.

But our unit tests clearly documented the current behaviour as correct. So we decided to talk to our customer again. He then asked his experts and they confirmed the behaviour and explained their workflow. The %-characters were used as comment characters to hide text blocks the expert workers used as templates. Nothing after the first %-character should be displayed. They also confirmed that the displayed message was correct and that the whole error report was indeed some kind of communication problem somewhere in the organisation.

The tests saved us from breaking specified and correctly working behaviour.

After the clarification by the experts, we improved the situation by refactoring the code to communicate its intent more clearly. We also documented the message format in the javadocs and on a wiki page in addition to the tests.
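A refactoring in this spirit could look like the following minimal sketch. It is not our actual code: it drops the StringChunker and DirectChunkBuffer helpers shown above, and the names TemplateMessageFormat and visiblePart are made up for illustration. It only implements the rule the experts confirmed, namely that nothing after the first %-character is displayed.

```java
/** Hypothetical helper; the names are illustrative, not the original code. */
final class TemplateMessageFormat {

    /** Everything from the first '%' onwards is a hidden template comment. */
    private static final char COMMENT_MARKER = '%';

    private TemplateMessageFormat() {
    }

    /**
     * Returns only the part of the raw message that should be displayed,
     * i.e. the text before the first comment marker.
     */
    static String visiblePart(String rawMessage) {
        int commentStart = rawMessage.indexOf(COMMENT_MARKER);
        if (commentStart < 0) {
            return rawMessage; // no comment marker, show the whole message
        }
        return rawMessage.substring(0, commentStart);
    }
}
```

With a name like visiblePart and an explicit comment marker constant, a reader no longer expects a loop over all chunks, which was the source of our confusion.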

Prepare for the unexpected

In most larger projects there are many details which cannot be foreseen by the development team. Specifications turn out to be wrong, incomplete or not precise enough for your implementation to work without further adjustments. New features have to work with production data that may not be available in your development or testing environment.

The result I often observed is that everything works fine in your environment, including great automated tests, but nevertheless fails when deployed to production systems. Sometimes minor differences in the operating system version or configuration, the locale for example, cause your software to fail. Another common problem is real production data containing unexpected characters, inconsistencies (sometimes due to bugs) or its sheer size.

What can we do to better prepare for unexpected issues after deployment?

The key is to expect such issues and to implement certain countermeasures to better cope with them. This may conflict with the KISS principle but is usually worth a bit of added complexity. I want to provide some advice which proved useful for us in the past and may help you in the future too:

  1. Provide good, detailed and persistent debug output for certain features: Once we added a complex rule system which operated on existing domain objects. To check every possible combination of domain object states would have been a ton of work, so we wrote tests for the common cases and difficult cases we could think of. Since the correctness of the functionality was not critical we decided to rather display slightly incorrect information instead of failing and thus breaking the feature for the user. We did however provide extensive and detailed logs whenever our rule system detected a problem.
  2. Make certain parts of your communication interface to third party systems configurable: Often your system communicates with different kinds of users and other systems. Common examples are import/export functionality, web service APIs or text protocols. Even if details like date and number formats, data separators, line endings, character encoding and so forth are specified most of the time, it often proves valuable to make them configurable. Many times the specification changes or is incorrect, some communication partner implements the protocol slightly differently or a format deviates from your assumption, breaking your application. It is great if you can change that with a smile in front of your client and make the whole thing work in minutes instead of walking home frustrated to fix the issues.
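To illustrate the second point, here is a minimal sketch of an import format whose details can be overridden from a properties file. Everything in it is an assumption made for illustration: the class name ImportFormat, the property keys (import.dateFormat, import.separator, import.charset) and the defaults, which stand in for whatever your specification says.

```java
import java.nio.charset.Charset;
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Properties;
import java.util.regex.Pattern;

/** Hypothetical import settings; keys and defaults are assumptions. */
final class ImportFormat {
    final DateTimeFormatter dateFormat;
    final String separator;
    final Charset charset;

    ImportFormat(Properties config) {
        // Defaults follow the (assumed) specification, but every detail can be overridden.
        this.dateFormat = DateTimeFormatter.ofPattern(
                config.getProperty("import.dateFormat", "yyyy-MM-dd"));
        this.separator = config.getProperty("import.separator", ";");
        this.charset = Charset.forName(config.getProperty("import.charset", "UTF-8"));
    }

    /** Parses the date found in the first column of a data line. */
    LocalDate parseDate(String line) {
        // Quote the separator so characters like '|' are not treated as regex syntax.
        String firstColumn = line.split(Pattern.quote(separator), -1)[0];
        return LocalDate.parse(firstColumn.trim(), dateFormat);
    }
}
```

If a communication partner turns out to send 31.01.2012|42 instead of 2012-01-31;42, changing two configuration entries is enough; no new build or deployment is needed.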

The above does not mean building applications with ultimate flexibility and configurability while ignoring automated tests or realistic test environments. It just means that there are typical aspects of an application where you can prepare for otherwise unexpected deviations between theory and practice.

Using Groovy? Prepare for the unexpected!

Disclaimer

This post is not intended as some kind of Groovy-bashing. It rather points out some of the common problems and maybe differences in the mindset between Java and Groovy developers.

Groovy powers Grails and Grails empowers us to build modern web applications conveniently and fast. We have come to love many of the features of Groovy and Grails. Sadly, every once in a while you stand puzzled before some bizarre error message. Often it is our own fault, but sometimes it is a real problem in the software stack we are using. Grails was quite buggy some time ago (pre 1.2) but we are more and more happy with its development. Groovy itself has been quite solid for a long time but things like dynamic dispatch with null values, boolean truth and its NullObjects (e.g. null + null == "nullnull"!) may yield some surprises.

A few days ago we received a failure report from a client and found the corresponding exception in our logs:

java.lang.UnsupportedOperationException
at $Proxy12.toString(Unknown Source)
at java.lang.Throwable.getLocalizedMessage(Throwable.java:267)
at java.lang.Throwable.toString(Throwable.java:343)
at java.lang.String.valueOf(String.java:2827)
at org.apache.commons.logging.impl.SLF4JLocationAwareLog.error(SLF4JLocationAwareLog.java:211)
at org.apache.commons.logging.Log$error$0.call(Unknown Source)
at UserGroupController.callSendMail(UserGroupController.groovy:117)

You may be able to see that the UnsupportedOperationException is thrown in an attempt to log a Throwable. Yeah right, toString() of some generated proxy object throws an exception. This is not exactly what you would expect from a call to the toString()-method and violates the principle of least astonishment. Equally bad in this case is the fact that it shadows the real cause of the problem. Some research revealed that this is a bug in Groovy itself.
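One way to keep such a failure from hiding the real problem is to guard the string conversion before handing the Throwable to the logger. The following is only a sketch under assumptions: SafeThrowableText and BrokenException are hypothetical names, and BrokenException merely simulates the misbehaving object from the stack trace; it is not how the Groovy proxy is actually implemented.

```java
/** Hypothetical helper for logging throwables whose toString() may itself fail. */
final class SafeThrowableText {

    private SafeThrowableText() {
    }

    /** Returns a loggable description and does not throw, even if toString() does. */
    static String describe(Throwable problem) {
        try {
            return String.valueOf(problem);
        } catch (RuntimeException toStringFailure) {
            // Fall back to information that does not depend on the broken method.
            return problem.getClass().getName()
                    + " (toString() failed with " + toStringFailure.getClass().getName() + ")";
        }
    }

    /** Simulates an exception whose message cannot be retrieved. */
    static final class BrokenException extends RuntimeException {
        @Override
        public String getMessage() {
            throw new UnsupportedOperationException();
        }
    }
}
```

Calling log.error(SafeThrowableText.describe(e)) instead of logging e directly would at least have left the class of the original exception in our logs.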

This leads me to my highly speculative and opinionated “mindset” hypothesis. In the Java world people try to avoid the unexpected at the cost of clunkiness and verbosity. Examples of this mindset are static typing and checked exceptions. Sources of uncertainty, like reflection and downcasts, are minimized and confined to defined areas of code, often documented and annotated. Everyone tries to strictly define everything and relies on these definitions. All this sums up to quite a burden and may feel like a cage.

Enter Groovy, which removes a lot of the burden by its more liberal syntax and other language features. Suddenly, you are free and do not have to define everything and life feels a lot easier at first. You do not need well designed class hierarchies, just use duck typing. Everything is dynamic and you can cope with things at runtime. Not only the above bug (it is still a bug, not intended behaviour!) but also the handling of boolean values or null may make the seasoned Java developer shout “WTF!” from time to time.

So what is the bottom line now? Both worlds have their pros and cons. In Groovy fundamental things can surprise you at runtime so there is an even greater need for good test coverage and a thorough knowledge of the language. Groovy (and Grails for that matter) is not as easy as it seems in smaller projects when going big. The Java camp has a solid base but should strive to make life more convenient for the developer. It is mostly the verbosity and some missing features like type inference or closures that are driving people away from Java. Imho both languages and the Java platform with its great community and ecosystem have a bright future ahead, but there are some bumps on the way.

Story about bogus error messages

Most computer users know the situation where some system or service that worked for months without problems suddenly stops working. I want to tell you a small war story that happened to us some weeks ago.

We maintain our own mail server with IMAP access for our employees. That allows for relatively easy server-side spam protection, mailing list management, archiving and so on. We use the trusted combination of postfix, cyrus and mailman for the task and everything works very reliably. Then suddenly we got the error message ssl_error_rx_record_too_long in our e-mail clients. Nothing had changed software-wise. Googling brought up all kinds of obscure reasons for this error but no explanation why something like that would happen out of thin air.

Fortunately, looking in cyrus’ log files quickly showed the reason: the hard drive was full! Two days before the mail system failed there had been a larger upload to the server for sharing stuff with a colleague. This upload fitted almost exactly into the free disk space and some mails later the disk was full. It was really a Murphy’s law situation because a few kilobytes less free space would have made the file sharing fail with a sensible error message. But it worked and made the mail system fail suddenly, without any obvious connection to changes on the server.

There are some lessons to be learned here:

  • Aside from file managers most applications assume memory and disk space are unlimited. If they do hit such a limit they usually fail miserably with completely bogus errors.
  • Monitor critical resources on important systems to receive warnings ahead of time, before important services fail. Tools like Nagios can help here.
  • Try to be aware of side effects of your actions. Separating services to different machines may help to reduce unexpected side effects on seemingly unrelated stuff. We used the server to run many different unrelated services.
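To make the monitoring lesson concrete, here is a minimal sketch of a free-space check based on java.io.File. It is not a replacement for a monitoring system like Nagios; the class name DiskSpaceCheck, the 10% threshold and the default path are assumptions chosen for illustration.

```java
import java.io.File;

/** Hypothetical low-disk-space check; threshold and path are assumptions. */
final class DiskSpaceCheck {

    private DiskSpaceCheck() {
    }

    /** True if less than minFreeRatio (e.g. 0.10 for 10%) of the total space is usable. */
    static boolean isLow(long usableBytes, long totalBytes, double minFreeRatio) {
        if (totalBytes <= 0) {
            return true; // unknown or unreadable partition: better to warn
        }
        return (double) usableBytes / totalBytes < minFreeRatio;
    }

    /** Checks the partition that contains the given path. */
    static boolean isLow(File path, double minFreeRatio) {
        return isLow(path.getUsableSpace(), path.getTotalSpace(), minFreeRatio);
    }

    public static void main(String[] args) {
        // Path to check; defaults to the root partition if no argument is given.
        File partition = new File(args.length > 0 ? args[0] : "/");
        if (isLow(partition, 0.10)) {
            System.err.println("WARNING: less than 10% free space on " + partition);
        }
    }
}
```

Run regularly, for example from cron with the mail spool directory as argument, a check like this would have warned us before the mail system started to fail with misleading SSL errors.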

GORM-Performance with Collections

The other day I was looking to improve the performance of specific parts of our Grails application. I quickly found the typical bottleneck in database-centric Grails apps: Too many queries were executed because GORM hides away database queries behind its built-in persistence methods for domain objects and the extremely nice dynamic finders. In search of improvements and places to use GORM/Hibernate caching I stumbled upon a very good and helpful presentation on GORM-performance in general and especially collection usage. Burt Beckwith presents some common problems and good patterns to overcome them in his SpringOne 2GX talk. I highly recommend having a thorough look at his presentation.

Nevertheless, I want to summarize his bottom line here: GORM does provide a nice abstraction from relational databases but this abstraction is leaky at times. So you have to know exactly how the stuff in your domain classes is mapped. Be especially careful if collections tend to become “large” because performance will suffer severely. We already observed a significant performance degradation for some dozen elements; your mileage may vary. For many simple modifications on a collection all its elements have to be loaded from the database!
Instead of using hasMany/belongsTo just add a back reference to the domain object your object belongs to. By dropping the collection you lose cascading delete and some GORM functionality, but you can still use dynamic finders and put the functionality to manage associations yourself into respective classes. This may be a large gain in specific cases!