Upgrading your app to Grails 2.0.0? Better wait for 2.0.1

Grails 2.0.0 is a major step forward for this popular and productive, JVM-based web framework. It has many great new features that make you want to migrate existing projects to this new version.

So I branched our project and started the migration process. Everything went smoothly, and I only had to fix some minor compilation problems to get our application running again. But soon the first runtime errors occurred, and approximately 30 of our over 70 acceptance tests failed. Some analysis showed three major issue categories causing the failures:

  1. Saving domain objects with belongsTo associations may fail with a NULL not allowed for column "AUTHOR_ID"; SQL statement: insert into book (id, version, author_id, name) values (null, ?, ?, ?) [90006-147] message due to Grails issue GRAILS-8337. Setting the other direction of the association manually can act as a workaround:
    book.author.book = book
  2. When using the MarkupBuilder with the img tag in your TagLibs, your images may disappear. This is due to a new img closure defined in ApplicationTagLib. The correct fix is using
    delegate.img

    in your MarkupBuilder closures. See GRAILS-8660 for more information.

  3. Handling of null and the Groovy NullObject seems to be broken in some places. We got an org.codehaus.groovy.runtime.typehandling.GroovyCastException: Cannot cast object 'null' with class 'org.codehaus.groovy.runtime.NullObject' to class 'Note' when using Groovy's collection find() and casting the result with as:
     Note myNote = notes?.find {it.title == aTitle} as Note

    Removing the type information and the cast may act as a workaround. Unfortunately, we were not able to reproduce this issue in plain Groovy and did not have time to extract a small Grails example exhibiting the problem.

These bugs and some other changes may make you reconsider migrating a bigger project at this point in time. Some of them are already resolved, so 2.0.1 may be the release to wait for if you are planning a migration. We will keep an eye on the next releases and try to switch to 2.0.x once our biggest showstoppers are resolved.

Even though I would advise against migrating bigger existing applications to Grails 2.0.0, I would start new projects on this otherwise great new platform release.

Deployment with the Play! framework

Play! is a great framework for Java-based development of modern web applications. Unfortunately, the documentation on deployment options is not very detailed in places. I want to describe a way to automatically build a self-contained zip archive without the source code. The documentation states that using the standalone web server is the preferred option, so we will use that.

Our goals are:

  • an artifact with the executable application
  • no sources in the artifact
  • startup scripts for different platforms and environments
  • CI integration with execution of the tests

Fortunately, the Play! framework makes most of this quite easy if you know a few small tricks.

The first very important step towards our goal is embedding the whole Play! framework somewhere in your project directory. I like to put it into lib/play-x.y.z (x.y.z being the framework version). That way you can perform all necessary calls to the play scripts using relative paths and provide a self-contained artifact which developers or clients may download and execute on their machines. You can also be sure everyone is using the correct (read "same") framework version.

The next important thing is to write some small start scripts so you can easily demo the software on any machine with Java installed. Your clients may even try it out themselves if the project policy is open enough. Here are small examples for Linux

#!/bin/sh
# start our app with the "demo" framework id, serving the precompiled classes
python lib/play-1.2.3/play run --%demo -Dprecompiled=true

and Windows

REM start our app in the "demo" environment
lib\play-1.2.3\play run --%%demo -Dprecompiled=true

The last ingredient for a great deployment and demoing experience is the build script which builds, tests and packages the software. We do not want to include the sources in the artifact, so there is a bit of work to do. We perform the following steps in the script:

  1. delete old artifacts to ensure a clean build
  2. call play to precompile our application
  3. call play to execute all our automatic tests
  4. copy all needed files into our distribution directory ready to be packed together
  5. pack the artifacts into a zip archive

Our sample build script is written for the Linux shell, but you can easily translate it to the scripting environment of your choice, be it Apache Ant, Gradle or Windows batch, depending on your needs and preferences:

#!/bin/sh

# 1. delete old artifacts to ensure a clean build
rm -rf dist test-result precompiled
# 2. precompile our application
python lib/play-1.2.3/play precompile
# 3. execute all our automatic tests
python lib/play-1.2.3/play auto-test
# 4. copy all needed files into the distribution directory
TARGET=dist/my_project
mkdir -p $TARGET/app
cp -r app/views $TARGET/app
cp -r conf lib modules precompiled public $TARGET
cp programs/my_project* $TARGET
# 5. pack the artifacts into a zip archive
cd dist && zip -r my_project.zip my_project

Now we can hook the project into a continuous integration server like Jenkins and let it archive the build artifact containing an executable installation of our web application. You could grant your client direct access to the artifact, or use it for demos and further deployment steps like a triggered upload to a staging server.

Inconsistent usage of type definitions kills portability in C/C++

C/C++ are nice low-level, high-performance languages if you need to be “close to the metal” due to performance or memory constraints. At the same time, C/C++ are portable languages, because they provide datatypes that abstract from the underlying hardware, and compilers exist for virtually every hardware platform. Still, it is very easy to kill portability, because the specification allows platform-dependent sizes for the built-in datatypes. An int can be 16 or 32 bits, a long 32 or 64 bits, and so on. Many programmers mitigate the issue by defining their own datatypes like WORD or uint32. If you are using some libraries, it is very likely that several such type definitions are available and often interchangeable.

It is absolutely crucial to be consistent when using these type definitions, and it is usually good advice not to use built-in types like int or long directly, because their size will change between platforms.

In one of our projects we are working on Tango device servers written in C++ which use the YAT (Yet Another Toolbox) library. In this small project alone there are at least three possible ways to define a (most of the time) unsigned 32-bit word:

  • yat::uint32
  • unsigned long
  • tango::DevUlong

To make matters a bit more interesting, tango::DevUlong is defined by the CORBA C++ mapping. Code using all of these definitions may work on a certain platform, so you will not notice the problem right away, but we had several compilation problems and even program crashes when compiling or running on Linux/x86, Linux/AMD64, Windows 7 32-bit and Windows 7 64-bit. Using types that guarantee their size on all platforms, and using them consistently, will make your code compile and run flawlessly on many platforms.

Tests may remember the spec better than the customer or yourself

We have had an application in maintenance mode for some years now. One part of the app displays messages in a certain format. They contained %-characters, which have a special meaning. Both we and our customer thought they had something to do with encoding line endings or some such. One day our customer reported missing parts within these messages. We dove into the issue, analysed the raw messages containing a few %-signs and noticed some weird-looking code:

public String parse(String message) {
    StringChunker tok = new StringChunker(message, Text.PERCENT);
    DirectChunkBuffer result = new DirectChunkBuffer(Text.NEWLINE);
    if (tok.hasMoreChunks()) {
        result.add(tok.getNextChunk());
    }
    return result.toString();
}

The if-statement feels unusual here, as most would expect a while loop that essentially splits the original message by % and puts it together again with newlines in between. Almost immediately we suspected a bug, never triggered in production until now, caused by malformed raw messages.

But our unit tests clearly documented the current behaviour as correct. So we decided to talk to our customer again. He asked his experts, who confirmed the behaviour and explained their workflow: the %-characters were used as comment characters to hide text blocks the expert workers used as templates. Nothing after the first %-character should be displayed. They also confirmed that the displayed message was correct and that the whole error report was in fact some kind of communication problem somewhere in the organisation.
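
A sketch of what such a test could look like (the class name MessageParser is an assumption for this example; the snippet above does not show the enclosing class):

import static org.junit.Assert.assertEquals;

import org.junit.Test;

public class MessageParserTest {

    @Test
    public void discardsEverythingAfterTheFirstPercentSign() {
        // the first % starts a comment block that must not be displayed
        assertEquals("visible text",
                new MessageParser().parse("visible text%hidden template%more hidden text"));
    }

    @Test
    public void displaysMessagesWithoutPercentSignsCompletely() {
        assertEquals("a plain message",
                new MessageParser().parse("a plain message"));
    }
}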

The tests saved us from breaking specified and correctly working behaviour.

After the clarification by the experts we improved the situation by refactoring the code to communicate its intent more clearly. We also documented the message format in the Javadocs and on a wiki page in addition to the tests.

Prepare for the unexpected

In most larger projects there are many details which cannot be foreseen by the development team. Specifications turn out to be wrong, incomplete or not precise enough for your implementation to work without further adjustments. New features have to work with production data that may not be available in your development or testing environment.

The result I have often observed is that everything works fine in your environment, including great automated tests, but nevertheless fails when deployed to production systems. Sometimes minor differences in the operating system version or configuration, the locale for example, cause your software to fail. Another common problem is real production data containing unexpected characters, inconsistencies (sometimes due to bugs) or sheer size.

What can we do to better prepare for unexpected issues after deployment?

The key is to expect such issues and to implement countermeasures to cope with them better. This may conflict with the KISS principle, but it is usually worth a bit of added complexity. I want to share some advice which proved useful for us in the past and may help you in the future too:

  1. Provide good, detailed and persistent debug output for certain features: Once we added a complex rule system which operated on existing domain objects. Checking every possible combination of domain object states would have been a ton of work, so we wrote tests for the common cases and the difficult cases we could think of. Since the correctness of the functionality was not critical, we decided to display slightly incorrect information rather than fail and thus break the feature for the user. We did, however, provide extensive and detailed logs whenever our rule system detected a problem.
  2. Make certain parts of your communication interface to third-party systems configurable: Often your system communicates with different kinds of users and other systems. Common examples are import/export functionality, web service APIs or text protocols. Even if details like date and number formats, data separators, line endings and character encoding are specified most of the time, it often proves valuable to make them configurable, as sketched after this list. Many times the specification changes or is incorrect, some communication partner implements the protocol slightly differently, or a format deviates from your assumptions, breaking your application. It is great if you can change that with a smile in front of your client and make the whole thing work in minutes instead of walking home frustrated to fix the issue.
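
To illustrate the second point, here is a minimal sketch of such configurability for a hypothetical CSV export; the property names and defaults are made up for this example:

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.text.SimpleDateFormat;
import java.util.Properties;

public class ExportConfiguration {

    private final Properties config = new Properties();

    public ExportConfiguration(String configFile) throws Exception {
        // format details live in a properties file instead of the code
        try (InputStream in = Files.newInputStream(Paths.get(configFile))) {
            config.load(in);
        }
    }

    public SimpleDateFormat dateFormat() {
        return new SimpleDateFormat(config.getProperty("export.dateFormat", "yyyy-MM-dd"));
    }

    public String fieldSeparator() {
        return config.getProperty("export.separator", ";");
    }

    public String lineEnding() {
        return config.getProperty("export.lineEnding", "\r\n");
    }
}

When a communication partner suddenly sends dates in another format, the fix is a one-line change in a configuration file instead of a new release.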

The above does not mean building applications with ultimate flexibility and configurability while ignoring automated tests or realistic test environments. It just means that there are typical aspects of an application where you can prepare for otherwise unexpected deviations of practice from theory.

Using Groovy? Prepare for the unexpected!

Disclaimer

This post is not intended as some kind of Groovy-bashing. Rather, it points out some common problems and possible differences in mindset between Java and Groovy developers.

Groovy powers Grails, and Grails empowers us to build modern web applications conveniently and fast. We have come to love many of the features of Groovy and Grails. Sadly, every once in a while you stand puzzled before some bizarre error message. Often it is our own fault, but sometimes it is a real problem in the software stack we are using. Grails was quite buggy some time ago (pre 1.2) but we are more and more happy with its development. Groovy itself has been quite solid for a long time, but things like dynamic dispatch with null values, boolean truth and its NullObject (e.g. null + null == "nullnull"!) may yield some surprises.

A few days ago we received a failure report from a client and found the corresponding exception in our logs:

java.lang.UnsupportedOperationException
at $Proxy12.toString(Unknown Source)
at java.lang.Throwable.getLocalizedMessage(Throwable.java:267)
at java.lang.Throwable.toString(Throwable.java:343)
at java.lang.String.valueOf(String.java:2827)
at org.apache.commons.logging.impl.SLF4JLocationAwareLog.error(SLF4JLocationAwareLog.java:211)
at org.apache.commons.logging.Log$error$0.call(Unknown Source)
at UserGroupController.callSendMail(UserGroupController.groovy:117)

You can see that the UnsupportedOperationException is thrown in an attempt to log a Throwable. Yeah right, toString() of some generated proxy object throws an exception. This is not exactly what you would expect from a call to the toString() method and violates the principle of least astonishment. Equally bad in this case is the fact that it shadows the real cause of the problem. Some research revealed that this is a bug in Groovy itself.
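
This is not the actual code inside Groovy, but a minimal Java reconstruction of how such a stack trace can come about: java.lang.reflect.Proxy routes even the Object methods like toString() through its InvocationHandler, so a handler that rejects every call makes any attempt to log or print the proxy blow up.

import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;

public class ThrowingProxyDemo {

    public static void main(String[] args) {
        // a handler that rejects every call, including toString()
        InvocationHandler handler = (proxy, method, arguments) -> {
            throw new UnsupportedOperationException();
        };
        Runnable proxy = (Runnable) Proxy.newProxyInstance(
                Runnable.class.getClassLoader(),
                new Class<?>[] { Runnable.class },
                handler);
        System.out.println(proxy); // throws UnsupportedOperationException
    }
}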

This leads me to my highly speculative and opinionated “mindset” hypothesis. In the Java world people try to avoid the unexpected at the cost of clunkiness and verbosity. Examples of this mindset are static typing and checked exceptions. Sources of uncertainty, like reflection and downcasts, are minimized and put into defined areas of code, often documented and annotated. Everyone tries to strictly define everything and relies on these definitions. All this sums up to quite a burden and may feel like a cage.

Enter Groovy, which removes a lot of that burden with its more liberal syntax and other language features. Suddenly you are free and do not have to define everything, and life feels a lot easier at first. You do not need well-designed class hierarchies, just use duck typing. Everything is dynamic and you can cope with things at runtime. Not only the above bug (it is still a bug, not intended behaviour!) but also the handling of boolean values or null may make the seasoned Java developer shout “WTF!” from time to time.

So what is the bottom line? Both worlds have their pros and cons. In Groovy fundamental things can surprise you at runtime, so there is an even greater need for good test coverage and a thorough knowledge of the language. Groovy (and Grails, for that matter) are not as easy when going big as they seem in smaller projects. The Java camp has a solid base but should strive to make life more convenient for the developer. It is mostly the verbosity and some missing features like type inference or closures that are driving people away from Java. IMHO both languages and the Java platform with its great community and ecosystem have a bright future, but there are some bumps in the road ahead.

Story about bogus error messages

Most computer users know the situation where some system or service that worked for months without problems suddenly stops working. I want to tell you a small war story which happened to us some weeks ago.

We are maintaining our own mail server with IMAP access for our employees. That allows for relatively easy server-side spam protection, mailing list management, archiving and so on. We use the trusted combination of postfix, cyrus and mailman for the task and everything works very reliably. Then suddenly we got the error message ssl_error_rx_record_too_long in our e-mail clients. Nothing had changed software-wise. Googling brought up all kinds of obscure reasons for this error but no explanation why something like that would happen out of thin air.

Fortunately, looking into the cyrus log files quickly revealed the reason: the hard drive was full! Two days before the mail system failed, there had been a larger upload to the server for sharing stuff with a colleague. This upload fitted almost exactly into the free disk space, and a few mails later the disk was full. It was a true Murphy's law situation, because a few kilobytes less free space would have made the file sharing fail with a sensible error message. But it worked, and so the mail system failed suddenly and without any obvious connection to changes on the server.

There are some lessons to be learned here:

  • Aside from file managers, most applications assume memory and disk space are unlimited. If they do hit such a limit, they usually fail miserably with completely bogus error messages.
  • Monitor critical resources on important systems to receive warnings before important services fail. Tools like Nagios can help here.
  • Try to be aware of the side effects of your actions. Separating services onto different machines may help to reduce unexpected side effects on seemingly unrelated parts of the system. We had used this one server to run many different unrelated services.

GORM Performance with Collections

The other day I was looking to improve the performance of specific parts of our Grails application. I quickly found the typical bottleneck of database-centric Grails apps: too many queries were executed, because GORM hides the database queries behind its built-in persistence methods for domain objects and the extremely nice dynamic finders. In search of improvements and places to use GORM/Hibernate caching, I stumbled upon a very good and helpful presentation on GORM performance in general and collection usage in particular. Burt Beckwith presents some common problems and good patterns to overcome them in his SpringOne 2GX talk. I highly recommend having a thorough look at his presentation.

Nevertheless, I want to summarize his bottom line here: GORM provides a nice abstraction from relational databases, but this abstraction is leaky at times. So you have to know exactly how the contents of your domain classes are mapped. Be especially careful if collections tend to become “large”, because performance will suffer extremely. We already observed significant performance degradation with a few dozen elements; your mileage may vary. For many simple modifications of a collection, all its elements have to be loaded from the database!

Instead of using hasMany/belongsTo, just add a back reference to the domain object your object belongs to. Without the collection you lose cascading deletes and some GORM functionality, but you can still use dynamic finders and put the code to manage the association into the respective classes yourself. This may be a large gain in specific cases!

Simple code, subtle bugs: write unit tests!

We developed an applet some years ago that was supposed to run 24/7. Basically, it fetches some values periodically and plots them in a chart. It always shows the last 24 hours. Everything seemed to work fine but every few weeks it crashed. The problems were clouded by hardware instabilities and the use of some obscure JVM. After quite a bit of analysis it became clear that it was a bug in our applet: a memory leak. Here is the troublemaker code:

for (int i = 0; i < dataSeries.getItemCount(); i++) {
    XYDataItem dataItem = dataSeries.getDataItem(i);
    if (dataItem.getX().longValue() < startDate.getTime().getTime()) {
        dataSeries.remove(i);
    } else {
        break;
    }
}
dataSeries.add(newDataPoint);

This simple and innocent-looking piece of code essentially removes old items from a list by index. Many of you may already have spotted the main problem leading to the leak: removing an item by index means that the following items shift one place towards the head of the list. As the index is incremented anyway, one element is skipped in the next iteration. This added up over time and led to an OutOfMemoryError after some weeks.
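
A corrected sketch of the cleanup (assuming JFreeChart's XYSeries, which the XYDataItem type suggests, and a series sorted by ascending x-values, as the break in the original loop implies) always inspects the head of the series instead of advancing an index:

import org.jfree.data.xy.XYSeries;

public static void removeItemsOlderThan(XYSeries dataSeries, long cutoff) {
    // remove(0) shifts the remaining items towards the head of the list,
    // so we keep looking at index 0 instead of incrementing an index
    while (dataSeries.getItemCount() > 0
            && dataSeries.getDataItem(0).getX().longValue() < cutoff) {
        dataSeries.remove(0);
    }
}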

Now, even if the code is not great, it looks relatively straightforward at first sight. Yet it is not, and it contains errors. Making things worse, the code was buried somewhere in tangled logic. This leads me to the point of my post:

Write unit tests even for relatively simple code.

Most likely, writing a unit test for this cleanup work would have led to a class that manages the dataSeries and nothing more, quite easy to understand and test in isolation. The problem would never have slipped into production and caused months of investigating different stability problems.

The programmer may not have been aware of iterators but he should have made sure that this block works as intended. The best way to do this is automated unit tests. They make sure that your building blocks are as solid as you expect them to be. Use acceptance testing to make sure you have put your building blocks together the right way. Together unit and acceptance tests will save countless hours hunting regressions.

Reversing an array in Java

Reversing an array is a popular interview question, especially in languages like C. Some days ago I faced the problem in a legacy Java project I was maintaining. As a Java guy I did not want to fiddle with a for-loop and indexes, so I looked for another solution. Besides showing the one-liner I want to give some insights that not everyone might be aware of. Here is my solution:

import java.util.Arrays;
import java.util.Collections;

public static <T> T[] reverse(T[] array) {
    Collections.reverse(Arrays.asList(array));
    return array;
}
  1. This approach works because Arrays.asList() returns a specially tailored List implementation which writes all changes right through to the backing array. It is not a java.util.ArrayList!
  2. The above means that the array is changed in place and no array copying is involved. The approach thus has good performance, if that matters.
  3. My solution directly changes the given array. To make it free of side effects you could create a new array, copy the original one into it, and reverse the copy, as sketched below.
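
A sketch of that side-effect-free variant could look like the following; the method name reversedCopy is mine, not part of the original code:

import java.util.Arrays;
import java.util.Collections;

public static <T> T[] reversedCopy(T[] array) {
    // copy first, then reverse the copy through its list view
    T[] copy = Arrays.copyOf(array, array.length);
    Collections.reverse(Arrays.asList(copy));
    return copy;
}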

Side note regarding Arrays.asList()

Arrays.asList() is nice for bridging to old code and APIs. It has to be used with caution though: if you read the API documentation and keep 1.) in mind, you will notice that the returned list is fixed-size. Calls to add() and remove() will fail with an UnsupportedOperationException! Passing such lists around in your system using the java.util.List interface may lead to unexpected behaviour, since most people expect lists in Java to be of variable size. In our use case we do not return the resulting list to the caller but only use it temporarily and locally, so this is no problem here.
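
A small example shows both faces of such a list: element writes go through to the backing array, while structural changes fail:

import java.util.Arrays;
import java.util.List;

List<String> letters = Arrays.asList("a", "b", "c");
letters.set(0, "z"); // fine: writes through to the backing array
letters.add("d");    // throws UnsupportedOperationException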

Conclusion

Reversing an array can be a nice interview question for Java candidates too, as you can discuss in-place reversal, memory and time complexity and different approaches. It may show the candidate's knowledge of the collections API and the utility classes. If the above approach is discovered, there is also something to discuss, as shown in this article.