Deployment with the Play! framework

Play! is a great framework for java-base development of modern web applications. Unfortunately, the documentation about deployment options is not really that extensive in certain details. I want to describe a way to automatically build a self-contained zip archive without the source code. The documentation does state that using the standalone web server is preferred so we will use that option.

Our goal is:

  • an artifact with the executable application
  • no sources in the artifact
  • startup script for different platform and environments
  • CI integration with execution of the tests

Fortunately, the play framework makes most of this quite easy if you know some small tricks.

The first very important step towards our goal is embedding the whole Play! framework somewhere in your project directory. I like to put it into lib/play-x.y.z (x.y.z being the framework version). That way you can do perform all neccessary calls to play scripts using relative paths and provide a self-contained artifact which developers or clients may download and execute on their machine. You can also be sure everyone is using the correct (read “same”) framework version.

The next important thing is to write some small start-scripts so you can demo the software easily on any machine with Java installed. Your clients may try it out theirselves if the project policy is open enough. Here are small examples for linux

#!/bin/sh
python lib/play-1.2.3/play run --%demo -Dprecompiled=true

and windows

REM start our app in the "demo" environment
lib\play-1.2.3\play run --%%demo -Dprecompiled=true

The last ingredient to a great deployment and demoing experience is the build script which builds, tests and packages the software together. We do not want to include the sources in the artifact, so there is a bit of work to do. We perform following steps in the script:

  1. delete old artifacts to ensure a clean build
  2. call play to precompile our application
  3. call play to execute all our automatic tests
  4. copy all needed files into our distribution directory ready to be packed together
  5. pack the artifacts into a zip archive

Our sample build script is for the linux shell but you can easily translate it to the scripting environment of your choice, be it apache ant, gradle, windows batch depending on your needs and preference:

#!/bin/sh

rm -r dist
rm -r test-result
rm -r precompiled
python lib/play-1.2.3/play precompile
python lib/play-1.2.3/play auto-test
TARGET=dist/my_project
mkdir -p $TARGET/app
cp -r app/views $TARGET/app
cp -r conf lib modules precompiled public $TARGET
cp programs/my_project* $TARGET
cd dist && zip -r my_project.zip my_project

Now we can hook the project into a continuous integration server like Jenkins and let it archive the build artifact containing an executable installation of our web application. You could grant your client direct access to the artifact, use it for demos and further deployment steps like triggered upload to a staging server or the like.

Prepare for the unexpected

In most larger projects there are many details which cannot be foreseen by the development team. Specifications turn out to be wrong, incomplete or not precise enough for your implementation to work without further adjustments. New features have to work with production data that may not be available in your development or testing environment.

The result I often observed is that everything works fine in your environment including great automated tests but fails nevertheless when deployed to production systems. Sometimes it is minor differences in the operating system version or configuration, the locale for example, may cause your software to fail. Another common problem is  real production data containing unexpected characters, inconsistencies in the data (sometimes due to bugs) or its sheer size.

What can we do to better prepare for unexpected issues after deployment?

The thing is to expect such issues and to implement certain countermeasures to better cope with them. This may conflict with the KISS principle but usually is worth a bit of added complexity. I want to provide some advice which proved useful for us in the past and may help you in the future too:

  1. Provide good, detailed and persistent debug output for certain features: Once we added a complex rule system which operated on existing domain objects. To check every possible combination of domain object states would have been a ton of work, so we wrote tests for the common cases and difficult cases we could think of. Since the correctness of the functionality was not critical we decided to rather display slightly incorrect information instead of failing and thus breaking the feature for the user. We did however provide extensive and detailed logs whenever our rule system detected a problem.
  2. Make certain parts of your communication interface to third party systems configurable: Often your system communicates to different kinds of users and other systems. Common examples are import/export functionality, web service APIs or text protocols. Even if most of the time details like date and number formats, data separators, line endings, character encoding and so forth are specified it often proves valuable to make them configurable. Many times the specification changes or is incorrect, some communication partner implements the protocol slightly different or a format deviates from your assumption breaking your application. It is great if you can change that with a smile in front of your client and make the whole thing work in minutes instead of walking home frustrated to fix the issues.

The above does not mean building applications with ultimate flexibility and configurability and ignoring automated tests or realistic test environments. It just means that there are typical aspects of an application where you can prepare for otherwise unexpected deviations of theory and praxis.

An Oracle story: Null, empty or what?

One big argument for relational databases is SQL which as a standard minimizes the effort needed to switch your app between different DBMSes. This comes particularily handy when using in-memory databases (like HSQL or H2) for development and a “big” one (like PostgreSQL, MySQL, DB2, MS SqlServer or Oracle) in production. The pity is that there are subtle differences with regard to the interpretation of the SQL-standard when it comes to databases from different vendors.

Oracle is particularily picky and offers quite some interesting behaviours: Most databases (all that I know well) treat null and empty as different values when it comes to strings. So it is perfectly valid to store an empty string in a not-null column and retrieving the string from the column yields an empty string. Not so with Oracle 10g! Inserting null and retrieving the value yields unsurprisingly null, even using Oracle. Inserting an empty string and retrieving the value leaves you with null, too! Oracle does not differentiate between empty strings and null values like a Java developer would expect. In our environment this has led to surprised developers and locally unreproducible bug which clearly exist in production a couple of times.

[rant]Oracle has great features for big installations and enterprises that can afford the support, maintenance and hardware of a serious Oracle DBMS installation. But IMHO it is a shame that such a big player in the market does not really care about the shortcomings of their flagship product and standards in general (Oracle 10g only supports SQL92 entry level!). Oracle, please fix such issues and help us developers to get rid of special casing for your database product![/rant]

The lesson to be learnt here is that you need a clone of the production database for your integration tests or acceptance tests to be really effective. Quite some bugs have slipped into production because of subtle differences in behaviour of our inhouse databases and the ones in production at the customer site.

Lightweight dependency management

Managing project dependencies without maven or ant ivy, using a custom ant task to ensure classpath orthogonality.

Java’s classpath is a powerful concept – when used appropriate. As your project grows larger in terms of code and people, it gets harder to ensure that your classpath is correct. A great danger arises from JAR files containing different versions of the same resource. You might end up running different code than you think, leading to strange effects. If you build your classpath using wildcards, you can’t even control the order your JAR files are loaded.

Managing dependencies

To avoid the issues mentioned above, you need to manage your project dependencies. It’s a common practice to implement the build process of the project using maven or ant ivy. Both tools provide dependency mangement by declaration. But at a high cost. Especially maven has received some malice lately, criticizing its steep learning curve and complexity.

Scratching the biggest itch

We decided to try a different approach to dependency management, tackling only our biggest concern: The duplication of classpath resources. We take care of the scope of a third-party library, put required JARs in the repository (to us, third party binary artifacts are part of the project source) and update manually. The one thing we cannot assure manually is that every resource is unique. Sometimes, the same class is included in different JARs, as it seems to be common practice among java web frameworks.

Ant to the rescue

Thus, I wrote a custom ant task that, given the classpath, checks for duplicate entries. If it finds one, it lists the culprits and optionally aborts the build process. Included in our continuous integration system, it gets run every time somebody performs a change. You can’t forget to delete an old version of a library or check in the same library twice without breaking the build now.

Our ClasspathCollisionCheckTask

I provide this task here, without any warranty. The source code is included in the JAR alongside the classes, if you want to know what it does exactly.

Assuming you already know how to use custom tasks within an ant build script, here’s only a short usage description.

Import the custom task:

<taskdef
    name="check.collision"
    classname="com.schneide.internal.anttask.ClasspathCollisionCheckTask"
    classpath="${customtasks.library.directory}/schneidetasks.jar"
/>

Next, use it on your classpath:

<check.collision verbose="true" failOnCollision="true">
    <path>
        <fileset dir="${classpath.library.directory}">
            <include name="**/*.jar"/>
        </fileset>
        <fileset dir="${internal.library.directory}">
            <include name="**/*.jar"/>
        </fileset>
    </path>
</check.collision>

The task scans the whole path you give it and reports any collision it detects. You will see the warnings in your build log.

If the failOnCollision parameter is set to true (optional, defaults to false), the build will abort after a collision. If you want to have debug information, set the verbose parameter to true (optional, defaults to false).

Conclusion

If you manage your project dependencies manually, you might find our custom ant task useful. If you use maven or ant ivy, you already have this functionality in your build process.

Feedback

I’m very interested in hearing your opinion on the task or about your way of handling dependencies. Leave us a comment.

A DSL for deploying grails apps

Everytime I deploy my grails app I do the same steps over and over again:

  • get the latest build from our Hudson CI
  • extract the war file from the CI archive
  • scp the war to a gateway server
  • scp the war to the target server
  • run stop.sh to shutdown the jetty
  • run update.sh to update the web app in the jetty webapps dir
  • run start.sh to start the jetty

Reading the Productive Programmer I thought: “This should be automated”. Looking at the Rails world I found a tool named Capistrano which looked like a script library for deploying Rails apps. Using builders in groovy and JSch for SSH/scp I wrote a small script to do the tedious work using a self defined DSL for deploying grails apps:

Grapes grapes = new Grapes()
def script = grapes.script {
    set gateway: "gateway-server"
    set username: "schneide"
    set password: "************"
    set project: "my_ci_project"
    set ciType: "hudson"
    set target: "deploy_target.com"
    set ci_server: "hudson-schneide"
    set files: ["webapp.war"]

    task("deploy") {
        grab from: "ci"
        scp to: "target"
        ssh "stop.sh"
        ssh "update.sh"
        ssh "start.sh"
    }
}

script.tasks.deploy.execute()

This is far from being finished but a starting point and I think about open sourcing it. What do you think: may it help you? What are your experiences with deploying grails apps?

Batteries not included

Your feature isn’t ready-to-use until you provide the necessary requirements alongside, too.

When I bought a label printing device lately, it came bundled with a label tape roll. That suggested instant usage – no need to think of additional parts upfront. Only tousb-battery-little find out that, you’ve guessed it already, batteries weren’t included. A bunch of standardized parts missing (Murphy’s law applied) and the whole ready-to-use package was rendered useless. The time and effort it took me to get the batteries was the same as to get the label tape I really wanted to use instead of the bundled one.

This is a common pattern not only with device manufacturers, but with software developers, too.

Instant feature – just add effort

Frequently, a software comes “nearly” ready-to-use. All you have to do to make it run is

  • upgrade to the latest graphics drivers
  • install some database system (we won’t tell you how as it’s not our business)
  • create some file or directory manually
  • login with administrator rights once (or worse: always) to gather write access to the registry or configuration file
  • review and change the complete configuration prior to first usage

The last point is a personal pet peeve of mine.

It all boils down to the question if a software or a feature is really ready-to-use. Most of the work you have to do manually is tedious or highly error-prone. Why not add support for this apparently crucial steps to the software in the first place?

It works instantly – with my setup

A common mistake made by developers is to forget about the history of a feature emerging in the development labs. The history includes all the little requirements (a writeable folder here, an existing database table there) that will naturally be present on the developer’s machine when she finishes work, because fulfilling them was part of the development process.

If the same developer was forced to recreate the feature on a fresh machine, she would notice all these steps with ease and probably automate or support (e.g. documentate) them, least to save herself the work of wading through it a third time.

But given that most developers regard a feature “finished”, “done” or “resolved” when the code was accepted by the repository (and hopefully the continuous integration system), the aching of the users wont reach them.

This is a case of lacking feedback.

Feel the pain – publicly

To close this open feedback loop, we established a habit of “adopting” features and bringing them to the user in person to overcome the problem of “nearly done”. If you can’t make your own feature run on the client’s machine within a few seconds, is it really that usable and “ready”? The unavoidable presence of the whole process – from the first feature request to the installed and proven-to-work software acts as a deterrent to fall for the “works on my machine” style of programming. It creates a strong relationship between the user, a feature and the developer as a side-effect.

We’ve seen quite a few junior developers experiencing a light bulb moment (and heavy sweating) in front of the customer. This is the hot-wired feedback loop working. In most cases, the situation (a feature requiring non-trivial effort to be run) will not repeat ever.

Batteries are part of the product

If your product (e.g. software) isn’t usable because some standard part (e.g. a folder) is missing, make sure you add these parts to the delivery package. It is a very pleasant experience for the user to just unwrap a software and use it right away. It shows that you’ve been cared for.

Deploying a Grails app on an Oracle DB

Running our new grails app on HSQL and a Postgresql everything went fine. But the production DB was decided to be an Oracle. And suddenly the app crashed several times. Here’s a list of what problems we encountered:

  • ORA-00972: identifier is too long
  • want to store a null value in a non null column

Oracle identifiers are limited to 30 characters. So we thought using a mapping for the table should do the trick. But grails uses the table names to construct the n:m relations and their id column names between the domain classes. Looking at the grails docs we found a joinTable mapping:

static mapping = {
    table 'PROP'
    tablePerHierarchy false
    instrumentInfos joinTable: [name:'PROP_INS', key:'id', column:'instrumentInfos_id']
}

This worked most of the time but in some cases grails just didn’t want to take our definitions. The problem was a bug in grails. The workaround we took was to shorten the domain classes names.
The second problem arose as we tried to store empty strings into the database. Oracle stores empty strings as null values which causes a constraint violation exception. The solution was to declare the string columns nullable or not nullable and not blank but you cannot use a not nullable and blank string with an Oracle DB.