Grails upgrade – lessons learned

Grails is a great framework for rapidly developing web applications on top of the proven Java/Servlet platform. Especially smaller, short-lived projects can be a real breeze with all the scaffolding, GORM and convention-over-configuration built into Grails.

We have been using it for a quite complex web application project for almost 3 years now. Half a year ago we upgraded from Grails 1.0.x to 1.3.4. That makes three major versions in one upgrade step, which obviously produced a lot of work and many small bugs. I do not want to put the blame on the Grails guys here, because most of the changes were mentioned in the release notes, and it was a big step we consciously took once we decided to continue the project for several years to come.

Our upgrade policy changed due to that experience and we try to stay a lot more current so we can adapt our software to framework changes more incrementally. Some weeks ago we upgraded from 1.3.4 to 1.3.7, and this experience was not pleasant either. Even though we skimmed through the changelogs and release notes and thought the update should be uncritical for us, Grails' behaviour changed in two respects that broke things for us:

  1. An API change: GroovyPagesException was replaced by GrailsTagException
  2. A behavioural change: the application context and dependency injection are no longer available in functional tests

Item 1 was easy to fix, but you need really good testing to spot it before it slips into production. Such subtle API changes should not happen in micro-version updates, because they can easily break parts of the system without you noticing, thanks to Groovy/Grails' dynamic nature. No compiler saves you here.
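As an illustration, here is a minimal sketch (not our actual code) of the kind of check that silently stops working after such a rename:

// Code that matched on the old exception type no longer matches after the rename.
// Before the upgrade this imported and checked GroovyPagesException instead.
import org.codehaus.groovy.grails.web.taglib.exceptions.GrailsTagException

boolean isTemplateError(Throwable t) {
    while (t != null) {
        if (t instanceof GrailsTagException) return true  // was: GroovyPagesException
        t = t.cause
    }
    return false
}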

Item 2 produced quite some work for us because we had built an extensive acceptance test suite that uses services and domain objects to set up the initial environment for each test. Luckily, there is the Grails remote-control plugin which you can use for exactly such things.
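A rough sketch of how the plugin fills the gap (Account is a made-up domain class): the closure is serialized, sent to the running application and executed inside its application context, so services and GORM are available again:

// In a functional test: set up fixtures inside the running app via remote-control.
import grails.plugin.remotecontrol.RemoteControl

def remote = new RemoteControl()
def accountId = remote {
    def account = new Account(name: 'tester').save(flush: true)
    account.id  // the last expression is returned to the test JVM
}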

Lessons learned

  • You should have extensive automated test suites when developing a Grails application over a longer period because things can break in unexpected ways without code changes on your side.
  • Try to plan upgrades some time ahead of releases and dedicate time to scanning the release notes and actually performing the upgrade. It may take you significantly longer than the smooth upgrade procedure itself suggests.

The Grails team seems to be increasingly aware of backwards compatibility, but they still have some way to go. We hope and expect to see fewer unexpected breakages in the future.

Acceptance testing a Grails app with selenium-rc

Having extensive acceptance tests is the basis of delivering high-quality releases with very few regressions in long-running projects. This is even more true when you work with dynamically typed languages and changing requirements. One of our Grails projects has been running for several years now and continues to evolve and grow. We are in dire need of more acceptance tests and especially their automated execution. Manual testing is not feasible and our coverage through unit and integration tests is not enough. We do have a nice set of Selenium IDE acceptance tests already, but they were executed very infrequently, which let some bugs slip into production.

I want to describe our approach to automated and extensive acceptance testing below:

  1. We create the acceptance tests using capture & replay with Selenium IDE. This is a fast way to exercise a new feature through a repeatable test.
  2. We think that maintaining the tests in code offers much more flexibility and is easier to run in continuous integration (CI) than maintaining the Selenium IDE HTML files. So we export the captured test to play nicely with the Grails selenium-rc plugin. Kurt Harriger explains the setup and usage of the selenium-rc plugin. You need to make some changes to the exported code for everything to work nicely (a sketch of the converted test follows this list):
    1. Delete or change the package declaration.
    2. Choose a file name compatible with Grails functional tests, like MyAcceptanceTests.groovy. We use the JUnit 4 export, but the Groovy export works as well because the difference is only marginal and both need adjustments in the same places.
    3. Change the class name to match the file name (without extension) if they are not equal already.
    4. Change the exported test to extend GroovyTestCase instead of SeleneseTestCase.
    5. Add the @Mixin(SeleniumAware) annotation to the test class.
    6. Remove the setUp() and tearDown() methods.
    7. Replace verifyTrue() and friends with JUnit assertions.
  3. Each test has to set up its initial state. This leads to independent acceptance tests at the expense of longer running times, but is well worth the cost imho.
  4. The resulting selenium-rc tests can be run easily using grails test-app -functional, so integration into the build process is pretty straightforward. We currently use ant to wrap the grails calls, but other ways may be more feasible depending on your infrastructure.
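Put together, a converted test might look roughly like this (the page, locators and assertions are invented):

// MyAcceptanceTests.groovy - the result of the steps above, sketched
import grails.plugins.selenium.SeleniumAware

@Mixin(SeleniumAware)
class MyAcceptanceTests extends GroovyTestCase {

    void testLoginShowsDashboard() {
        selenium.open('/login')
        selenium.type('username', 'admin')
        selenium.type('password', 'secret')
        selenium.click('loginButton')
        selenium.waitForPageToLoad('10000')
        assertTrue selenium.isTextPresent('Dashboard')  // was: verifyTrue(...)
    }
}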

The end result is fast creation of acceptance tests and much flexibility in setting up the test fixtures and maintaining the tests. Using the Grails plugin you gain easy execution of the tests on developer machines and CI servers as well. With extensive automated acceptance tests the danger of regressions is greatly reduced. But be sure not to neglect unit and integration tests!

Is Groovy++ already doomed?

<disclaimer>I really like Groovy and other cool languages targeting the JVM like Scala, Fantom, Gosu or Clojure.</disclaimer>

I know the title is a bit provocative, but I want to express a concern regarding Groovy++. In my perception most people think of Groovy++ as an extension of Groovy which trades dynamic dispatch for static typing and dispatch, yielding better performance. So you can just look for hot spots in your code and resolve them with some annotations. Sounds nice, doesn't it?

That seems to be the promise of Groovy++, but it isn't. Alex Tkachman, the founder of the Groovy++ project, states this clearly in a comment on a Groovy++ issue: “100% compatibility with regular Groovy is nice when possible and we do our best to keep it but it is not a must.”

Imho the mentioned issue together with this statement reduces the target audience to the few people who think of Groovy++ as a better Java, not as a faster and type-safe Groovy where needed. I do not think there are many people thinking that way, so wide adoption of such a Groovy++ will not happen given the alternatives mentioned in the disclaimer above and Groovy itself. I hope they will strive for 100% compatibility with Groovy…

Diving into Hibernate’s Query Cache behaviour

Hibernate is a very sophisticated OR mapper and as such carries some overhead for certain usage patterns compared to raw queries. Through proper usage of caches (Hibernate features a first-level cache, a second-level cache and a query cache) you can get both performance and convenience if everything fits together. Trying to get more out of our persistence layer, we performed some tests with the query cache to decide whether it is worth using for us. We were puzzled by the behaviour in our test case: despite everything being configured properly we never got any hits in the query cache using the following query sequence (sketched in code after the list):

  1. Transaction start
  2. Execute query
  3. Update a table touched by the query
  4. Execute query
  5. Execute query
  6. Transaction end
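In code, the sequence looks roughly like this (Person is a stand-in for our real entities, session is an open Hibernate Session):

// Sketch of the test sequence with a cacheable query inside one transaction
def tx = session.beginTransaction()
def query = { session.createQuery('from Person').setCacheable(true).list() }

query()                                  // step 2: cache miss, result gets cached
session.createQuery('update Person set name = :n')
       .setParameter('n', 'x')
       .executeUpdate()                  // step 3: pre-invalidates the Person space
query()                                  // step 4: miss again, result re-cached
query()                                  // step 5: expected hit - but stays a miss
tx.commit()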

We would expect step 5 to be a cache hit, but in our case it was not. So we dove into the source of the Hibernate release in use (3.3.1, bundled with Grails 1.3.5) and browsed the Hibernate issue tracker. We found our problem correlated with the issues HHH-3339 and HHH-5210. Since the fix was simpler than upgrading Grails to a new Hibernate release, we applied it ourselves and replaced the jar in our environment. So far, so good, but in our test step 5 still refused to produce a cache hit. Strangely enough, stepping through with the debugger did yield a cache hit when we analyzed the state of the cache. After some more brooding and some println()s and sleep()s we found the reason for the observed behaviour in the UpdateTimestampsCache (yes, yet another cache!):

	public synchronized void preinvalidate(Serializable[] spaces) throws CacheException {
		//TODO: to handle concurrent writes correctly, this should return a Lock to the client
		Long ts = new Long( region.nextTimestamp() + region.getTimeout() );
		for ( int i=0; i<spaces.length; i++ ) {
			if ( log.isDebugEnabled() ) {
				log.debug( "Pre-invalidating space [" + spaces[i] + "]" );
			}
			//put() has nowait semantics, is this really appropriate?
			//note that it needs to be async replication, never local or sync
			region.put( spaces[i], ts );
		}
		//TODO: return new Lock(ts);
	}

The innocent-looking statement region.nextTimestamp() + region.getTimeout() essentially means that the query cache for a certain “region” (e.g. a table in simple cases) is “invalid” (read: disabled) for some “timeout” period or until the end of the transaction. This period defaults to 60 seconds (yes, one minute!) and renders the query cache useless within a transaction. For many use cases this may not be a problem, but our write-heavy application really suffers because it works on very few different tables, so query caching has no effect. We are still discussing ways to leverage Hibernate's caches to improve the performance of our app.

Statement against public fields in Java

Every once in a while I talk to people about coding style, and sooner or later a discussion about public fields versus getters/setters in Java comes up. I would like to present my opinion on this issue to a broader audience, in addition to other, quite well-balanced articles.

First I want to differentiate the properties of a class from its other fields/member variables. Properties are fields whose values are useful and important to the clients of the class. We consciously decide to break encapsulation here and provide this data to our clients. The size of a collection may serve as a nice example. Fields, on the other hand, store state or dependencies our class needs to be fully operational. Data structures like arrays, data access objects (DAOs) or some kind of notification service may serve as examples here.

The internal implementation of both properties and fields should never be exposed, because this truly breaks encapsulation and takes away the freedom of the class implementor to change the implementation. At a later time you may decide to compute a value or read it from a database instead of storing it directly. The properties themselves, on the other hand, may well be public and belong to the API of our class.

Now on to Java. There is no native property support in the Java language, as it does not support the uniform access principle through language constructs. In other languages like Python, Ruby, Groovy or Scala you can change from direct field access to accessor methods without changing the clients, so it is no problem to expose fields (or more precisely properties) and thus make them public or protected. To gain the same degree of freedom in Java you have to emulate properties by using the getter/setter convention of Java Beans. You trade the conciseness of public fields for this freedom, and you really should. An IDE can generate the accessors and fold the methods away from your sight; the cost of getters/setters is really negligible.
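A minimal sketch of the convention (class and names invented): the client-visible property size is exposed through an accessor, while the backing data structure stays a private field:

// EventLog exposes the property "size" via a getter; the backing list stays hidden.
class EventLog {
    private final List<String> events = new ArrayList<String>()  // internal field

    void record(String event) {
        events.add(event)
    }

    int getSize() {
        return events.size()  // free to switch to a counter or a DB query later
    }
}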

Now we can derive the conclusions for Java programmers. With each member variable you introduce, you have to decide whether it is a property or just some internal field. For properties you may provide getters and/or setters with appropriate visibility when needed; that means you should not provide accessor methods for all of your fields. In general you should never expose fields directly, and all instance variables should be private. Not doing so removes the freedom to change class internals without affecting the clients. Once a class with exposed internals is published as part of an API, it is almost impossible to change these internal design decisions.

Scala: Easier to read (and write), harder to understand?

There has been a vivid discussion about Scala's complexity going on for some weeks now on the web, even with a response from Martin Odersky. I want to throw in my 2¢ together with some hopefully new aspects.
Scala's liberal syntax rules and compiler magic like type inference and implicit conversions allow nicely written APIs and DSLs that almost read like prose. Take a look at APIs like scalatest and imagine the Java/JUnit equivalent:

@Test def demonstrateScalaTest() {
  val sb = new StringBuilder("Scala")
  sb.append(" is cool!")
  sb.toString should be ("Scala is cool!")
  evaluating { "concise".charAt(-1) } should produce [StringIndexOutOfBoundsException]
}

There are really nice features that reduce day-to-day programming tasks to keywords or one-liners. Here are some examples:

// singletons have their own keyword (object), static does not exist!
object MySingleton {
  def printMessage {
    println("I am the only one")
  }
}

// lazy initialization/evaluation
lazy val complexResult = computeForHours()

// bean-style data container with two scala properties and one java-bean property with getter+setter
class Data(val readOnly: String, var readWrite: Int, @BeanProperty var javaProperty: String)

// tuples as return values or quick data transfer objects (DTO) for methods yielding multiple data objects
def correctCoords(x: Double, y: Double) = (x + 12, y * 0.5)
val (correctedX, correctedY) = correctCoords(0.37, 34.2)
println("corrected: " + correctedX + ", " + correctedY)

On the other hand there are so many features built in that it can be really hard to understand code if you are not a Scala programmer with some experience. I like the differentiation between application and library code that Martin Odersky himself makes in Programming in Scala. The frameworks I have tried so far in Scala (Lift, scalatest and scala-swing) make your life very easy as long as you just use them. It is really a breeze and much more fun than using most APIs in Java, for example. But when something goes wrong or you really want or have to understand what is going on, you can have a hard time. This is true at least for a Scala beginner, and sometimes perhaps for a pro, too.

Final Thoughts
In my opinion Scala is a very nice language that successfully combines clean object-oriented programming with functional features. You can migrate from a pure OO style to a nice hybrid “Scala style”, much like many programmers did when they first used Java in a mostly procedural style, with classes serving only as namespaces for their static methods. I am quite sure that a Scala code style and best practices still have to evolve. Programmers will need their time to dive into the language and use it to their benefit. I hope Scala prospers and gains traction in the industry because I personally think it is a nice step forward compared to Java (which turns more and more into a mess where you need profound knowledge to fight your problems).

Regarding the complexity, which certainly exists in Scala, I only want to raise some questions which may be answered sometime in the future:

  • Maybe the tooling is just not there (yet)?
  • Maybe you sometimes just don’t have to understand everything that’s happening underneath?
  • Maybe Scala makes debugging much rarer, but harder when something does not work out?
  • Maybe the features and power of Scala are worth the learning effort?
  • Maybe certain features will simply be banned by teams, as sometimes happens in Java teams (think of switch-case, the ?: operator or autoboxing)?

Grails: Migrating enum mapping from 1.0 to 1.2 or newer

We have an ongoing long-term web application project running on Grails. It started with Grails 1.0.3, and Grails has moved on to 1.3.3 in the meantime. Due to time constraints and lack of resources we were not able to update to each new major version. Now, some years later, the time has finally come for us to benefit from all the new features, bugfixes and improvements of the platform. There were quite some changes in behaviour, and one of the biggest is the change in how enums are mapped to the database.

In Grails 1.0.x and 1.1.x, Java enums were mapped as int values in the database. Starting with Grails 1.2, they are mapped as varchars containing the enum name. So you have the problem of migrating your existing data over to the new mapping style of the framework. One solution is to use autobase or liquibase migrations to port the enum values to the new mapping style.
Suppose we have the enum Coolness:

public enum Coolness {
    COOL, UNCOOL;
}
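In a domain class like the following (a stand-in for our real classes), the column bar then holds 0 or 1 under Grails 1.0.x/1.1.x, but 'COOL' or 'UNCOOL' from Grails 1.2 on:

class Foo {
    Coolness bar
}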

The SQL for migrating it on the PostgreSQL DBMS is as simple as this:

alter table foo alter column bar type varchar
    using case when bar = 0 then 'COOL'
            when bar = 1 then 'UNCOOL'
            end

and as an autobase migration it becomes something like this:

changeSet(id: 'change foo bar from int to varchar', author: 'me') {
    preConditions(onFail: "CONTINUE") {
        dbms(type: "postgresql")
    }
    sql("""alter table foo alter column bar type varchar
            using case when bar = 0 then 'COOL'
                    when bar = 1 then 'UNCOOL'
                    end""")
}

We use the precondition to skip the non-persistent in-memory databases we use in development and to apply the change set only to persistent test or production databases.

For Oracle and possibly other database systems which do not support altering the column type with a using clause, it may look like this:

alter table foo add bar_new varchar(255);
update foo set bar_new =
    (case when bar = 0 then 'COOL'
          when bar = 1 then 'UNCOOL'
     end);
alter table foo drop column bar;
alter table foo rename column bar_new to bar;

I hope this helps when you have to perform similar migrations some time.

Stay tuned for other changes in API and behaviour between the different Grails framework versions.

An Oracle story: Null, empty or what?

One big argument for relational databases is SQL, which as a standard minimizes the effort needed to switch your app between different DBMSes. This comes in particularly handy when using in-memory databases (like HSQL or H2) for development and a “big” one (like PostgreSQL, MySQL, DB2, MS SQL Server or Oracle) in production. The pity is that there are subtle differences in the interpretation of the SQL standard between databases from different vendors.

Oracle is particularly picky and exhibits quite some interesting behaviours: most databases (all that I know well) treat null and the empty string as different values. So it is perfectly valid to store an empty string in a not-null column, and retrieving the string from the column yields an empty string. Not so with Oracle 10g! Inserting null and retrieving the value unsurprisingly yields null, even with Oracle. But inserting an empty string and retrieving the value leaves you with null, too! Oracle does not differentiate between empty strings and null values the way a Java developer would expect. In our environment this has led to surprised developers and, a couple of times, to locally unreproducible bugs which clearly existed in production.
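A small sketch of the surprise using groovy.sql.Sql (the connection details and the table are placeholders):

import groovy.sql.Sql

def sql = Sql.newInstance('jdbc:oracle:thin:@dbhost:1521:XE', 'user', 'secret',
        'oracle.jdbc.OracleDriver')
sql.execute("insert into messages (text) values ('')")  // insert an empty string
def row = sql.firstRow('select text from messages')
assert row.text == null  // Oracle 10g hands back null, not ''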

[rant]Oracle has great features for big installations and enterprises that can afford the support, maintenance and hardware of a serious Oracle DBMS installation. But IMHO it is a shame that such a big player in the market does not really care about the shortcomings of their flagship product and standards in general (Oracle 10g only supports SQL92 entry level!). Oracle, please fix such issues and help us developers to get rid of special casing for your database product![/rant]

The lesson to be learnt here is that you need a clone of the production database for your integration or acceptance tests to be really effective. Quite some bugs have slipped into production because of subtle differences in behaviour between our in-house databases and the ones in production at the customer's site.

Looking left and right will improve you as a developer

After my initial encounters with computers and programming I pretty much settled on Java as my preferred language and especially platform. Occasional adventures into C/C++ or other languages and tools do happen, but not on a day-to-day basis. We are mostly a Java shop, so this seems natural. However, I strongly suggest looking left and right and trying new stuff, be it a programming language, an operating system, an IDE or a programming framework. Similarly to travelling around in the real world™, it will widen your view of your daily work and give you new ideas on how to improve it. You will adapt good stuff from elsewhere and, on the other hand, appreciate the nice aspects of your current environment more.

Let me give some examples to support my point and motivate adventures outside your home turf. I have been playing around with Scala, and therefore functional programming, in the past months. One major benefit of these experiments was a new appreciation for immutable types and side-effect-free code. You can carry them over to many programming environments, including Java, often making your code easier to test and less error-prone. Object-oriented programming (OOP) relies heavily on objects with state and side effects, but there are many places where immutability reduces the effort of tracking objects and their state. A nice example in Java is the Joda Time library, which provides immutable DateTime classes in contrast to java.util.Date et al. Null handling in Scala using the Option type is so interesting that some people try to carry it over to Java as an alternative to null objects or null checks. The rich collection classes and implicit conversions in Scala may encourage you to write your own utility classes for Java collections, nice wrappers, or to use alternative collection frameworks to make your life easier. In general, I find wrapping a nice, standard OO technique that is often underused outside of frameworks.

You do not always have to stray that far. Some frameworks like Fest and EasyMock show nice usages of fluent interfaces. Why not use this technique in your own code? I found it especially useful for implementing builders for complex or highly configurable objects. Fluent interfaces can make your code look a lot like natural language, resulting in expressive and highly readable code.
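A tiny sketch of what such a builder can look like (the example is invented):

// Fluent builder for a configurable object; each method returns the builder itself.
class MailBuilder {
    private String recipient, subject = '(no subject)', body = ''

    MailBuilder to(String recipient) { this.recipient = recipient; this }
    MailBuilder subject(String subject) { this.subject = subject; this }
    MailBuilder body(String body) { this.body = body; this }
    String build() { "To: ${recipient}\nSubject: ${subject}\n\n${body}" }
}

def mail = new MailBuilder().to('dev@example.com').subject('build broken').build()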

Using Mac OS X with Spotlight™ and TimeMachine™ may make you look for similar features on your Linux box (e.g. gDeskbar and BackInTime) or on Windows (depending on the version, available through third-party software or built in). Using the multiple desktops of Linux (or another OS with an X11 window system) may motivate you to try them in your other working environments (they suck on OS X, though).

Trying different IDEs may increase your effectiveness depending on the language and frameworks used. Grails support may be better in IntelliJ whereas you may like Eclipse more for plain J2SE projects. Sometimes you will find some cool feature in one IDE you were missing before. I encourage you to go back to your environment and look for the same or similar features. Often you will find them in many advanced IDEs.

Without seeing what is possible you may never miss it. But beware of thinking that everything new is automatically better; the grass is always greener on the other side. Try to reflect on the things you learn and encounter. Take the good parts to improve existing stuff whenever sensible, but also do not fear to move on when it is time.

For me and my colleagues, this wandering around in the rich software development world has proven very valuable for continually improving our style and increasing our productivity. These adventures in foreign waters, coupled with reading books, dev brunches and attending talks on- and offline, keep our skills fresh and improving. Sometimes they even lead to our own ideas, APIs and tools.

Step-by-step tutorials and manuals are priceless

Writing documentation in and outside of code annoys most developers. I am no different, but after several years in the software development business I have come to value one type of documentation a lot: step-by-step tutorials and manuals. Surprisingly, they are often missing. I want to describe some common use cases where such documents have proven valuable many times:

when learning new tools and frameworks

Most people I know learn best by example. A tutorial with a step-by-step explanation of how to set up the tool, followed by some good examples, brings you up to speed quickly. The Lift framework has such a thing and gets you a running webapp in minutes. But when you dive deeper you may find that much documentation and many how-tos are missing. Books may help, but most of the time it is faster and easier to look for examples on the web and in project wikis.

Also beware that bad examples may cause a lot of frustration (typos are poison for newbies) and teach bad style or obsolete techniques. Keep the tutorials up to date (I know, that’s one problem of documentation…) and as accurate as possible, and many people will have a better time working with your tools and frameworks.

The same is true for the developers in your team: many will look for examples in the existing code base and learn from them, be they good or bad. So keep the code clean and full of good examples, and let your seniors spread them in the teams. Do not forget to provide good examples and best practices in your wiki or other documentation system.

when performing manual tasks

We often have to manage and deploy production systems for our customers. Even though many tasks are automated by scripts and other tools, you sometimes need to perform manual tasks. Especially in a stressful situation at the customer's site, or when otherwise dealing with production systems, it is extremely helpful to have a simple and precise guide for the task at hand. There are enough unknowns and problems that may need brain power, so the basic tasks and procedures should be well documented. That way you have one less thing to worry about and are ready to face potential upcoming problems. A good guide gives you much-needed confidence.

when trying to work on an open source project

We regularly (see our OSLDs) dive into some open source project to improve it. Nothing is more frustrating than spending more than half a day setting up the development environment. Excessive dependencies, works-on-my-machine-style build scripts and missing documentation and tutorials prevent a productive experience. That hurts the projects themselves by driving away potential contributors. We had several such experiences, but also many positive ones like Hudson plugin development or EGit, where you can get up to work in minutes and perform your first monkey see – monkey do experiments.

Conclusion

Much documentation generated or written nowadays is not worth 2 cents. API docs that merely list the classes, methods and fields without additional descriptions are of no use and just waste the time of people trying to find information there. High-level blabla and theoretical dissertations are all nice but do not help much in getting the job done (at least not in the beginning). But small guides and tutorials written to the point really do make a difference, regardless of whether they are written for in-house work and development or for openly available projects. Choose the kind of documentation you write wisely.