Unexpected RESTEasy application upgrade surprise

The setting

A few months ago we got to maintain a RESTEasy application running in a Wildfly 10 container. The application uses RESTEasy as both, server and client and contains a few custom interceptors and providers.

Now our client wants to move on to Wildfly 13 as deployment target. Most of the application works out-of-the-box or just by upgrading some dependencies in the new container but some critical parts like the REST client requests stopped working.

The investigation

After some digging through the error messages it became clear our interceptors and providers were not called anymore. What has changed? Wildfly 13 comes with RESTEasy 3.5.1 while we were using 3.0 in Wildfly 10. Looking at the upgrade documentation leaves us puzzled though:

RESTEasy 3.5 series is a spin-off of the old RESTEasy 3.0 series, featuring JAX-RS 2.1 implementation.

The reason why 3.5 comes from 3.0 instead of the 3.1 / 4.0 development streams is basically providing users with a selection of RESTEasy 4 critical / strategic new features, while ensuring full backward compatiblity. As a consequence, no major issues are expected when upgrading RESTEasy from 3.0.x to 3.5.x.

We are using the standard classpath scanning method which discovers annotated RESTEasy classes and registers them for the application. Trying to register them explicitly in the application yielded the message, that our providers are already registered:

RESTEASY002155: Provider class mypackage.MyProvider is already registered. 2nd registration is being ignored.

Scanning and registration seemed to just work alright. So what was happening here?

The resolution

After a bit more investigation we realized the issue was on the client side only! In Wildfly 10/RESTEasy 3.0 the providers were automatically registered for the client, too. This is not the case anymore in Wildfly 13/RESTEasy 3.5! You have to register them with the client either using the ResteasyClientBuilder or the ResteasyClient you are using like mentioned in the documentation:

Client client = new ResteasyClientBuilder() // Activate gzip compression on client:
                    .register(AcceptEncodingGZIPFilter.class)
                    .register(GZIPDecodingInterceptor.class)
                    .register(GZIPEncodingInterceptor.class)
                    .build();

This subtle change in (undocumented?) behaviour took several hours to debug. Nevertheless, we actually like the change because we prefer doing things explicitly instead of using some magic. So now it is clear what interceptors and providers our REST client is using.

Bringing your Grails app from 2.4 to 3.3

Updating to a new framework version often needs a lot of work and investigation how to fix problems that may arise. Usually there are upgrade guides that take you most of the way and make upgrading only a grind.

This also true for Grails and our upgrade experience with it. Often there are parts where you have to invest extra work and creativity. The current upgrade of our application from 2.4.5 to 3.3.8 is no exception:

The grind

The major changes and upgrade notes are part of the documentation so I will only mention them briefly:

  • Switch to the gradle build system
  • Using YAML as main configuration
  • Migration from filters to interceptors
  • New testing framework (partly optional because you can still use the old mixin framework with a plugin)
  • Package name changes
  • Former core features are now available as plugins like gsps, datasource and GORM
  • Functional tests need to use Spock+Geb or you will face weird problems and need to do extra work (we had selenium tests using selenium-server before)
  • Integration tests work differently so work needs to be done to migrate them
  • Logging using Lockback
  • Entities often need a @Entity annotation
  • Move some files to new Locations

The tricky stuff

  • A service named CounterService conflicts with spring boot autowiring so we had to rename it
  • Our TagLib tests using JUnit4 were failing with obscure errors, porting them to Spock fixed them.
  • We have so many dependencies that running the application with gradle:bootRun fails with: Createprocess error=206; the filename or extension is too long Fortunately adding grails { pathingJar = true } to build.gradle fixes the issue
  • Environment variables for gradle:bootRun are swallowed if not prefixed with “grails.”. We are using environment variables to customize running the application on the dev machines.

 

The hard parts

The most painful part was two central plugins we are using not being available anymore: shiro and searchable.

Shiro

For shiro there are some initial ports that work well for our needs, so the challenge was mostly finding the most fitting one of the forks on github. We went with the fork of Alin Pandichi and forked it ourselves to upgrade some version definitions.

Searchable becomes ElasticSearch

The real odyssee began looking for a replacement of the abandoned searchable plugin. Fortunately there is the compelling ElasticSearch-plugin which uses almost the same API as the searchable plugin:

The plugin focus on exposing Grails domain classes for the moment. It highly takes the existing Searchable Plugin as reference for its syntax and behaviour.

Unfortunately, we were unable to get it to work with our project trying many different versions, so we decided to fork and fix it for us. The main problems were:

  • Essentially, it does not work properly with hibernate as a data store because it chokes on the JavaAssist proxies hibernate often creates for domain objects.
  • An easy to fix concurrency issue
  • Not flexible enough converters

After a lot of debugging and a couple of fixes and the new feature of being able to use a spring bean as a converter we had search working smoothly and better than ever.

Wrapping it all up

The upgrade of our application to the newest incarnation of Grails was a rocky ride and took us quite some time.

On the other hand the framework got a lot better. Especially gradle is much better to manage than the previous build system.

So we are looking forward to a much better and robust development experience in the future and hope for some less revolutionary releases and easier upgrades.

The sorry state of Grails (Plugins)

We have been developing and maintaining a complex web application on Grails since summer of 2008. By then Grails had passed the 1.0 release milestone and was really hot. A good 10 years later the application is still in use and we are trying to upgrade from Grails 2.4 to 3.3.

Upgrading Grails – a rough ride

Similar to past upgrade experiences the ride is not very smooth. Besides the major changes like the much welcomed switch to the gradle build system, interceptors instead of filters and streamlined configuration there are again a host of more subtle changes. The biggest problem for us though is the plugin situation.

It’s the plugins

In the past we had tough breaks like the abandoned selenium plugin in favor of the much better geb for functional testing. That had cost us a lot of work and many lost and not yet rewritten functional tests.

This time it seems especially hard because you two of our central plugins are not readily available anymore:

  1. Apache Shiro Plugin
  2. Compass-based Searchable Plugin

1. Shiro authentication

There still is no official release of the shiro plugin for Grails 3.x. After some searching and researching the initial port on github we decided to fork and maintain the most current forked version ourselves and try to work with it. Fortunately it was relatively easy to integrate and to update some dependencies. Our authentication and authorization works at least as good as before and we do not face additional problems. Working with interceptors feels quite good, too.

2. Search

The situation is harder with search. Compass and the searchable plugin are dead – plain and simple. The replacement for grails is the elasticsearch plugin which mostly adopted the API of the searchable plugin. Getting it to work is not that easy though. You have different versions depending on the grails 3 version you are targetting. Each plugin version targets a specific elasticsearch server version and so on. Often times (like in the default configuration) you will need a matching mapper-attachment plugin that is not available on maven in newer versions. This is mentioned somewhere in the midst of the plugin documentation.

Furthermore the plugin itself has some problems with hibernate proxies and concurrency so here we have to mess around with the plugin code once more. Once we have everything working for us like before we will try to get our patches upstream.

Marching forward

The upgrade from 2.x to 3.x is the biggest (and best) step of Grails into the right direction. On the downside it places a lot of burden on the application and plugin developers. That again increases the cost of maintaining proven applications further.

Right now we are close to a Grails 3.3 version of our application but have invested considerable effort into this upgrade.

Our current recommendation and practice is to not start new web applications based on the grails framework because there have been too many breaking changes and the maintainance cost is high. But we are keeping a close look at grails because the increased modularization and and new options like the grails-react-profile may keep grails interesting in the future.

Using PostgreSQL for time-series data

The number of sensors and other things that periodically collect data is ever growing. This advent of the  internet of things (IoT) demands a way of storing and analyzing all this so-called time-series data. There are many options for such data – the most prominent being special time-series databases like InfluxDB or well suited, nicely scaling databases like Apache Cassandra.

The problem is you have to tailor your solution to one of these technologies whereas there is SQL with mature database management systems (DBMS) and drivers/bindings for almost any programming language.

Why not use a plain SQL database?

Relational SQL databases are a mature and well understood piece of technology albeit not as sexy as all those new NoSQL databases. Using them for time series data may not be a problem for smaller datasets but sooner or later your ingestion and query performance will degrade massivly. So in general it is not a good option to store all your time-series data in a traditional relational DBMS (RDBMS).

Why use PostgreSQL with TimescaleDB?

With the PostgreSQL extension TimescaleDB you get the best of both worlds: a well known query language, robust tools and scalability.

You access and manage your time-series database just like your ordinary PostgreSQL database. Almost everything including replication and backups will continue to work like before.

You do not have to deal with limitations of specialized solutions or learn a completely new ecosystem just for one aspect of your solution.

The future

We are successfully using TimescaleDB in one of our projects and will continue to share tipps and experience with this technology taking its rising importance into account.

 

Using Ansible vault for sensitive data

We like using ansible for our automation because it has minimum requirements for the target machines and all around infrastructure. You need nothing more than ssh and python with some libraries. In contrast to alternatives like puppet and chef you do not need special server and client programs running all the time and communicating with each other.

The problem

When setting up remote machines and deploying software systems for your customers you will often have to use sensitive data like private keys, passwords and maybe machine or account names. On the one hand you want to put your automation scripts and their data under version control and use them from your continuous integration infrastructure. On the other hand you do not want to spread the secrets of your customers all around your infrastructure and definately never ever in your source code repository.

The solution

Ansible supports encrypting sensitive data and using them in playbooks with the concept of vaults and the accompanying commands. Setting it up requires some work but then usage is straight forward and works seamlessly.

The high-level conversion process is the following:

  1. create a directory for the data to substitute on a host or group basis
  2. extract all sensitive variables into vars.yml
  3. copy vars.yml to vault.yml
  4. prefix variables in vault.yml with vault_
  5. use vault variables in vars.yml

Then you can encrypt vault.yml using the ansible-vault command providing a password.

All you have to do subsequently is to provide the vault password along with your usual playbook commands. Decryption for playbook execution is done transparently on-the-fly for you, so you do not need to care about decryption and encryption of your vault unless you need to update the data in there.

The step-by-step guide

Suppose we want work on a target machine run by your customer but providing you access via ssh. You do not want to store your ssh user name and password in your repository but want to be able to run the automation scripts unattended, e.g. from a jenkins job. Let us call the target machine ceres.

So first you setup the directory structure by creating a directory for the target machine called $ansible_script_root$/host_vars/ceres.

To log into the machine we need two sensitive variables: ansible_user and ansible_ssh_pass. We put them into a file called $ansible_script_root$/host_vars/ceres/vars.yml:

ansible_user: our_customer_ssh_account
ansible_ssh_pass: our_target_machine_pwd

Then we copy vars.yml to vault.yml and prefix the variables with vault_ resulting in $ansible_script_root$/host_vars/ceres/vault.yml with content of:

vault_ansible_user: our_customer_ssh_account
vault_ansible_ssh_pass: our_target_machine_pwd

Now we use these new variables in our vars.xml like this:

ansible_user: "{{ vault_ansible_user }}"
ansible_ssh_pass: "{{ vault_ansible_ssh_pass }}"

Now it is time to encrypt the vault using the command

ANSIBLE_VAULT_PASS="ourpwd" ansible-vault encrypt host_vars/ceres/vault.yml

resulting a encrypted vault that can be put in source control. It looks something like

$ANSIBLE_VAULT;1.1;AES256
35323233613539343135363737353931636263653063666535643766326566623461636166343963
3834323363633837373437626532366166366338653963320a663732633361323264316339356435
33633861316565653461666230386663323536616535363639383666613431663765643639383666
3739356261353566650a383035656266303135656233343437373835313639613865636436343865
63353631313766633535646263613564333965343163343434343530626361663430613264336130
63383862316361363237373039663131363231616338646365316236336362376566376236323339
30376166623739643261306363643962353534376232663631663033323163386135326463656530
33316561376363303339383365333235353931623837356362393961356433313739653232326638
3036

Using your playbook looks similar to before, you just need to provide the vault password using one of several options like specifying a password file, environment variable or interactive input. In our example we just use the environment variable inline:

ANSIBLE_VAULT_PASS="ourpwd" ansible-playbook -i inventory work-on-customer-machines.yml

After setting up your environment appropriately with a password file and the ANSIBLE_VAULT_PASSWORD_FILE environment variable your playbook commands are exactly the same like without using a vault.

Conclusion

The ansible vault feature allows you to safely store and use sensitive data in your infrastructure without changing too much using your automation scripts.

For what the javascript!

The setting

We are developing and maintaining an important web application for one of our clients. Our application scrapes a web page and embeds our own content into that page frame.

One day our client told us of an additional block of elements at the bottom of each page. The block had a heading “Image Credits” and a broken image link strangely labeled “inArray”. We did not change anything on our side and the new blocks were not part of the HTML code of the pages.

Ok, so some new Javascript code must be the source of these strange elements on our pages.

The investigation

I started the investigation using the development tools of the browser (using F12). A search for the string “Image Credits” instantly brought me to the right place: A Javascript function called on document.ready(). The code was basically getting all images with a copyright attribute and put the findings in an array with the text as the key and the image url as the value. Then it would iterate over the array and add the copyright information at the bottom of each page.

But wait! Our array was empty and we had no images with copyright attributes. Still the block would be put out. I verified all this using the debugger in the browser and was a bit puzzled at first, especially by the strange name “inArray” that sounded more like code than some copyright information.

The cause

Then I looked at the iteration and it struck me like lightning: The code used for (name in copyrightArray) to iterate over the elements. Sounds correct, but it is not! Now we have to elaborate a bit, especially for all you folks without a special degree in Javascript coding:

In Javascript there is no distinct notion of associative arrays but you can access all enumerable properties of an object using square brackets (taken from Mozillas Javascript docs):

var string1 = "";
var object1 = {a: 1, b: 2, c: 3};

for (var property1 in object1) {
  string1 = string1 + object1[property1];
}

console.log(string1);
// expected output: "123"

In the case of an array the indices are “just enumerable properties with integer names and are otherwise identical to general object properties“.

So in our case we had an array object with a length of 0 and a property called inArray. Where did that come from? Digging further revealed that one of our third-party libraries added a function to the array prototype like so:

Array.prototype.inArray = function (value) {
  var i;
  for (i = 0; i < this.length; i++) {
    if (this[i] === value) {
      return true;
    }
  }
  return false;
};

The solution

Usually you would iterate over an array using the integer index (old school) or better using the more modern and readable for…of (which also works on other iterable types). In this case that does not work because we do not use integer indices but string properties. So you have to use Object.keys().forEach() or check with hasOwnProperty() if in your for…in loop if the property is inherited or not to avoid getting unwanted properties of prototypes.

The takeaways

Iteration in Javascript is hard! See this lengthy discussion…The different constructs are named quite similar an all have subtle differences in behaviour. In addition, libraries can mess with the objects you think you know. So finally some advice from me:

  • Arrays are only true arrays with positive integer indices/property names!
  • Do not mess with the prototypes of well known objects, like our third-party library did…
  • Use for…of to iterate over true arrays or other iterables
  • Do not use associative arrays if other options are available. If you do, make sure to check if the properties are own properties and enumerable.

 

Transforming C-Style arrays in java

Every now and then some customer asks us to fix or improve some important legacy application other people have written. Usually, such projects are fun and it is rewarding to see the improvements both in code and value for the users.

In one of these projects there is a Java GUI application that uses C-style arrays for some of its central data structures:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;
}

The array-length is a constant upper bound and does not denote the actual elements in the array. Elements are added dynamically to the array and it looks like a typical job for a automatically growing Collection like java.util.ArrayList. Most operations simply iterate over all elements and perform some calculations. But changing such a central part in a performance sensitive application is not only a lot of work but also risky.

We decided to take an incremental approach to improve code readability and maintainability and measured performance with a large, representative dataset between refactorings. There are two easy alternative APIs that improve working with the above data structure.

Imperative API

Smooth migration from the existing imperative “ask”-code (see “Tell, don’t ask”-principle) can be realized by providing an java.util.Iterable to the underlying array.


public int countRedBricks() {
  int redBrickCount = 0;
  for (int i = 0; i < box.brickCount; i++) {
    if (box.bricks[i].isRed()) {
      redBrickCount++;
    }
  }
  return redBrickCount;
}

Code like above is easily transformed to much clearer code like below:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public Iterable<LegoBrick> allBricks() {
    return Arrays.stream(tr, 0, brickCount).collect(Collectors.toList());
  }
}

public int countRedBricks() {
  int redBrickCount = 0;
  for (LegoBrick brick : box.bricks) {
    if (brick.isRed()) {
      redBrickCount++;
    }
  }
  return redBrickCount;
}

Functional API

A nice alternative to the imperative solution above is a functional interface to the array. In Java 8 and newer we can provide that easily and encapsulate the iteration over our array:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public <R> R forAllBricks(Function<Brick, R> operation, R identity, BinaryOperator<R> reducer) {
    return Arrays.stream(bricks, 0, brickCount).map(operation).reduce(identity, reducer);
  }

  public void forAllBricks(Consumer<LegoBrick> operation) {
    Arrays.stream(bricks, 0, brickCount).forEach(operation);
  }
}

public int countRedBricks() {
  return box.forAllBricks(brick -> brick.isRed() ? 1 : 0, 0, (sum, current) -> sum + current);
}

The functional methods can be tailored to your specific needs, of course. I just provided two examples for possible functional interfaces and their implementation.

The function + reducer case is a very general interface and used here for an implementation of our “count the red bricks” use case. Alternatively you could implement this use case with a more specific but easier to use filter + count interface:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public long countBricks(Predicate<Brick> filter) {
    return Arrays.stream(bricks, 0, brickCount).filter(operation).count();
  }
}

public int countRedBricks() {
  return box.countBricks(brick -> brick.isRed());
}

The consumer case is very simple and found a lot in this specific project because mutation of the array elements is a typical operation and all over the place.

The functional API avoids duplicating the iteration all the time and removes the need to access the array or iterable/collection. It is therefore much more in the spirit of “tell”.

Conclusion

The new interfaces allow for much simpler and maintainable client code and remove a lot of duplicated iterations on the client side. They can be introduced on the way when implementing requested features for the customer.

That way we invested only minimal effort in cleaner, better maintainable and more error-proof code. When someday all accesses to the public array are encapsulated we can use the new found freedom to internalize the array and change it to a better fitting data structure like an ArrayList.