August 2014 – Schneide Blog

Bit-fiddling is possible in Java

We have a service interface for Modbus devices that we can use remotely from Java. Modbus supports only very basic data types like single bits and 16-bit words. Our service interface provides the contents of a 16-bit input or holding register as a Java integer.

Often one 16-bit register is not enough to solve a problem in your domain, like representing a temperature as floating point number. A common solution is to combine two 16-bit registers and interpret their contents as a floating point number.

So the question is how to combine the bits of the two int values and convert them to a float in Java. There are at least two possiblities I want to show you.

The “old-fashioned” way includes bit-shifting and bit-wise operators which you actually can use in Java despite of the major flaw regarding primitive types: there are no unsigned data types; even byte is signed!

public static float floatFrom(int mostSignificat16Bits, int leastSignificant16Bits) {
    int bits = (mostSignificat16Bits << 16) | leastSignificant16Bits;
    return Float.intBitsToFloat(bits);
}

As seemingly more modern way is using java.nio.ByteBuffer:

public static float nioFloatFrom(int mostSignificat16Bits, int leastSignificant16Bits) {
    final ByteBuffer buf = ByteBuffer.allocate(4).putShort((short) mostSignificat16Bits).putShort((short) leastSignificant16Bits);
    buf.rewind(); // rewind to the beginning of the buffer, so it can be read!
    return buf.getFloat();
}

The second method seems superior when working with many values because you can fill the buffer conveniently in one phase and extract the float values conveniently by subsequent calls to getFloat().

Conclusion

Even if it is not one of Java’s strengths you actually can work on the bit- and byte-level and perform low level tasks. Do not let the lack of unsigned data types scare you; signedness is irrelevant for the bits in memory.

Dynamic addition and removal of collection-bound items in an HTML form with Angular.js and Rails

A common pattern in one of our web applications is the management of a list of items in a web form where the user can add, remove and edit multiple items and finally submit the data:

The basic skeleton for this type of functionality is very simple with Angular.js. We have an Angular controller with an “items” array:

angular.module('example', [])
  .controller('ItemController', ['$scope', function($scope) {
    $scope.items = [];
  }]);

And we have an HTML form bound to our Angular controller:

<form ... ng-app="example" ng-controller="ItemController"> 
  <table>
    <tr>
      <th></th>
      <th>Name</th>
      <th>Value</th>
    </tr>
    <tr ng-repeat="item in items track by $index">
      <td><span class="remove-button" ng-click="items.splice($index, 1)"></span></td>
      <td><input type="text" ng-model="item.name"></td>
      <td><input type="text" ng-model="item.value"></td>
    </tr>
    <tr>
      <td colspan="3">
        <span class="add-button" ng-click="items.push({})"></span>
      </td>
    </tr>
  </table>
  <!-- ... submit button etc. -->
</form>

The input fields for each item are placed in a table row, together with a remove button per row. At the end of the table there is an add button.

How do we connect this with a Rails model, so that existing items are filled into the form, and items are created, updated and deleted on submit?

First you have to transform the existing Ruby objects of your has-many association (in this example @foo.items) into JavaScript objects by converting them to JSON and assigning them to a variable:

<%= javascript_tag do %>
  var items = <%= escape_javascript @foo.items.to_json.html_safe %>;
<% end %>

Bring this data into your Angular controller scope by assigning it to a property of $scope:

.controller('ItemController', ['$scope', function($scope) {
  $scope.items = items;
}]);

Name the input fields according to Rails conventions and use the $index variable from the “ng-repeat” directive to provide the correct index value. You also need a hidden input field for the id, if the item already has one:

  <td>
    <input name="foo[items_attributes][$index][id]" type="hidden" ng-value="item.id" ng-if="item.id">
    <input name="foo[items_attributes][$index][name]" type="text" ng-model="item.name">
  </td>
  <td>
    <input name="foo[items_attributes][$index][value]" type="text" ng-model="item.value">
  </td>

In order for Rails to remove existing elements from a has-many association via submitted form data, a special attribute named “_destroy” must be set for each item to be removed. This only works if

accepts_nested_attributes_for :items, allow_destroy: true

is set in the Rails model class, which contains the has-many association.

We modify the click handler of the remove button to set a flag on the JavaScript object instead of removing it from the JavaScript items array:

<span class="remove-button" ng-click="item.removed = true"></span>

And we render an item only if the flag is not set by adding an “ng-if” directive:

<tr ng-repeat="item in items track by $index" ng-if="!item.removed">

At the end of the form we render hidden input fields for those items, which are flagged as removed and which already have an id:

<div style="display: none" ng-repeat="item in items track by $index"
ng-if="item.removed && item.id">
  <input type="hidden" name="foo[items_attributes][$index][id]" ng-value="item.id">
  <input type="hidden" name="foo[items_attributes][$index][_destroy]" value="1">
</div>

On submit Rails will delete those elements of the has-many association with the “_destroy” attribute set to “1”. The other elements will be either updated (if they have an id attribute set) or created (if they have no id attribute set).

How to speed up your ORM queries of n x m associations

What solution (no cache) causes a 45x speedup? Learn the different approaches and how they compare

What causes a speedup like this? (all numbers are in ms)

Disclaimer: the absolute benchmark numbers are for illustration purposes, the relationship and the speedup between the different approaches are important (just for the curious: I measured 500 entries per table in a PostgreSQL database with both Rails 4.1.0 and Grails 2.3.8 running on Java 7 on a recent MBP running OSX 10.8)

Say you have the model classes Book and (Book)Writer which are connected via a n x m table named Authorship:

A typical query would be to list all books with its authors like:

Fowler, Martin: Refactoring

A straight forward way is to query all authorships:

In Rails:

# 1500 ms
Authorship.all.map {|authorship| "#{authorship.writer.lastname}, #{authorship.writer.firstname}: #{authorship.book.title}"}

In Grails:

// 585 ms
Authorship.list().collect {"${it.writer.lastname}, ${it.writer.firstname}: ${it.book.title}"}

This is unsurprisingly not very fast. The problem with this approach is that it causes the famous n+1 select problem. The first option we have is to use eager fetching. In Rails we can use ‘includes’ or ‘joins’. ‘Includes’ loads the associated objects via additional queries, one for authorship, one for writer and one for book.

# 2300 ms
Authorship.includes(:book, :writer).all

‘Joins’ uses SQL inner joins to load the associated objects.

# 1000 ms
Authorship.joins(:book, :writer).all

# returns only the first element
Authorship.joins(:book, :writer).includes(:book, :writer).all

Additional queries with ‘includes’ in our case slows down the whole request but with joins we can more than halve our time. The combination of both directives causes Rails to return just one record and is therefore ruled out.

In Grails using ‘belongsTo’ on the associations speeds up the request considerably.

class Authorship {
    static belongsTo = [book:Book, writer:BookWriter]

    Book book
    BookWriter writer
}

// 430 ms
Authorship.list()

Also we can implement eager loading with specifying ‘lazy: false’ in our mapping which boosts a mild performance increase.

class Authorship {
    static mapping = {
        book lazy: false
        writer lazy: false
    }
}

// 416 ms
Authorship.list()

Can we do better? The normal approach is to use ‘has_many’ associations and query from one side of the n x m association. Since we use more properties from the writer we start from here.

class Writer < ActiveRecord::Base
  has_many :authors
  has_many :books, through: :authors
end

Testing the different combinations of ‘includes’ and ‘joins’ yields interesting results.

# 1525 ms
Writer.all.joins(:books)

# 2300 ms
Writer.all.includes(:books)

# 196 ms
Writer.all.joins(:books).includes(:books)

With both options our request is now faster than ever (196 ms), a speedup of 7.
What about Grails? Adding ‘hasMany’ and the authorship table as a join table is easy.

class BookWriter {
    static mapping = {
        books joinTable:[name: 'authorships', key: 'writer_id']
    }

    static hasMany = [books:Book]
}

// 313 ms, adding lazy: false results in 295 ms
BookWriter.list().collect {"${it.lastname}, ${it.firstname}: ${it.books*.title}"}

The result is rather disappointing. Only a mild speedup (2x) and even slower than Rails.

Is this the most we can get out of our queries?
Looking at the benchmark results and the detailed numbers Rails shows us hints that the query per se is not the problem anymore but the deserialization. What if we try to limit our created object graph and use a model class backed by a database view? We can create a view containing all the attributes we need even with associations to the books and writers.

create view author_views as (SELECT "authorships"."writer_id" AS writer_id, "authorships"."book_id" AS book_id, "books"."title" AS book_title, "writers"."firstname" AS writer_firstname, "writers"."lastname" AS writer_lastname FROM "authorships" INNER JOIN "books" ON "books"."id" = "authorships"."book_id" INNER JOIN "writers" ON "writers"."id" = "authorships"."writer_id")

Let’s take a look at our request time:

# 15 ms
 AuthorView.select(:writer_lastname, :writer_firstname, :book_title).all.map { |author| "#{author.writer_lastname}, #{author.writer_firstname}: #{author.book_title}" }

// 13 ms
AuthorView.list().collect {"${it.writerLastname}, ${it.writerFirstname}: ${it.bookTitle}"}

13 ms and 15 ms. This surprised me a lot. Seeing this in comparison shows how much this impacts performance of our request.

The lesson here is that sometimes the performance can be improved outside of our code and that mapping database results to objects is a costly operation.

Don’t ever not avoid negative logic

If you want to be nice to people with a challenged relationship with boolean logic, try to avoid negative formulations and negations.

I start this post with a confession: I’m not able to discern true from false. I wasn’t born with this inability, it got worse over time. The first time I knew I have this problem was in driver’s school when my teacher told me that most people cannot switch from forward to backward drive and still tell left from right. Left and right are the same to me ever since, even in forward motion. When I was taught boolean logic, my inability spread from “left and right” to “true and false” and led to funny results in some tests, especially multiple choice questions with negative statements. But my guess is that I’m not alone with this problem.

No negations

So I’m probably a little bit over-sensitive about this topic, but that should only make the point clearer: Don’t obscure your (boolean) statement with unnecessary layers of negation. See? I just did it, too. Let me rephrase: Always state your boolean logic without negation, if possible.

It’s really easy for us super-clever programmers to juggle several dozen variables in our head and evaluate any boolean statement on the fly by reading it once – regardless of parenthesis. Well, until it’s not. The thing about boolean logic is that you can’t be “unsure”. It’s only ever “true” or “false”, and just by wild guessing, you will be right about it half the time – try that with basic numerical algebra! So even if the statement looks daunting, you have a fifty-fifty chance of success.

Careful crafting

For me (and probably all people with “boolean disability”, as I call it), every boolean statement is a challenge. So you can be sure that I put maximum effort in succeeding. I write my statements carefully and with great emphasis on clarity (this blog post only covers one aspect). I re-read them several times, sometimes aloud (to my imaginary rubber duck). I thoroughly test them – most statements are factored into their own method to achieve direct testability. And I try them out before committing. Still, there is a valid chance that my boolean disability didn’t magically disappear when I wrote my unit tests and I happily asserted that the statement always has to decide the right things in the wrong way.

By painful introspection about the real nature of my boolean disability, I discovered a great easement: If a statement doesn’t flip everything on its head by negation or negative formulation, I can actually follow through most of the time. Let me rephrase for clarity: If a statement uses negation, it is hard for me to follow. And I guess everyone has a personal limit:

ow_owl

A workaround

The workaround for my boolean disability is really easy: Express the statement like it really was meant in the first place. Express it without “plot twist”. Instead of

if (!string.isEmpty())

try something like

if (string.hasContent())

Disclaimer: I know that the Java SDK (still) doesn’t provide this method. It was just an example.

A real-life example

A real-life example that caused us some troubles can be found in the otherwise excellent Greenmail plugin for Grails. In the configuration, you can set the property

greenmail.disabled = true

to disable the mail server that otherwise would start automatically. The positive formulation would be

greenmail.enabled = false

To tell the full story: The negated formulation was probably chosen to simplify the plugin’s implementation in Groovy. The side effect of this short-cut is that you can’t state

greenmail.disabled = false

and be sure that it will start the mail server. In fact, it won’t. As a developer challenged by boolean logic, this issue gave me nightmares.

The three-state trap

Using this rule as a guideline for boolean statements will also prohibit that you fall into the “three-state trap”. Imagine a Person object with the method

boolean isOlderThan(Person other)

But you want to know if a person is younger than another, so you just negate the result:

if (!personA.isOlderThan(personB))

just to be clear, following the rule of “no negations”, you would’ve written:

if (personA.isYoungerThan(personB))

which isn’t quite the same! If both persons are of equal age (the “third state”), the negated statement returns true (if I evaluated it correct!), whereas the last statement gives the correct answer (false – not younger).

Use as a guideline

Don’t get me wrong: Avoiding negations isn’t always possible or the best available option. This isn’t a law, it’s a guideline or a rule of thumb. And just because some complex boolean statement is free of negations doesn’t make it acceptable automatically. It’s just a tiny step towards pain-free boolean statements. And that’s a bad thing… NOT.

	Writing Integration… on Every Unit Test Is a Stage Pla…
	mariuselvert on C# is very strict about modify…
	Anonymous on C# is very strict about modify…
	Anonymous on Cache configuration with WildF…
	Miq on Nested queries like N+1 in pra…