Book review: “Java by Comparison”

I need to start this blog entry with a full disclosure: One of the authors of the book I’m writing about contacted me and asked if I could write a review. So I bought the book and read it. Other than that, this review is independent of the book and its authors.

Let me start this review with two types of books that I identified over the years: The first are toilet books, denoting books that can be read in small chunks that only need a few minutes each time. This makes it possible to read one chapter at each sitting and still grasp the whole thing.

The second type of books are prequel books, meaning that I wished the book would have been published before I read another book, because it paves the road to its sequel perfectly.

Prequel books

An example for a typical prequel book is “Apprenticeship Patterns” that sets out to help the “aspiring software craftsman” to reach the “journeyman” stage faster. It is a perfect preparation for the classic “The Pragmatic Programmer”, even indicated by its subtitle “From Journeyman to Master”. But the Pragmatic Programmer was published in 1999, whereas the “Apprenticeship Patterns” book wasn’t available until a decade later in 2009.

If you plan to read both books in 2019 (or onwards), read them in the prequel -> sequel order for maximized effect.

Pragmatic books

The book “The Pragmatic Programmer” was not only a groundbreaking work that affected my personal career like no other book since, it also spawned the “Pragmatic Bookshelf”, a publisher that gives authors all over the world the possibility to create software development books that try to convey practical knowledge. In software development, rapid change is inevitable, so books about practical knowledge and specific technologies have a half-life time measured in months, not years or even decades. Nevertheless, the Pragmatic Bookshelf has published at least half a dozen books that I consider timeless classics, like the challenging “Seven Languages in Seven Weeks” by Bruce A. Tate.

A prequel to Refactoring

A more recent publication from the Pragmatic Bookshelf is “Java by Comparison” by Simon Harrer, Jörg Lenhard and Linus Dietz. When I first heard about the book (before the author contacted me), I was intrigued. I categorized it as a “toilet book” with lots of short, rather independent chapters (70 of them, in fact). It fits in this category, so if you search for a book suited for brief idle times like a short commute by tram or bus, put it on your list.

But when I read the book, it dawned on me that this is a perfect prequel book. Only that the sequel was published 20 years ago (yes, you’ve read this right). In 1999, the book “Refactoring” by Martin Fowler established an understanding of “better code” that holds true until today. There was never a second edition – well, until today! Last week, the second edition of “Refactoring” became available. It caters to a younger generation of developers and replaced all Java code with JavaScript.

But what if you are an aspiring Java developer today? Your first steps in the language will be as clumsy as mine were back in 1997. For me, the first “Refactoring” was perfectly timed, because I had eased out most of my quirks and got a kickstart “from journeyman to master” out of it. But what if you are still an apprentice in Java programming? Then you should read “Java by Comparison” as the prequel book to the original “Refactoring”.

The book works by showing you actual Java code and discussing the bad and ugly parts of it. Then it proposes a better solution in actual code – something many software development books omit as an easy exercise for the reader. You will see this pattern again and again: Java code with problems, a review of the code and a revised version of the same code. Each topic is condensed into two pages, making it a perfect 5-minute read (repeated 70 times).

If you read one chapter each morning on your commute to work and another one on your way back, you’ll be sped up from apprentice level to journeyman level in less than two months. And you can apply the knowledge from each chapter in your daily code right away. Imagine you spend your commute with a friendly mentor that shows you actual code (before and after) instead of only dropping wise man’s quotes that tell you what’s wrong but never show you a specific example of “right”.

All topics and chapters in the book are thorougly researched and carefully edited. You can feel that the authors explained each improvement over and over again to their students and you might notice the little hints for further reading. They start small and slow, but speed up and don’t shy away from harder and more complex topics later in the book. You’ll learn about tests, immutability, concurrency and naming (the best part of the book in my opinion) as well as using code and API comments to your advantage and how not to express conditional logic.

Overall, the book provides the solid groundworks for good code. I don’t necessarily agree with all tips and rules, but that is to be expected. It is a collection of guidelines and rules for beginners, and a very good one. Follow these guidelines until you know them by heart, they are the widely accepted common denominator of Java programming and rightfully so. You can reflect, adapt, improve and iterate based on your experience later on. But it is important to start that journey from the “green zone” and this book will show you this green zone in and out.

My younger self would have benefited greatly had this book been around in 1997. It covers the missing gap between your first steps and your first dance in code.

It’s a beginner’s world

According to Robert C. Martin, the number of software developers worldwide doubles every five years. So my advice for the 20+ million beginners in the next five years out there is to read this book right before “Refactoring”. And reading “Refactoring” at least once is a pleasure you owe to yourself.

UX is a mindset, not an engineering task

With all those methods and measurements like A/B testing, eye tracking and so on you would believe you can engineer your way to a perfect UX but that isn’t what matters. The user and his experience matters in the end and this is delivered by the product which in turn reflects your mindset. Just like the Conway’s law which states that the architecture of your software reflects your architecture of your organization, the product’s design and user interface reflects your mindset.
But what mindset is this? Let’s take a look at my UX posts of the past.

For your last project ask yourself what did the stakeholders learn

There two lessons for UX in here: UX design is a collaborative effort and learning is really important.

How do I start a project
and
How I start a project – the next steps

UX is an iterative way to explore a problem space. It has a goal: meeting the users’ needs. And again: a collaborative one: we need a shared understanding between all the project’s participants.

Quick and dirty is a skill

Evaluating is key in UX, and for not overwhelming the effort to do so, we need to find quick and sometimes dirty solutions to test our hypotheses.

The definition of done

Meeting the spec isn’t a goal of UX, meeting the user’s needs and goals is.

Personas – the great misunderstanding

Tool is just a tool is just a tool. It can help to frame your thinking but it cannot replace your thinking.

Mapping the users workflow

Another tool which can help to connect the disconnected parts, the user stories or issues, to a whole. IN this way you see your software from the user’s perspective from his way through your interface.

What UX and sales have in common

The user is central and context is key.

Discount UX

Again: UX does not need fancy tools, the mindset is really important and you should use the tools you have: pen, paper and your brain.

From agile to UX – a change in perspective

Focus on the user and his tasks, try to formulate the requirements from the user’s perspective.

Requirements should not drive development

Jobs should. Jobs are tasks the users wants to do in a specific context. These define what the software should do when it is ready.

UX is like a text adventure

You start with a beginner’s mind, try not to assume anything.

Learning UX: where do I start

Start with listening with an open mind and think.

Assumptions how to find, track and eliminate them

Beware of your bias.

Transparent software: making complexity understandable

Complexity isn’t your enemy. Find the essential complexity that you need to reach your user’s goals.

What developers can learn from designers

Slow down, do not rush towards your goal. Software is intent. Build to learn. Focus on the whole more than the parts. Have and provide alternatives.

Bending the Java syntax until it breaks

Java is a peculiar programming language. It is used in lots of professional business cases and yet regarded as an easy language suitable for beginner studies. Java’s syntax in particular is criticized as bloated and overly strict and on the next blog praised as lenient and powerful. Well, lets have a look at a correct, runnable Java program that I like to show my students:

class $ {
	{
		System.out.println("hello world");
	}

	static {
		System.out.println("hello, too");
	}

	$() {
		http://www.softwareschneiderei.de
		while ($()) {
			break http;
		}
	}

	static boolean $() {
		return true;
	}

	public static void main(String[] _$) {
		System.out.println($.$());
	}
}

This Java code compiles, perhaps with some warnings about unlucky naming and can be run just fine. You might want to guess what its output to the console is. Notice that the code is quiet simple. No shenanigans with Java’s generics or the dark corners of lambda expressions and method handles. In fact, this code compiles and runs correctly since more than 20 years.

Lets dissect the code into its pieces. To really know what’s going on, you need to look into Java’s Language Specification, a magnificent compendium about everything that can be known about Java on the syntax level. If you want to dive even deeper, the Java Virtual Machine Specification might be your cup of tea. But be warned! Nobody on this planet understands everything in it completely. Even the much easier Java Language Specification contains chapters of pure magic, according to Joshua Bloch (you might want to watch the whole presentation, the statement in question is around the 6 minute mark). And in the code example above, we’ve used some of the magic, even if they are beginner level tricks.

What about the money?

The first glaring anomaly in the code is the strange names that are dollar signs or underscores. These are valid names, according to chapter 3.8 about Identifiers. And we’ve done great by choosing them. Let me quote the relevant sentence from this chapter:

“The “Java letters” include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII dollar sign ($, or \u0024) and underscore (_, or \u005f). The dollar sign should be used […]”

Oh, and by the way: Java identifiers are of unlimited length. You could go and write valid Java code that never terminates. We’ve gone the other way and made our names as short as possible – one character. Since identifiers are used as class names, method names, variable names and (implicitly) constructor names, we can name them all alike.

The variable name of the arguments in the main method used to be just an underscore, but somebody at Oracle changed this section of the Language Specification and added the following sentence:

“The underscore may be used in identifiers formed of two or more characters, but it cannot be used as a one-character identifier due to being a keyword.”

This change happened in Java 9. You can rename the variable “_$” to just “_” in Java 8 and it will work (with a warning).

URLs as first-class citizens?

The next thing that probably caught your eye is the URL in the first line of the constructor. It just stands there. And as I told you, the code compiles. This line is actually a combination of two things: A labeled statement and a comment. You already know about end-of-line comments that are started with a double slash. The rather unknown thing is the labeled statement before it, ending with a colon. This is one of the darker regions of the Language Specification, because it essentially introduces a poor man’s goto statement. And they knew it, because they explicitly talk about it:

“Unlike C and C++, the Java programming language has no goto statement; identifier statement labels are used with break…”

And this explains the weird line in the while loop: “break http” doesn’t command Java to do harm to the computer’s Internet connection, but to leave and complete the labeled statement, in our case the while loop. This spares us from the looming infinite loop, but raises another question: What names are allowed as labels? You’ve guessed it, it’s a Java identifier. We could have named our label “$:” instead of “http:” and chuckled about the line “break $”.

So, Java has a goto statement, but it isn’t called as such and it’s crippled to the point of being useless in practice. In my 20+ years of Java programming, I’ve seen it used just once in the wild.

What about it all?

This example of Java code serves me as a reminder that a programming language is what we make out of it. Our Java programs could easily all look like this if we wanted to. It’s not Java’s merit that our code is readable. And it’s not Java’s fault that we write bloated code. Both are results of our choices as programmers, not an inevitableness from the language we program in.

I sometimes venture to the darker regions of programming languages to see what the language could look and feel like if somebody makes the wrong decisions. This code example is from one of those little journeys several years ago. And it proved its worth once again when I tried to compile it with Java 9. Remember the addition in the Language Specification that made the single underscore a keyword? It wasn’t random. Java’s authors want to improve the lambda expressions in Project Amber, specifically in the JEP 302 (Lambda Leftovers). This JDK Enhancement Proposal (JEP) was planned for Java 10, is still not included in Java 11 and has no clear release date yet. My code gave me the motivation to dig into the topic and made me watch presentations like the one from Brian Goetz at Devoxx 2017 that’s really interesting and a bit unsettling.

Bending things until they break is one way to learn about their limits. What are your ways to learn about programming languages? Do you always stay in the middle lane? Leave a comment on your journeys.

Transforming C-Style arrays in java

Every now and then some customer asks us to fix or improve some important legacy application other people have written. Usually, such projects are fun and it is rewarding to see the improvements both in code and value for the users.

In one of these projects there is a Java GUI application that uses C-style arrays for some of its central data structures:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;
}

The array-length is a constant upper bound and does not denote the actual elements in the array. Elements are added dynamically to the array and it looks like a typical job for a automatically growing Collection like java.util.ArrayList. Most operations simply iterate over all elements and perform some calculations. But changing such a central part in a performance sensitive application is not only a lot of work but also risky.

We decided to take an incremental approach to improve code readability and maintainability and measured performance with a large, representative dataset between refactorings. There are two easy alternative APIs that improve working with the above data structure.

Imperative API

Smooth migration from the existing imperative “ask”-code (see “Tell, don’t ask”-principle) can be realized by providing an java.util.Iterable to the underlying array.


public int countRedBricks() {
  int redBrickCount = 0;
  for (int i = 0; i < box.brickCount; i++) {
    if (box.bricks[i].isRed()) {
      redBrickCount++;
    }
  }
  return redBrickCount;
}

Code like above is easily transformed to much clearer code like below:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public Iterable<LegoBrick> allBricks() {
    return Arrays.stream(tr, 0, brickCount).collect(Collectors.toList());
  }
}

public int countRedBricks() {
  int redBrickCount = 0;
  for (LegoBrick brick : box.bricks) {
    if (brick.isRed()) {
      redBrickCount++;
    }
  }
  return redBrickCount;
}

Functional API

A nice alternative to the imperative solution above is a functional interface to the array. In Java 8 and newer we can provide that easily and encapsulate the iteration over our array:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public <R> R forAllBricks(Function<Brick, R> operation, R identity, BinaryOperator<R> reducer) {
    return Arrays.stream(bricks, 0, brickCount).map(operation).reduce(identity, reducer);
  }

  public void forAllBricks(Consumer<LegoBrick> operation) {
    Arrays.stream(bricks, 0, brickCount).forEach(operation);
  }
}

public int countRedBricks() {
  return box.forAllBricks(brick -> brick.isRed() ? 1 : 0, 0, (sum, current) -> sum + current);
}

The functional methods can be tailored to your specific needs, of course. I just provided two examples for possible functional interfaces and their implementation.

The function + reducer case is a very general interface and used here for an implementation of our “count the red bricks” use case. Alternatively you could implement this use case with a more specific but easier to use filter + count interface:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public long countBricks(Predicate<Brick> filter) {
    return Arrays.stream(bricks, 0, brickCount).filter(operation).count();
  }
}

public int countRedBricks() {
  return box.countBricks(brick -> brick.isRed());
}

The consumer case is very simple and found a lot in this specific project because mutation of the array elements is a typical operation and all over the place.

The functional API avoids duplicating the iteration all the time and removes the need to access the array or iterable/collection. It is therefore much more in the spirit of “tell”.

Conclusion

The new interfaces allow for much simpler and maintainable client code and remove a lot of duplicated iterations on the client side. They can be introduced on the way when implementing requested features for the customer.

That way we invested only minimal effort in cleaner, better maintainable and more error-proof code. When someday all accesses to the public array are encapsulated we can use the new found freedom to internalize the array and change it to a better fitting data structure like an ArrayList.

JavaScript for Java developers (revised, partly)

Almost 5 years ago I wrote a piece about the specialities of the JavaScript language for developers knowing Java. A lot has happened since then. The old (EcmaScript Standard Version 5) way is still working but some of the rough edges has been eased out.

I want to concentrate on two areas: (variable) declaration and their scope and object/class creation.

Declaration

Now JavaScript has new ways to declare variables. The old var still works and declares a variable with a function scope:

function f() {
  var a = 2;
  var b = 1;
  if (a > b) {
    var a = 5;
    alert(a); // 5
  }
  alert(a); // 5
}

But since ES6 (also known as ES 2015) you can use let to declare a variable with block scope.

function f() {
  let a = 2;
  let b = 1;
  if (a > b) {
    let a = 5;
    alert(a); // 5!
  }
  alert(a); // 2!
}

You can also use const to create a constant, but must assign it in the same line.

  const i = 5;
  i = 3; // TypeError: Assignment to constant variable
  const b; // SyntaxError: Missing initializer in const declaration

It is not the same as final which you can declare and initialize in different lines:

final int i = 5;
i = 3; // error!
final b; // that's ok
b = 3;
b = 4; // error

Also beware that const declares a constant, not necessarily an immutable object:

  const a = [5, 3];
  a[0] = 3; // ok!

Object creation

Now this is the part where the JavaScript syntax changed a lot. The old functional way is still working but now you can declare a class in a more Java-ish way:

class Person {
  constructor(name) {
    this.name = name;
  }
}

You can also use a var:

var Person = class {
  constructor(name) {
    this.name = name;
  }
};

Methods can be declared as well:

class Person {
  constructor(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  fullName() { // getter!
    return this.firstName + ' ' + this.lastName;
  }
}

var p = new Person('John', ''Doe);
alert(p.fullName());

You can also use property getters to sugarcode the access code:

class Person {
  constructor(firstName, lastName) {
    this.firstName = firstName;
    this.lastName = lastName;
  }

  get fullName() { // getter!
    return this.firstName + ' ' + this.lastName;
  }
}
var p = new Person('John', ''Doe);
alert(p.fullName); // <-- just called like a property, not a method

Static methods are also streamlined:

class Factory {
  static antiqueStyleNames(firstName, birthplace) {
    return new Person(firstName, 'of ' + birthplace);
  }
}

Inheritance, although still prototypical, can be done with extends:

class A extends B {
  constructor(a) {
    super();
  }

  m() {
    super.n();
  }
}

JavaScript only supports single inheritance but mixins are now possible:

var mixin = Base => class extends Base {
  a() {return 1; }
};

class B {}

class A extends mixin(B) {}

alert(new A().a()); // 1!

There are many more things in modern JavaScript like arrow functions, spread and rest operators and many more. JavaScript is evolving (Java also) so even if you are mainly located in the Javaland, it pays off to take a look at JavaScript from time to time.

Learning about Class Literals after twenty years of Java

I’ve programmed in Java nearly every day for twenty years now. At the beginning of my computer science studies, I was introduced to Java 1.0.x and have since accompanied every version of Java. Our professor made us buy the Java Language Specification on paper (it was quite a large book even back then) and I occassionally read it like you would read an encyclopedia – wading through a lot of already known facts just to discover something surprising and interesting, no matter how small.

With the end of my studies came the end of random research in old books – new books had to be read and understood. It was no longer efficient enough to randomly spend time with something, everything needed to have a clear goal, an outcome that improved my current position. This made me very efficient and quite effective, but I only uncover surprising facts and finds now if work forces me to go there.

An odd customer request

Recently, my work required me to re-visit an old acquaintance in the Java API that I’ve never grew fond of: The Runtime.exec() method. One of my customer had an recurring hardware problem that could only be solved by rebooting the machine. My software could already detect the symptoms of the problem and notify the operator, but the next logical step was to enable the software to perform the reboot on its own. The customer was very aware of the risks for such a functionality – I consider it a “sabotage feature”, but asked for it anyway. Because the software is written in Java, the reboot should be written in Java, too. And because the target machines are exclusively running on Windows, it was a viable option to implement the feature for that specific platform. Which brings me to Runtime.exec().

A simple solution for the reboot functionality in Java on Windows looks like this:


Runtime.exec("shutdown /r");

With this solution, the user is informed of the imminent reboot and has some time to make a decision. In my case, the reboot needed to be performed as fast as possible to minimize the loss of sensor data. So the reboot command needs to be expanded by a parameter:


Runtime.exec("shutdown /r /t 0");

And this is when the command stops working and politely tells you that you messed up the command line by printing the usage information. Which, of course, you can only see if you drain the output stream of the Process instance that performs the command in the background:


final Process process = Runtime.exec("shutdown /r /t 0");
try (final Scanner output = new Scanner(process.getInputStream())) {
    while (output.hasNextLine()) {
        System.out.println(output.nextLine());
    }
}

The output draining is good practice anyway, because the Process will just stop once the buffer is filled up. Which you will never see in development, but definitely in production – in the middle of the night on a weekend when you are on vacaction.

Modern thinking

In Java 5 and improved in Java 7, the Runtime.exec() method got less attractive by the introduction of the ProcessBuilder, a class that improves the experience of creating a correct command line and a lot of other things. So let’s switch to the ProcessBuilder:


final ProcessBuilder builder = new ProcessBuilder(
        "shutdown",
        "/r",
        "/t 0");
final Process process = builder.start();

Didn’t change a thing. The shutdown command still informs us that we don’t have our command line under control. And that’s true: The whole API is notorious of not telling me what is really going on in the background. The ProcessBuilder could be nice and offer a method that returns a String as it is issued to the operating system, but all we got is the ProcessBuilder.command() method that gives us the same command line parts we gave it. The mystery begins with our call of ProcessBuilder.start(), because it delegates to a class called ProcessImpl, and more specific to the static method ProcessImpl.start().

In this method, Java calls the private constructor of ProcessImpl, that performs a lot of black magic on our command line parts and ultimately disappears in a native method called create() with the actual command line (called cmdstr) as the first parameter. That’s the information I was looking for! In newer Java versions (starting with Java 7), the cmdstr is built in a private static method of ProcessImpl: ProcessImpl.createCommandLine(). If I could write a test program that calls this method directly, I would be able to see the actual command line by myself.

Disclaimer: I’m not an advocate of light-hearted use of the reflection API of Java. But for one-off programs, it’s a very powerful tool that gets the job done.

So let’s write the code to retrieve our actual command line directly from the ProcessImpl.createCommandLine() method:


public static void main(final String[] args) throws Exception {
    final String[] cmd = {
            "shutdown.exe",
            "/r",
            "/t 0",
    };
    final String executablePath = new File(cmd[0]).getPath();

    final Class<?> impl = ClassLoader.getSystemClassLoader().loadClass("java.lang.ProcessImpl");
    final Method myMethod = impl.getDeclaredMethod(
            "createCommandLine",
            new Class[] {
                    ????, // <-- Damn, I don't have any clue what should go here.
                    String.class,
                    String[].class
            });
    myMethod.setAccessible(true);

    final Object result = myMethod.invoke(
            null,
            2,
            executablePath,
            cmd);
    System.out.println(result);
}

The discovery

You probably noticed the “????” entry in the code above. That’s the discovery part I want to tell you about. This is when I met Class Literals in the Java Language Specification in chapter 15.8.2 (go and look it up!). The signature of the createCommandLine method is:


private static String createCommandLine(
        int verificationType,
        final String executablePath,
        final String cmd[])

Note: I didn’t remove the final keyword of verificationType, it isn’t there in the original code for unknown reasons.
When I wrote the reflection code above, it occurred to me that I had never attempted to lookup a method that contains a primitive parameter – the int in this case. I didn’t think much about it and went with Integer.class, but that didn’t work. And then, my discovery started:


final Method myMethod = impl.getDeclaredMethod(
        "createCommandLine",
        new Class[] {
                int.class, // <-- Look what I can do!
                String.class,
                String[].class
        });

As stated in the Java Language Specification, every primitive type of Java conceptionally “has” a public static field named “class” that contains the Class object for this primitive. We can even type void.class and gain access to the Class object of void. This is clearly written in the language specification and required knowledge for every earnest usage of Java’s reflection capabilities, but I somehow evaded it for twenty years.

I love when moments like this happen. I always feel dumb and enlightened at the same time and assume that everybody around me knew this fact for years, it is just me that didn’t get the memo.

The solution

Oh, and before I forget it, the solution to the reboot command not working is the odd way in which Java adds quote characters to the command line. The output above is:


shutdown /r "/t 0"

The extra quotes around /t 0 make the shutdown command reject all parameters and print the usage text instead. A working, if not necessarily intuitive solution is to separate the /t parameter and its value in order to never have spaces in the parameters – this is what provokes Java to try to help you by quoting the whole parameter (and is considered a feature rather than a bug):


final String[] cmd = {
        "shutdown",
        "/r",
        "/t",
        "0",
};

This results in the command line I wanted from the start:


shutdown /r /t 0

And reboots the computer instantaneous. Try it!

Your story?

What’s your “damn, I must’ve missed the memo” moment in the programming language you know best?

The Great Rational Explosion

A Dream to good to be true

A few years back I was doing mostly computational geometry for a while. In that field, floating point errors are often of great concern. Some algorithms will simply crash or fail when it’s not taken into account. Back then, the idea of doing all the required math using rationals seemed very alluring.
For the uninitiated: a good rational type based on two integers, a numerator and a denominator allows you to perform the basic math operations of addition, subtraction, multiplication and division without any loss of precision. Doing all the math without any loss of precision, without fuzzy comparisons, without imperfection.
Alas, I didn’t have a good rational type available at the time, so the thought remained in the realm of ideas.

A Dream come true?

Fast forward a couple of years to just two months ago. We were starting a new project and set ourselves the requirement of not introducing floating point errors. Naturally, I immediately thought of using rationals. That project is written in java and already using jscience, which happens to have a nice Rational type. I expected the whole thing to be a bit slower than math using build-in types. But not like this.
It seemed like a part that was averaging about 2000 “count rate” rationals was extremely slow. It seemed to take about 13 seconds, which we thought was way too much. Curiously, the problem never appeared when the count rate was zero. Knowing a little about the internal workings of rational, I quickly suspected the summation to be the culprit. But the code was doing a few other things to, so naturally my colleagues demanded proof that that was indeed the problem. Hence I wrote a small demo application to benchmark the problem.

The code that I measured was this:

Rational sum = Rational.ZERO;
for (final Rational each : list) {
    sum = sum.plus(each);
}
return sum;

Of course I needed some test data, that I generated like this:

final List<Rational> list = new ArrayList<>();
for (int i=0; i<2000; ++i) {
    list.add(Rational.valueOf(i, 100));
}
return list;

This took about 10ms. Bad, but not 13s catastrophic.

Now from using rational numbers in school, we remember that summing up numbers with equal denominators is actually quite easy. You just leave the denominator as is and add the two numerators. But what if the denominators are different? We need to find a common multiple of the two denominators before we can add. Usually we want the smallest such number, which is called the lowest common multiple (lcm). This is so that the numbers don’t just explode, i.e. get larger and larger with each addition. The algorithm to find this is to just multiply the two numbers and divide by their greatest common divisor (gcd). Whenever I held the debugger during my performance problems, I’d see the thread in a function called gcd. The standard algorithm to determine the gcd is the Euclidean Algorithm. I’m not sure if jscience uses it, but I suspect it does. Either way, it successively reduces the problem via a division to a smaller instance.

What does this all mean?

This means that much of the complexity involved happens only when there’s variation in the denominator. Looking at my actual data, I saw that this was the case for our problem. The numbers were actually close to one, but with the numerator and the denominator each close to about 4 million. This happened because the counts that we based this data on where “normalized” by a time value that was close, but not equal to one. So let’s try another input sequence:

final Random randomGenerator = new Random();
final List<Rational> list = new ArrayList<>();
for (int i=0; i<2000; ++i) {
    list.add(Rational.valueOf(4000000, 4000000 + randomGenerator.nextInt(2000)));
}
return list;

That already takes 10 seconds. Wow. Here’s the rational number it produced:

10925412090387826443356493315570970692092187751160448231723307006165619476894539076955047153673774291312083131288405937659569868481689058438177131164590263653191192222748118270107450103646129518975243330010671567374030224485257527751326361244902048868408897125206116588469496776748796898821150386049548562755608855373511995533133184126260366718312701383461146530915166140229600694518624868861494033238710785848962470254060147575168902699542775933072824986578365798070786548027487694989976141979982648898126987958081197552914965070718213744773755869969060335112209538431594197330895961595759802183116845875558763156976148877035872955887060171489805289872602037240200456259054999832744341644730504414071499368916837445036074038038173507351044919790626190261603929855676606292397018669058312742095714513746321779160111694908027299104638873374887446030780299702571350702255334922413606738293755688345624599836921568658488773148103958376515598663301740183540590772327963247869780883754669368812549202207109506869618137936835948373483789482539362351437914427056800252076700923528652746231774096814984445889899312297224641143778818898785578577803614153163690077765243456672395185549445788345311588933624794815847867376081561699024148931189645066379838249345071569138582032485393376417849961802417752153599079098811674679320452369506913889063163196412025628880049939111987749980405089109506513898205693912239150357818383975619592689319025227977609104339564104111365559856023347326907967378614602690952506049069808017773270860885025279401943711778677651095917727518548067748519579391709794743138675921116461404265591335091759686389002112580445715713768865942326646771624461518371508718346301286775279265940739820780922411618115665915206028180761758701198283575402598963356532479352810604578392844754856057089349811569436655814012237637615544417676166890247526742765145909088354349593431829508073735508662766171346365854920116894738553593715805698326801840647472004571022201012455368883190600587502030947401749733901881425019359516340993849314997522931836068574283213181677667615770392454157899894789963788314779707393082602321025304730355204512687710695657016587562258289968709342507303760359107314805479150337790244385189611378805094282650120553138575380568150214510972734241803176908917697662914714188030879994734853772797322420241420911735874903926141598416992690859929943631826094723456317312589265104334870907579391696178556354299428366394819280011410287891113591176612795009226826412471238783334239148961082442565804292473501012401378940718084589859443350905260282342990350362981901637062679381912861429756544396701574099199222399937752826106312708211791773562169940745686837853342547182813438086856565980815543626740277913678365142830117575847966404149038892476111835346566933160119385992791677587359063277202990220629004309670865867774206252830200897207368966439730136012024728717701204793182480513620275549665094200202565592742030772102704751736850897665353297536494739059325582661212315355306787427752670613324951121097833683795311514392922347268374097451268196257308005629903372871471809591087849716533132440301432155867780938535327925645340832637372702171777123816397448399703780105396941226655424025197472384099218081468916864256472238808237005121132164363385877692234230678011184351921814453560033879491735351402997266882544304106997065987376103362395437737475217181551336569975031721614790499945872209261769951117223344186839969922893394319287462384028859822057955389124951467203432571737865201780344423642467187208636881135573636815083891626138564337634176587161231028307776960866522346008589607259041199676560090157817882260300414906572885890188984036234226505815367029839231023461597364977306399898603903392434756572392816540125771578189640871020070756539777101197151773304409519870643142190955018579914630314940373832858007535828153361236115553577694543503842444481599944319287815162136101362705211937257383677282014480487759786222801447548899760241829116959865698836386442016721709983097509675552750221989521551169512674725876581185837611167980363615880958917421251873901289922888492447507837290336628975165062036681599909052030653421736716061426079882106810502703095803882805916960831442634085856041781093664688754713907512226706324967656091109936101526173370212867073380662492009726657437921033740063367290862521594119329592938626114166263957511012256023777676569002181500977475083845756500926631153405264250959378833667206532373995888322137324027620266863005721216133252342921697663864807284554205674829658250755046340838031118227643145562001361542532622713886266813492926885236832665609571019479812713355021295737820773552735161701716018010606040731647943600206193923458996150345093898644748170519757957470535978378479854546255651200511536560142431948781377187548218601919108870420102025378751015728281345799655926856602543107729659372984539588835599345223921737022220676709028150797109091782506736145801340069563865839397272145141831011878720095142353543406658905222847479419799336972983678887227301846161770296173667855239714987183774931791306562351516152727523242208973614372214119191610954209164193665209038399568256186789865817560942946289864468805486432823029457193832616134050058472575/5464071367689021145920376789564097484075057036929154325361292811121525852598101325663975897893163034606703637232625465056851133763148192531586105963101428155647567490545647427502233159652310898001125892675829093187577533366895512701590739143174316498475076791171065480136546725008720643752948237443019242161669077663609144434306128221503414531394096823915215979623953337930493605186601571769495144894636986997575880131117237208510857818613219593393080298986414277944012186244301930294333213815805316754678940597177696108355782072853392533822105101007621252067159549391148948251225551745385134586065334994558336331772298542454065874623247283672363609435360051294428160464673413148011783297358303182389731822629550376618544475198010869325105825675331154332311829064320240772190873197445422806128724029723364642376469088090535867746031708415054086042362900835830071439066729248803080864979591431807735444694059982355674331452510436222691302473693502522151731546119758216043039918795122874474779117841250524168339597048231340441394931963429195108468364711206679388129543777587115734312812004999951805288552516345754609724336541283126925634054615621607387407977448299658305191245949844785795188192329587881827885708672142850022336110188979606923986247270824545988992712993364088907981140326036171066083995899655482025878782401213185589085533281898090985040136079388970812804900747496605542168328539571899508434943602485570225905907644120282314027462195035830426406664327228058856521800556952281791943493734276488880843046612364784975328644388569123725347247071532065625240062618476483177429385912626131806870421725932016173249912942914992853735756415323674217279068679145528740177762103447741638280753518805769821177202801618302946904761889981101239141113424330262556585926281757788049928339119700279137278143586150134536970549327847433827708642630282821305576285238510965599311560018270085653975154329160951814845203648945184348538132528313719965960517692615234724933839415863834082163365616979319651204331888160659534780756175312928937580725581177783203569550625979825708835192412739747467314045215372710237518236109496682610777837094944201522879675742349874447002866525687741559327659019423524782141050034563799707395461801978917300210616390761873065433809737256362778865972956861423012424858347791919074000867019988914246185988825680768363450694954357708832975385526022445613099402197649574980990392753660371969220370462447697883319180076033708570759874351259644698980354866068784429511482751711329624930863629304545443308117342612677534958727764219861540475648492451312505519760969766401341016292089564143516344584197883268917604454917829927706134205489288603918892893943815239131020894621190795863599245502858307692284626604530452680973144088318876770254624083735638792267234407991769552946582638360068875509975769002847311018204789400549401560376453753063628076240879327047662612561558831502017122298425734701863181656884664851026481979959368508273072349808416302081774935491159815938813047108294310775621375475681754627108394755080076400344634873409823122370888409229666662703837096634538759103222727447751640114800586463155863216558367921497862783396136480309101772689723377535772396361300263399246855706948606216922101060308941294672554277063185792769730719177469014803853068307966008447841184403910224016078072279882935920347299431580783664128991812654780457701135185591782941754395263461459180940761928488860853515685525983743168631933069310771834569879795160576216776499263300025539821575767674288253419696114910586812284230709422230558006531335540918025940397918186306383352682903150594714754889631368556569576312855831657674738769087018659290558967429709446878600756119596020561987088170551193295467295484037656684414413021143993266548616775075905905925334335921152629477697210074629622902141063078657746149232034056642352265941452947845339051610301151130368316362022636711499114826671695540416257477345744126416363795225938900727969092692270788643742709421442552812735921534532156766228688341300473140115855278302077136595142902548557943030527129378779996403263450119984758775332533036452077654603781557033976641416371186050927154435643492983848051087282942559503555223474754690693742997370592039106889187781827888290578576160100274561900208861136548215743616634615083974725330617020874519606610812580339985836584711397102753448160446984238925180319975660479580098047098930179253806342651446549086560273369490291496196263969077175524002077318170194252067818804991902015489300563029607689935067689448990111925784018672159126316019672778424331051001652240822592173809393561688768265941132515679297537471384753992855568256114701369550476003237008840508060572457453896791033992685192069703595482787771374051189250065726225137532026227150630791972777126392521002221370419505711150022179216266239656298115575018443700688208844387102582445812545858014862427316785193110884751319467750425511273629581416588296942433219322216784262821066075700266485645060935996743388485096744598169509920624971167075214788838073469621687619355816573640360338756385397428237445511332615921133308459043700086925442114337299109227541364012352140312577797531234832237279430947774637533319353713938562360646441591033255036

I kid you not, that’s over 10000 digits! In the editor I’m writing this in, that’s roughly 3 pages. No wonder it took that long. Let’s use even more variation in the numbers:

final Random randomGenerator = new Random();
final List<Rational> list = new ArrayList<>();
for (int i=0; i<2000; ++i) {
    list.add(Rational.valueOf(4000000 + randomGenerator.nextInt(5000),
            4000000 + randomGenerator.nextInt(20000)));
}

Now that already takes 16 s, with about 14000 digits. Oh boy. Now the maximum number of values I expected to do this averaging for was about 4000, so let’s scale that up:

final Random randomGenerator = new Random();
final List<Rational> list = new ArrayList<>();
for (int i=0; i<4000; ++i) {
    list.add(Rational.valueOf(4000000 + randomGenerator.nextInt(5000),
            4000000 + randomGenerator.nextInt(20000)));
}
return list;

That took 77 seconds! More than 4 times as long as for half the amount of data. The resulting number has over 26000 digits. Obviously, this scales way worse than linear.

An Explanation

By now it was pretty clear what was happening: The ever so slightly not-1 values were causing an “explosion” in the denominator after all. When two denominators are coprime, i.e. their greatest common divisor is 1, the length of the denominators just adds up. The effect also happens when the gcd is very small, such as 2 or 3. This can happen quite a lot with huge numbers in a sufficiently large range. So when things go bad for your input data, the length of the denominator just keeps growing linearly with the number of input values, making each successive addition slower and slower. Your rationals just exploded.

Conclusion

After this, it became apparent that using rationals was not a great idea after all. You should be very careful when doing series of additions with them. Ironically, we were throwing away all the precision anyways before presenting the number to a user. There’s no way for anyone to grok a 26000 digit number anyways, especially if the result is basically 4000.xx. I learned my lesson and buried the dream of perfect arithmetic. I’m now using fixed point arithmetic instead.