Book review: “Java by Comparison”

I need to start this blog entry with a full disclosure: One of the authors of the book I’m writing about contacted me and asked if I could write a review. So I bought the book and read it. Other than that, this review is independent of the book and its authors.

Let me start this review with two types of books that I have identified over the years: The first type is the toilet book, denoting a book that can be read in small chunks that only need a few minutes each. This makes it possible to read one chapter per sitting and still grasp the whole thing.

The second type is the prequel book, meaning a book that I wish had been published before I read another book, because it paves the road to its sequel perfectly.

Prequel books

An example of a typical prequel book is “Apprenticeship Patterns”, which sets out to help the “aspiring software craftsman” reach the “journeyman” stage faster. It is a perfect preparation for the classic “The Pragmatic Programmer”, as indicated by the latter’s subtitle “From Journeyman to Master”. But “The Pragmatic Programmer” was published in 1999, whereas “Apprenticeship Patterns” wasn’t available until a decade later, in 2009.

If you plan to read both books in 2019 (or onwards), read them in prequel -> sequel order for maximum effect.

Pragmatic books

The book “The Pragmatic Programmer” was not only a groundbreaking work that affected my personal career like no other book since; it also spawned the “Pragmatic Bookshelf”, a publisher that gives authors all over the world the opportunity to write software development books that convey practical knowledge. In software development, rapid change is inevitable, so books about practical knowledge and specific technologies have a half-life measured in months, not years or decades. Nevertheless, the Pragmatic Bookshelf has published at least half a dozen books that I consider timeless classics, like the challenging “Seven Languages in Seven Weeks” by Bruce A. Tate.

A prequel to Refactoring

A more recent publication from the Pragmatic Bookshelf is “Java by Comparison” by Simon Harrer, Jörg Lenhard and Linus Dietz. When I first heard about the book (before the author contacted me), I was intrigued. I categorized it as a “toilet book” with lots of short, largely independent chapters (70 of them, in fact). It fits this category well, so if you are searching for a book suited to brief idle times like a short commute by tram or bus, put it on your list.

But when I read the book, it dawned on me that this is a perfect prequel book. Only that the sequel was published 20 years ago (yes, you’ve read that right). In 1999, the book “Refactoring” by Martin Fowler established an understanding of “better code” that holds true to this day. There was never a second edition – well, until now! Last week, the second edition of “Refactoring” became available. It caters to a younger generation of developers and replaces all the Java code with JavaScript.

But what if you are an aspiring Java developer today? Your first steps in the language will be as clumsy as mine were back in 1997. For me, the first “Refactoring” was perfectly timed, because I had already ironed out most of my quirks and got a kickstart “from journeyman to master” out of it. But what if you are still an apprentice in Java programming? Then you should read “Java by Comparison” as the prequel book to the original “Refactoring”.

The book works by showing you actual Java code and discussing its bad and ugly parts. Then it proposes a better solution in actual code – something many software development books skip, leaving it as an easy exercise for the reader. You will see this pattern again and again: Java code with problems, a review of the code, and a revised version of the same code. Each topic is condensed into two pages, making it a perfect five-minute read (repeated 70 times).

If you read one chapter each morning on your commute to work and another one on your way back, you’ll advance from apprentice level to journeyman level in less than two months. And you can apply the knowledge from each chapter to your daily code right away. Imagine spending your commute with a friendly mentor who shows you actual code (before and after) instead of only dropping wise quotes that tell you what’s wrong but never show you a specific example of “right”.

All topics and chapters in the book are thoroughly researched and carefully edited. You can feel that the authors have explained each improvement over and over again to their students, and you might notice the little hints for further reading. They start small and slow, but speed up and don’t shy away from harder and more complex topics later in the book. You’ll learn about tests, immutability, concurrency and naming (the best part of the book, in my opinion), as well as using code and API comments to your advantage and how not to express conditional logic.

Overall, the book provides a solid groundwork for good code. I don’t necessarily agree with all of its tips and rules, but that is to be expected. It is a collection of guidelines and rules for beginners, and a very good one. Follow these guidelines until you know them by heart; they are the widely accepted common denominator of Java programming, and rightfully so. You can reflect, adapt, improve and iterate based on your own experience later on. But it is important to start that journey from the “green zone”, and this book will show you that green zone inside and out.

My younger self would have benefited greatly had this book been around in 1997. It fills the gap between your first steps and your first dance in code.

It’s a beginner’s world

According to Robert C. Martin, the number of software developers worldwide doubles every five years. So my advice to the 20+ million beginners out there in the next five years is to read this book right before “Refactoring”. And reading “Refactoring” at least once is a pleasure you owe yourself.

Don’t use wrapper

It’s a bad word for a piece of code, and you should feel bad for using it. Here is why:

1. It is easily phonetically confused with “rapper”

Well, this one is actually funny. Really the only redeeming quality. So if someone tells me that they “made a wrapper”, I immediately giggle a bit inside.

2. Wrapping things is a programmer’s job.

As programmers, we are in the business of abstractions, and a function clearly is an abstraction. A function that calls something else wraps that something else. So isn’t everything a wrapper?
Who would say that the following function is a wrapper?

template <class T>
T multiplication_wrapper(T a, T b)
{
  return a * b;
}

It does wrap the multiplication operator, does it not? Of course, the example is contrived, but many people call equally simple functions “wrapper functions”.

3. It is often a bad analogy

When you wrap something, like a present, you first need to unwrap it to actually use it. So in that case, it acts more like a kind of envelope. This is clearly not the case for what most people call wrappers. You could wrap some data in a .zip file – that would make sense! But no one uses it like that.
Another use of the word wrap implies something that goes around something else, forming a fixture of sorts. Like a wraparound baby sling. So I guess this could work for some uses, like a protection layer. Again, it is not used like that.
Finally, there’s wrapping up something, as in finishing something. Well, maybe if you’re wrapping your main function around the rest of your code, you can finish writing your program. A very monolithic approach.

There are plenty of better alternatives

In most cases, what people should rather use is either facade or adapter. Both names convey a lot more meaning than wrapper. A facade is something that wraps code to make the interface nicer. An adapter wraps an interface to integrate with some other piece of code. Both are structural design patterns. Both wrap something. But then again, that could be said for all of the structural design patterns. Or, most code. Except maybe assembler?
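
To make the difference in intent concrete, here is a contrived sketch in the spirit of the multiplication example above (the legacy interface and all names are invented for illustration):

#include <iostream>
#include <string>

// Hypothetical legacy interface we cannot change: it wants a C string.
void legacy_send(char const* data)
{
  std::cout << "sending: " << data << '\n';
}

// Adapter: converts the interface our code uses (std::string)
// into the one the legacy function expects.
void send(std::string const& message)
{
  legacy_send(message.c_str());
}

// Facade: bundles several low-level steps behind one simple call.
void send_greeting(std::string const& name)
{
  send("HELO");
  send("greeting for " + name);
  send("QUIT");
}

int main()
{
  send_greeting("Alice");
}

Both functions wrap legacy_send in some way, but their names tell you why they exist.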

So please, calling something a wrapper is not enough. You might as well just call it a function/object/abstraction. Use adapter, facade, decorator, proxy etc. instead. The why is more important than the what.

The inevitable emergence of domain events

Even if you’ve read the original Domain-Driven Design book by Eric Evans, you’ve probably still not heard about domain events (or DDD domain events), because he didn’t include them in the book. He has talked about them a lot since then, for example in the first 30 minutes of a talk he gave in 2009.

Domain Events

In short, domain events are occurrences of “something that domain experts care about”. You should always be on the lookout for these events, because they are integral parts of the interface between the technical world and the domain world. In your source code, both worlds condense into the same things, so it isn’t easy (or is downright impossible) to tell them apart. But if you are familiar with the concept of “pure fabrication”, you probably know that a single line of code can clearly belong to the technical fabric and still be relevant for the domain. Domain events are one possibility to separate the two worlds again. But you have to listen to your domain experts, and they probably still won’t tell you the full story about what they care about.

Revealed by Refactoring

To underline my point, I want to tell you a story about a software project in a big organization. The software was already in production when my consulting job brought me into contact with the source code. We talked about a specific part of the code that screamed “pure fabrication”, with just a few lines of domain code in between. Our goal was to refactor the code into two parts: one for the domain code and another, bigger one for the technical part. In the technical part, some texts get logged to the logfile, like “item successfully written to the database” and “database connection closed”. It was clearly technical aspects of the code that got logged.

One of the texts had a spelling error in it, and I set out to correct it. A developer stopped me: “Don’t do that! They filter for that exact phrase.” That surprised me. Nothing in the code indicated the relevance of that log statement, least of all the necessity of its typo. And I didn’t know who “they” were, or that the logfiles got searched at all. So I asked a lot of questions and finally understood the situation:

Implicit Domain Events

The developers had implemented the requirements of the domain experts as given in the specification documents. Nothing in there specified the exact text or even the presence of logfile entries. But after the software was done and in production, the business side (including the domain experts) needed to know how many items were added to the system in a given period. And because they needed the information right away and couldn’t wait for the next development cycle, they contacted the operations department (which is separate from the development department) and got copies of the logfiles. They scanned the logfiles with some crude regular expression magic for the relevant entries (like “item written to the database”) and got their result. The question was answered, the problem solved, and the solution even worked a second time – and a third time, and so on. The one-time makeshift script ended up being used permanently and repeatedly. In fact, it ran every hour and scanned for new items, because it became apparent that the business not only needed the statistics, but wanted to start a business process for each new item (an editorial review of sorts) in a timely manner.

Pinned Code

Over the course of a few weeks, a purely technical logfile entry in the source code got pinned down and converted into a crucial domain requirement, without any formal specification or even notification. Nothing in the source code hinted at the importance of this line or its typo. No test, no comment, no code structure. The line looked exactly the same as before, but suddenly played in another league. Every modification of this line or its surrounding code could hamper the business. Performing a well-intended refactoring could be seen as direct sabotage. The code was sacred, but in an unspoken way. It had become a minefield.

Extracting the Domain Event

The whole hostage situation could be resolved by revealing the domain event and making it explicit. Let’s say that we model an “item added” domain event and post it in addition to writing the logfile entry. How we post it is up to the requirements or capabilities of the business department. An HTTP request to another service for every new item is a simple and viable solution. A line of text in a dedicated event log file would be another option. Even an e-mail sent to a human recipient could be discussed. Anything that separates the technical logfile from the business view on the system is better than forbidden code. After the separation, we can refactor the technical parts to our liking and still have tests in place that ensure the domain event gets posted.
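
Here is a minimal sketch of that idea (all names are invented, and the original system’s language does not matter for the point; the dedicated event log file serves as the posting mechanism):

#include <fstream>
#include <iostream>
#include <string>

// The explicit domain event: a fact the business cares about.
struct ItemAdded
{
  std::string item_id;
};

// One possible publisher: a dedicated event log, cleanly separated from
// the technical logfile. An HTTP request to another service would be a
// drop-in alternative.
void publish(ItemAdded const& event)
{
  std::ofstream event_log("domain-events.log", std::ios::app);
  event_log << "ItemAdded " << event.item_id << '\n';
}

void add_item(std::string const& item_id)
{
  // ... technical work: write to the database, log technical details ...
  std::cout << "item successfully written to the database\n";
  publish(ItemAdded{item_id}); // the business-relevant fact, made explicit
}

int main()
{
  add_item("42");
}

Now the technical log line can be reworded or removed at will – the business depends on the explicit event, not on a typo.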

Domain Events are important

These domain events are important parts of your system, because they represent things (or actions) that the business cares about. Even if the business only remembers them after the fact, try to incorporate them into your code in an explicit manner. Not only will you be able to tell domain code and technical code apart easily, but you’ll also get this precious interface between business and tech. Make a list of all your domain events and you’ll know how your system is seen in the domain expert world. Everything else is more or less just details.

What story about implicit domain events comes to your mind? Tell us in a comment or write a blog entry about it. We want to hear from you!

How to approach big tasks

At the heart of software development lies “the system”. The system is always complicated enough that you cannot fully grasp it, and it is built by stacking parts on top of one another that are just a tad too big to be called simple. The life of a software developer is an ongoing series of isolated projects at the threshold of his or her capabilities. We call these projects “epics”, “stories” or just “issues”. The sum of these projects is the system.

Don’t get me wrong – there are tons of issues that just require an hour, a cup of coffee and a few lines of code. This is the green zone of software development. You cannot possibly fail these issues. If you require twice the time, it’s still way before lunchtime. And even if you fail them, a colleague will have your back.
I’m talking about those issues that appear on your to-do list and behave like roadblocks. You dread them from afar, and you know this won’t be smooth sailing for an hour; this will be tough work for several days. It isn’t just an issue, it is an issue for you personally. You are definitely unsure if you can make it.

Typical small project management

How do you approach such a project? It isn’t an issue anymore; as soon as you get emotionally involved, it becomes a project. Even if your emotion is just dread or fear, it is still involvement. Even if your management style is evasion, it is still project management. Sure, you can reassign a few of these icebergs, but they will always be there. You need to learn to navigate and to tackle them – and hitting an iceberg in “frontal collision” style isn’t a good idea.

On closer inspection, every project consists of numerous parts that you already know a solution for – typical one-hour issues – and just a few parts that you cannot estimate because you don’t know how to even start. Many developers in this situation take the path of least resistance and start with the known pieces. It’s obvious, it feels good (you are making good progress, after all!) and it defers failure into an uncertain future (aka next work week). Right now, the project is under control and on its way. We can report 80% finished, because we’ve done all the known parts. How hard can the unknown parts be, anyway? Until they strike hard and wreck your estimates with “unforeseen challenges” and “sudden hardships”. At least that is what you tell your manager.

Risk first!

My preferred way to approach those projects is to reveal the whole map and estimate all parts before I delve into the details. I already know most of the easy parts, but what about the unknown and/or hard parts? I don’t know their solution, so I cannot reliably estimate their size. So I sit down and try to extract the core problem that I don’t know how to solve yet. This is the thing that prohibits an estimate. This is the white area on my map, the “here be dragons” area. If I spend my resources doing all the other work first, I will succeed only until I stand on the border of this area and see the dragon. And I will not have sufficient resources left. My allies (like my manager and colleagues) will grow weary. I will have to fight my hardest battle in the most inconvenient setting.

My approach is to take the risk upfront. Tackle the core problem and fail. Get up and tackle it again. Fail once more. And again. If you succeed with your task, the war is won. Your project will still require work, but it’s the easy kind of work (“just work”). You can estimate the remaining tasks and even if you’ve overspent in your first battle, you reliably know how much more resources you will need.

Fail fast

And if you don’t succeed? Well, then you know it with the least damage done. Your project will enter crisis mode, but at a point when there is still time and resources left. This is the concept of “fail fast”. To be able to fail fast, you need to choose the “risk first” approach of task selection. And to tackle the risk first, you need to be able to quantify the “risk” of your upcoming work.

Assessing risk

There are whole books about risk assessment that are interesting and helpful, but as a starter, you only need to listen to your stomach. If your stomach tells you that you are unsure about a specific part of your project, put that part on the “risky” list. If you don’t have a reliable stomach, try to estimate the part’s size. Play the estimation game with your colleagues. Planning poker, for example, is a great tool to uncover uncertainty, because the estimates will differ. Just remember: risk isn’t correlated with size. Just because a part is big doesn’t mean it is risky, too. Your crucial part can maybe be developed in an hour or two, given a spark of inspiration and a cup of coffee.

Failing late means you’re out of options. Failing fast means you’ve eliminated an option and moved on.

Bringing your Grails app from 2.4 to 3.3

Updating to a new framework version often requires a lot of work and investigation into how to fix the problems that arise. Usually there are upgrade guides that take you most of the way and reduce upgrading to a grind.

This is also true for Grails and our upgrade experience with it. Still, there are often parts where you have to invest extra work and creativity. The current upgrade of our application from 2.4.5 to 3.3.8 is no exception:

The grind

The major changes and upgrade notes are part of the documentation, so I will only mention them briefly:

  • Switch to the Gradle build system
  • YAML as the main configuration format
  • Migration from filters to interceptors
  • A new testing framework (partly optional, because you can still use the old mixin framework with a plugin)
  • Package name changes
  • Former core features are now available as plugins, like GSPs, data sources and GORM
  • Functional tests need to use Spock+Geb, or you will face weird problems and extra work (we had Selenium tests using selenium-server before)
  • Integration tests work differently, so work is needed to migrate them
  • Logging uses Logback now
  • Entities often need an @Entity annotation
  • Some files need to move to new locations

The tricky stuff

  • A service named CounterService conflicts with Spring Boot autowiring, so we had to rename it
  • Our TagLib tests using JUnit 4 were failing with obscure errors; porting them to Spock fixed them
  • We have so many dependencies that running the application with gradle:bootRun fails with “CreateProcess error=206: The filename or extension is too long”. Fortunately, adding grails { pathingJar = true } to build.gradle fixes the issue
  • Environment variables for gradle:bootRun are swallowed if not prefixed with “grails.”. We use environment variables to customize running the application on the dev machines

The hard parts

The most painful part was that two central plugins we are using are not available anymore: Shiro and Searchable.

Shiro

For Shiro, there are some initial ports that work well for our needs, so the challenge was mostly finding the most fitting of the forks on GitHub. We went with the fork by Alin Pandichi and forked it ourselves to upgrade some version definitions.

Searchable becomes ElasticSearch

The real odyssey began when looking for a replacement for the abandoned Searchable plugin. Fortunately, there is the compelling Elasticsearch plugin, which uses almost the same API as the Searchable plugin:

The plugin focus on exposing Grails domain classes for the moment. It highly takes the existing Searchable Plugin as reference for its syntax and behaviour.

Unfortunately, we were unable to get it to work with our project despite trying many different versions, so we decided to fork the plugin and fix it for ourselves. The main problems were:

  • Essentially, it does not work properly with Hibernate as a data store, because it chokes on the Javassist proxies Hibernate often creates for domain objects.
  • An easy-to-fix concurrency issue
  • Converters that are not flexible enough

After a lot of debugging, a couple of fixes, and a new feature that allows using a Spring bean as a converter, we had search working smoothly and better than ever.

Wrapping it all up

The upgrade of our application to the newest incarnation of Grails was a rocky ride and took us quite some time.

On the other hand, the framework got a lot better. Gradle in particular is much easier to manage than the previous build system.

So we are looking forward to a better and more robust development experience in the future, and we hope for less revolutionary releases and easier upgrades.

How to teach C++

In the closing Keynote of this year’s Meeting C++, Nicolai Josuttis remarked how hard it can be to teach C++ with its ever expanding complexity. His example was teaching rookies about initialization in C++, i.e. whether to use assignment =, parens () or curly braces {}. He also asked for more application-level programmers to participate.

Well, I am an application programmer, and I also have experience with teaching C++. Last year I held a C++ introductory course for experienced C programmers who mostly had never used C++ before. From my experience, I can completely agree with what Nico had to say about teaching C++. The language and its subtlety can be truly overwhelming.

Most of the complexity in C++ boils down to tuning your code for optimal performance. We cannot just leave that out, can we? After all, the sole reason to use C++ is performance, right?

My approach

Performance is one key reason for using C++, no doubt about it. There are a few more, but let us not get distracted. What is even more important is the potential to optimize for performance. But you can do that later! It is actually quite crazy what performance crimes you can get away with in C++ and still have something pretty fast overall, especially with move semantics and improved RVO.

Given that, I picked a simple subset to start with:

  • Pass by value only; do not use pointers or references.
  • Structure your programs around simple data-only structs and functions that transform them.
  • Make good use of the data structures and algorithms from std.

Believe me, you too can write pretty useful programs this way. This is not too far from a data-oriented style anyway.
So, yes, we can actually leave out the performance-specific parts for quite some time, while still making sure our programs can be optimized eventually.
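
To give you an idea, here is a small example in exactly that style (my own sketch, not from the course material): a data-only struct, functions that take and return values, and an algorithm from std doing the interesting work.

#include <algorithm>
#include <iostream>
#include <string>
#include <vector>

// A simple data-only struct: no methods, no pointers, no inheritance.
struct Employee
{
  std::string name;
  double salary;
};

// A function transforming data: takes a copy, returns a new value.
std::vector<Employee> apply_raise(std::vector<Employee> employees, double factor)
{
  for (auto& employee : employees)
    employee.salary *= factor;
  return employees;
}

int main()
{
  std::vector<Employee> staff{{"Ada", 5000.0}, {"Grace", 5200.0}};
  auto const raised = apply_raise(staff, 1.03); // 'staff' stays untouched

  // An algorithm from std instead of a hand-written loop:
  auto const top = std::max_element(raised.begin(), raised.end(),
      [](Employee const& a, Employee const& b) { return a.salary < b.salary; });
  std::cout << "top earner: " << top->name << '\n';
}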

And from there?

You can gradually start introducing references and move semantics when measurements show that copying affects performance. This way, you can introduce tooling like a profiler and see the effects of passing things around by reference in a language where everything is passed by value by default. These things will start to make sense in context.
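
For illustration, the optimization is often just a signature change once the profiler has pointed at a copy (a contrived example; assume the copy actually showed up in a measurement):

#include <string>
#include <vector>

// Before: the simple-subset version takes its own copy of the vector.
std::size_t total_length_copying(std::vector<std::string> words)
{
  std::size_t sum = 0;
  for (auto const& word : words)
    sum += word.size();
  return sum;
}

// After the profiler shows the copy matters: same body, same behavior,
// but the parameter is now a const reference and no copy is made.
std::size_t total_length(std::vector<std::string> const& words)
{
  std::size_t sum = 0;
  for (auto const& word : words)
    sum += word.size();
  return sum;
}

int main()
{
  std::vector<std::string> const words{"profile", "first", "optimize", "later"};
  return total_length(words) == total_length_copying(words) ? 0 : 1;
}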

The arguably better approach to move semantics is, of course, types that prevent you from copying but allow moving. Most resources with side effects are like this: file handles, locks, threads. You will need to introduce RAII for this to make sense, e.g. writing ctors and dtors.
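
A minimal sketch of such a move-only type could be a simplified file handle (my example; error handling omitted for brevity):

#include <cstdio>
#include <utility>

// Owns a FILE* for its whole lifetime: acquired in the ctor, released in the dtor.
class File
{
public:
  explicit File(char const* path) : handle_(std::fopen(path, "r")) {}
  ~File() { if (handle_) std::fclose(handle_); }

  // No copying: two owners would close the same handle twice.
  File(File const&) = delete;
  File& operator=(File const&) = delete;

  // Moving transfers ownership and leaves the source empty.
  File(File&& other) noexcept : handle_(std::exchange(other.handle_, nullptr)) {}
  File& operator=(File&& other) noexcept
  {
    if (this != &other)
    {
      if (handle_) std::fclose(handle_);
      handle_ = std::exchange(other.handle_, nullptr);
    }
    return *this;
  }

private:
  std::FILE* handle_;
};

int main()
{
  File config("config.txt");      // acquired here
  File owner = std::move(config); // ownership moves, no copy possible
}                                 // released exactly once, by 'owner'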

But you still do not need to write templates, use virtual-function polymorphism, or even pointers. But when you get to those, you will have a good context to use them.

Have you taught C++ and have some experience to share? I would like to hear about it!

The sorry state of Grails (Plugins)

We have been developing and maintaining a complex web application on Grails since summer of 2008. By then Grails had passed the 1.0 release milestone and was really hot. A good 10 years later the application is still in use and we are trying to upgrade from Grails 2.4 to 3.3.

Upgrading Grails – a rough ride

Similar to past upgrade experiences, the ride is not very smooth. Besides the major changes – like the much-welcomed switch to the Gradle build system, interceptors instead of filters, and streamlined configuration – there is again a host of more subtle changes. The biggest problem for us, though, is the plugin situation.

It’s the plugins

In the past we had tough breaks, like the selenium plugin being abandoned in favor of the much better Geb for functional testing. That cost us a lot of work and many lost, not yet rewritten functional tests.

This time it seems especially hard, because two of our central plugins are not readily available anymore:

  1. Apache Shiro Plugin
  2. Compass-based Searchable Plugin

1. Shiro authentication

There is still no official release of the Shiro plugin for Grails 3.x. After some searching and researching the initial port on GitHub, we decided to fork and maintain the most current forked version ourselves and try to work with it. Fortunately, it was relatively easy to integrate and to update some dependencies. Our authentication and authorization work at least as well as before, and we do not face additional problems. Working with interceptors feels quite good, too.

2. Search

The situation is harder with search. Compass and the Searchable plugin are dead – plain and simple. The replacement for Grails is the Elasticsearch plugin, which mostly adopted the API of the Searchable plugin. Getting it to work is not that easy, though. There are different plugin versions depending on the Grails 3 version you are targeting. Each plugin version targets a specific Elasticsearch server version, and so on. Oftentimes (like in the default configuration) you will need a matching mapper-attachment plugin that is not available on Maven in newer versions. This is mentioned somewhere in the midst of the plugin documentation.

Furthermore, the plugin itself has some problems with Hibernate proxies and concurrency, so here we had to mess around with the plugin code once more. Once we have everything working like before, we will try to get our patches upstream.

Marching forward

The upgrade from 2.x to 3.x is the biggest (and best) step of Grails in the right direction. On the downside, it places a lot of burden on application and plugin developers. That, in turn, further increases the cost of maintaining proven applications.

Right now we are close to a Grails 3.3 version of our application but have invested considerable effort into this upgrade.

Our current recommendation and practice is to not start new web applications based on the Grails framework, because there have been too many breaking changes and the maintenance cost is high. But we are keeping a close eye on Grails, because the increased modularization and new options like the grails-react-profile may keep Grails interesting in the future.