Integrating conan, CMake and Jenkins

In my last posts on conan, I explained how to start migrating your project to use a few simple conan libraries and then how to integrate a somewhat more complicated library with custom build steps.

Of course, you still want your library in CI. We previously advocated simply adding some dependencies to your source tree, but in other cases, we provisioned our build-systems with the right libraries on a system-level (alternatively, using docker). Now using conan, this is all totally different – we want to avoid setting up too many dependencies on our build-system. The fewer dependencies they have, the less likely they will accidentally be used during compilation. This is crucial to implement portability of your artifacts.

Setting up the build-systems

The build systems still have to be provisioned. You will at least need conan and your compiler-suite installed. Whether to install CMake is a point of contention – since the CMake-Plugin for Jenkins can do that.

Setting up the build job

The first thing you usually need is to configure your remotes properly. One way to do this is to use conan config install command, which can synchronize remotes (or the whole of the conan config) from either a folder, a zip file or a git repository. Since I like to have stuff readable in plain text in my repository, I opt to store my remotes in a specific folder. Create a new folder in your repository. I use ci/conan_config in this example. In it, place a remotes.txt like this:

bincrafters https://api.bintray.com/conan/bincrafters/public-conan True
conan-center https://conan.bintray.com True

Note that conan needs a whole folder, you cannot read just this file. Your first command should then be to install these remotes:

conan config install ci/conan_config

Jenkins’ CMake for conan

The next step prepares for installing our dependencies. Depending on whether you’re building some of those dependencies (the --build option), you might want to have CMake available for conan to call. This is a problem when using the Jenkins CMake Plugin, because that only gives you cmake for its specific build steps, while conan simply uses the cmake executable by default. If you’re provisioning your build-systems with conan or not building any dependencies, you can skip this step.
One way to give conan access to the Jenkins CMake installation is to run a small CMake script via a “CMake/CPack/CTest execution” step and have it configure conan appropriatly. Create a file ci/configure_for_conan.cmake:

execute_process(COMMAND conan config set general.conan_cmake_program=\"${CMAKE_COMMAND}\")

Create a new “CMake/CPack/CTest execution” step with tool “CMake” and arguments “-P ci/configure_for_conan.cmake”. This will setup conan with the given cmake installation.

Install dependencies and build

Next run the conan install command:

mkdir build && cd build
conan install .. --build missing

After that, you’re ready to invoke cmake and the build tool with an additional “CMake Build” step. The build should now be up and running. But who am I kidding, the build is always red on first try 😉

Using protobuf with conan and CMake

In my last post, I showed how I got my feet wet while migrating the dependencies of my existing code-base to conan. The first major hurdle I saw coming when I started was adding something with a “special” build step, e.g. something like source-preprocessing. In my case, this was protobuf, where a special build-step converts .proto files to sources and headers.

In my previous solution, my devenv build scripts would install the protobuf converter binary to my devenv’s bin/ folder, which I then used to run my preprocessing. At first, it was not obvious how to do this with conan. It turns out that the lovely people and bincrafters made this pretty comfortable. conan_basic_setup() will add all required package paths to your CMAKE_MODULE_PATH, which you can use to include() some bundled CMake scripts that will either let you execute the protobuf-compiler via a target or run protobuf_generate to automagically handle the preprocessing. It’s probably worth noting, that this really depends on how the package is made. Conan does not really have an official way on how to handle this.

Let’s start with some sample code – Person.proto, like the sample from the protobuf website:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}

And some sample code that uses it:

#include "Person.pb.h"

int main(int argn, char** argv)
{
  Person message;
  message.set_name("Hello Protobuf");
  std::cout << message.name() << std::endl;
}

Again, we’re using the bincrafters repository for our dependencies in a conanfile.txt:

[requires]
protobuf/3.6.1@bincrafters/stable
protoc_installer/3.6.1@bincrafters/stable

[options]

[generators]
cmake

Now we just need to wire it all up in the CMakeLists.txt

cmake_minimum_required(VERSION 3.0)
project(ProtobufTest)

include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake)
conan_basic_setup(TARGETS KEEP_RPATHS)

# This loads the cmake/protoc-config.cmake file
# from the protoc_installer dependency
include(cmake/protoc-config)

set(TARGET_NAME ProtobufSample)

# Just add the .proto files to the target
add_executable(${TARGET_NAME}
  Person.proto
  ProtobufTest.cpp
)

# Let this function to the magic
protobuf_generate(TARGET ${TARGET_NAME})

# Need to use protobuf, of course
target_link_libraries(${TARGET_NAME}
  PUBLIC CONAN_PKG::protobuf
)

# Make sure we can find the generated headers
target_include_directories(${TARGET_NAME}
  PUBLIC ${CMAKE_CURRENT_BINARY_DIR}
)

There you have it! Pretty neat, and all without a brittle find_package call.

Migrating an existing C++ codebase to conan

This is a bit of a battle report of migrating the dependencies in my C++ projects to use the conan package manager.
In the past weeks I have started to use conan in half a dozen both work and personal projects. Here’s my experiences so far.

Before

The first real project I started with was my personal game project. The “before” setup used a mixture if techniques to handle dependencies and uses CMake to do most of the heavy lifting.
Most dependencies reside in the “devenv”, which is a separate CMake project that I use to build and bundle the dependencies in a specific installation folder. It uses ExternalProject_Add for most parts (e.g. Boost, SDL, Lua, curl and OpenSSL), add_subdirectory for a few others (pugixml and lz4) and just install(FILES...) for a few header only libs like JSON for Modern C++, Catch2 and spdlog. It should be noted that there are relatively few interdependencies between the projects in there.
Because it is more convenient to update, I keep a few dependencies that I control myself directly in the source tree, either as git externals or just copies of the source files.
I try to keep usage of system dependencies to a minimum so that the resulting binary is more portable to the average gamer who does not want to know about libraries and dependencies and such nonsense. This setup has been has been mostly painless and working for my three platforms Windows, Linux and Mac – at least as long as I did not try to change it significantly.

Baby steps

Since not all my dependencies are available on conan and small iterations are usually more successful, I decided to proceed by changing only a single dependency to conan. For this dependency, it’s a good idea to pick something that does not have many compile-time options and is more or less platform agnostic. So I opted for boost over, e.g. SDL or wxWidgets. Boost was also one of the most painful dependencies to build, if only for the insane amount of files it produces and the time it takes to copy those ten-thousands of files to the install location.

Getting started..

There are currently two popular variants of boost available through conan. The “normal” variant on conan’s main repository/remote “conan-center” and a modular version that splits boost into its component libraries on the bincrafters remote, e.g. Boost.Filesystem. The modular version is more appealing conceptually, and I also had a better time getting it to work in my first tests, so I picked that. I did a quick grep for #include <boost/ through my code for an initial guess which boost libraries I needed to get and created a corresponding conanfile.txt in my project root.

[requires]
boost_filesystem/1.69.0@bincrafters/stable
boost_math/1.69.0@bincrafters/stable
boost_random/1.69.0@bincrafters/stable
boost_property_tree/1.69.0@bincrafters/stable
boost_assign/1.69.0@bincrafters/stable
boost_heap/1.69.0@bincrafters/stable
boost_optional/1.69.0@bincrafters/stable
boost_program_options/1.69.0@bincrafters/stable
boost_iostreams/1.69.0@bincrafters/stable
boost_system/1.69.0@bincrafters/stable

[options]
boost:shared=False

[generators]
cmake

Now conan plays really nice with “single configuration generators” like the new CMake/Ninja support in VS2017 and onward. Basically, just cd into your build dir and call something like conan install -s build_type=Debug -s build_type=x86 whenever you want to update dependencies. More info can be found in the official documentation. The workflow for CLion is essentially the same.

Using it in your build

After the last command, conan will download (or build) the dependencies and generate a file with all the corresponding paths.
To use it, include it from cmake like this:

include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake)
conan_basic_setup(TARGETS KEEP_RPATHS)

It will then provide targets for all the requested boost libraries that you can link to like this:

target_link_libraries(myTarget
  PUBLIC CONAN_PKG::boost_filesystem
)

I wanted to make sure that the compiler build using the new boost files and not the old ones. Because I have a generic include into my devenv that was still going to be in my compilers include-paths for all the other dependencies, so I just renamed boost’s header include folder on disc. After my first successful compile I felt confident enough to delete them.

First problems

There was one major problem: some of my in-source dependencies had their own claim on using boost via passed CMake variables, Boost_LIBRARY_DIRS and Boost_INCLUDE_DIR. I adapted their CMakeLists.txt to allow for injecting appropriate targets instead. Not the cleanest solution, but it got my builds green again fast.

There’s a still a lot to cover on this: The other platforms had their own quirks and I migrated way more than just this first project. Also, there is still ways to go for a full migration with my game project. But more on that in my next blog post…

Java’s OptionalInt et al. versus Optional<T>

In Java 8 the Optional type was introduced to avoid the (ab)use of nullable types and null to indicate the absence of a value. It allows the programmer to clearly indicate whether the potential absence of a value is intentional or accidental.

Such option types, sometimes also called Maybe types, have been established in other programming languages, mostly in statically typed functional programming languages like ML and derivatives, but are also emerging in more mainstream languages like Swift.

Java’s Optional type is, to put it mildly, not the most sophisticated implementation of this concept, mostly due to limitations of Java’s existing type system. The Optional type is nullable itself, it’s not a sum type, so it has to rely on runtime exceptions to signal invalid access of a non-existent value, but it’s still useful. Static analysers, usually built into IDEs, can do what the compiler doesn’t and warn if the value is accessed without checking for its presence first.

The Optional type suffers from another limitation of Java’s type system: the fact that primitive types like int, long, double etc. and reference types, derived from Object, aren’t unified in a single type hierarchy. Related to that, primitive types can’t be used as generic type parameters in Java. The language works around this with additional boxed types like Integer, Long and Double for each primitive type.

When the stream API and the Optional type were introduced in Java 8, those primitive types were once again treated with special types: there’s not just Stream<T>, but also IntStream, LongStream, DoubleStream, there’s not just Optional<T>, but also OptionalInt, OptionalLong, OptionalDouble, the same for consumers, suppliers, predicates and functions.

This was done to avoid boxing and unboxing, but also makes it unpleasant to use. What’s worse is that the Optional variants for the primitive types don’t offer the same functionality as Optional<T>: they are lacking the filter, map and flatMap methods as well as the ofNullable factory method. All in all they are less useful than the real Optional, and there’s no convenient way to convert back and forth between, for example, an OptionalInt and Optional<Integer>.

The above mentioned annoyances are the reason why we prefer the generic variant over the special ones for the primitive types by default. Hopefully a future Java release will mitigate this dichotomy between those types, at least by adding the missing methods, but we are not aware of any plans for this yet.

Using WPF-Toolkits CheckComboBox with Data-Binding

Xceed’s WPF Toolkit is a popular extension to the standard components offered by Microsoft’s WPF. One fancy control that I have been using lately is the CheckComboBox, which is a ComboBox that show’s a list of items and checkboxes when opened and a list of selected items when closed. For example, it is great for selecting filtering options in smaller sets.
However, it took me a little bit to get it all up and running with DataBinding. I am going to walk you throught it. For reference, I’m starting with a .NET 4.6.1 WPF App in Visual Studio 2017.

First you have to install Extended.Wpf.Toolkit, which I am doing via VS’s built-in package manager. To actually use the control, I am adding an XML namespace into my MainWindow’s XAML:

xmlns:xctk="http://schemas.xceed.com/wpf/xaml/toolkit"

Then I’m adding the control in a simple StackPanel, while already adding DataBindings:

<xctk:CheckComboBox
  ItemsSource="{Binding Path=Options}"
  DisplayMemberPath="Name"
  SelectedMemberPath="Selected"/>

This means that my control will look at a collection named “Options” in my view-model, using it’s elements “Name” property for display and its “Selected” property for the checkmark. If you run the program at this point, you should be able to see an empty CheckComboBox, albeit badly layouted.

Now it’s time to create the view model. Let’s start with a small class-let to represent our items:

class Item
{
  public string Name { get; set; }
  public bool Selected { get; set; }
}

As you can see, the names match what we set for DisplayMemberPath and SelectedMemberPath in the XAML. Now for the ViewModel class:

class ViewModel
{
  public ViewModel()
  {
    var languages = new string[]
    {
      "C", "C#", "C++", "D", "Java",
      "Rust", "Python", "ES6"
    };
    
    Options = new List<Item>();
    foreach (var language in languages)
    {
      Options.Add(new Item {
          Name = language,
          Selected = true });
    }
  }
  
  public List<Item> Options { get; set; }
}

If you run it at this point, you should be able to see an all-selected list of programming languages in the drop-down. But it is lacking a crucial detail: it is not observable, meaning the component will not be notified if the data in the view-model is changed by other means. To make sure that it can, the Item list and the Item have to implement the INotifyPropertyChanged interface. To do that, you have to fire a specific event whenever a property changes with the name of that property in it.

Let’s do that for the Item first:

class Item : INotifyPropertyChanged
{
  private bool _selected;
  private string _name;

  public string Name
  {
      get => _name; set
      {
        _name = value;
        EmitChange(nameof(Name));
      }
  }
  public bool Selected
  {
    get => _selected; set
    {
      _selected = value;
      EmitChange(nameof(Selected));
    }
  }

  private void EmitChange(params string[] names)
  {
    if (PropertyChanged == null)
      return;
    foreach (var name in names)
      PropertyChanged(this,
        new PropertyChangedEventArgs(name));
  }

 public event PropertyChangedEventHandler
                PropertyChanged;
}

That got bigger! But it’s not a lot of meat really. For the Item list, we can just use ObservableCollection instead of List:

public ObservableCollection<Item> Options {get; set;}

That’s it. Two-way data binding set-up for the item collection, and you can now change the view-model and have the component react to it, but also react to changes from the component by hooking into the property-set functions.
Now you could also implement INotifyPropertyChanged for the ViewModel, if you intend to swap in new ObserableCollections, but that is not necessary for this example.

Configurable React backend in deployment

In my last post I explained how to make you React App configurable with the backend endpoint as an example. I did not make clear that the depicted approach is build-time configurability.

If you want deploy- or runtime-time configurability the most simple approach is to provide global variables in your index.html like so:

<!DOCTYPE html>
<html lang="en">
  <head>
    <script>
      window.REACT_APP_BACKEND_API_BASE_URL= 'http://some.other.server:5000';
      window.APPLICATION_CONFIGURATION = {
        settingA: 'aValue',
        anotherSetting: 'anotherValue'
      };
    </script>
  </head>
  <body>
    <noscript>
      You need to enable JavaScript to run this app.
    </noscript>
    <div id="root"></div>
  </body>
</html>

We use (or activate) this configuration similar to the build-time approach with .env files:

// If we have a differing backend configured, replace the global fetch()
// instead of process.env.REACT_APP_BACKEND_API_BASE_URL
// we now use window.REACT_APP_BACKEND_API_BASE_URL
if (window.REACT_APP_BACKEND_API_BASE_URL !== undefined
    && window.REACT_APP_BACKEND_API_BASE_URL !== '') {
  applyBaseUrlToFetch(window.REACT_APP_BACKEND_API_BASE_URL);
}

That way an automated process or a human administrator can deploy the same artifact to different servers with customized settings. This approach is briefly explained in the create-react-app documentation. In addition a server-side application could replace placeholders dynamically in the html file, e.g. with data from a configuration database.

I personally like this approach because it allows us to use the same build artifact for internal testing, staging systems and production at the clients site. It also allows the client to make some basic configuration themselves.

Zero, Maybe, One and Many

Implementing a data model in a way that it supports you is hard. By following the rule of thumb about associations presented here, the task gets a bit easier.

In object-oriented programming languages like Java, the compiler will improve its helpfulness if the application provides a rich type system or strong domain model. There is a whole field of study for type systems, called type theory, that is fascinating and helpful, but does not provide easy rules to follow for beginning software developers. This blog entry proposes a simple set of rules for a specific part of type systems (associations among types) that can be applied to a domain model as a rule of thumb. The resulting model will empower the compiler and the code completion of the IDE to help the developer with writing correct code.

Data knows data

Even the most basic domain models separate the data in multiple entities (often classes). For example, an employee class has an internal id, but knows about a person class and a salary class that are associated with this employee. This “knowing about” is modeled as a reference to a person object and a salary object. In this case, the reference is probably of the type “one”: The employee object knows about one person object and one salary object. This is the usual way to structure data.

If you learn about the UML notation of data models, you’ll see that associations (aka references between objects) are given great emphasis. There are several different kinds of associations that can be customized by multiplicities and such. It seems that knowing other data is a complex issue for types. It doesn’t have to be this way. Here are four ways of knowing other data that are sufficient for nearly every use case: Zero, Maybe, One and Many.

Four basic types of association

Zero: Knowing zero elements of something different is the usual default case: Your employee object probably doesn’t need to know about the payroll object of the company and therefore has no association to it. This means that there is no member variable of the type Payroll in the class Employee. No developer ever modeled a “zero” association by declaring a member variable and setting it to null. This would be ridiculous. We just omit the member variable and are done. Knowing zero elements of something is easy.
One: Yes, I’ve omitted Maybe at the moment. I’ll come back to it. Knowing one element of something is also not hard: You declare a member variable of the type, give it a good name (that’s the hard part!) and ensure that every instance of your class (every object) has a valid reference to an object of something’s type. If you call methods on this reference, you call methods on the object you know. As long as you live, the other object cannot disappear. Knowing one element of something is a long-lasting relationship.
Maybe: Sometimes, you want to know an element of something that isn’t there yet or you knew an element of something once, but it is gone. You know “maybe one” element of something. These associations are typically programmed in a cumbersome way by many developers. Instead of embedding the “maybe” aspect in the type system and giving the compiler a chance to help, it is burdened solely onto the developer’s shoulders by implementing the “maybe” like a “one” with the added possibility of a null reference if the element isn’t there. A direct result of this approach are null-checks in the code or NullPointerExceptions at places without such checks. One possibility to elevate the “maybeness” into the type system is to implement the association with a Maybe or Optional type. Instead of referencing a Salary directly that might be null if an employee isn’t salaried anymore, the Employee class references an Optional<Salary> object. This object might “contain” a salary or it might not. With a few adjustments to the conditional flow of the code, this distinction between “something is there” and “something is not there” doesn’t matter anymore. If the code is free from implicit Optional types (references that can be null), a whole category of bugs disappears and the code is freed from manually programmed type system checks. Probably knowing one element is the type of assocation that requires some thought and is often done on the wrong level.
Many: As soon as you want to know more than one element of something, you fall into the “many” category. Many-associations are not so easy to handle, because there are so many possibilities to express them. The basic types are arrays or lists. My recommendation is to use lists whenever feasible and only resort to arrays if it is necessary, because arrays are fixed-length and have the same problem of maybe-null-references: An array index might have been written yet or not. If you refrain from storing null references into lists, they express their filling level a lot clearer than arrays. And given advanced features like iterators, there isn’t even a need to ask for the filling level. An interesting observation is that the list-based many-association can also serve as a zero-, maybe- or one-association. It is possible to replace all other types of association with lists. You probably won’t want to do this, because with the maximization of multiplicity flexibility comes more complexity and reduced readability of the code. You should strive to minimize complexity. Only add many-associations if you really need them. Even just replacing a “maybe” (Optional) with a “many” (List) is a source of much unwanted code and uncertainty.

Advanced types of association

Of course, there are many more types of association that you’ll eventually need. A good example is the qualified association, often implemented by a Map/Dictionary that translates from the qualifier type to the qualified type. But they are rare in comparison to the four basic types.

Summary

If you get your basic associations right, your domain model will help your compiler and IDE to support and guide you. This is an upfront investment that pays off manyfold over the course of the project and eliminates the burden of attention to detail when it comes to accidental complexity like null pointers. Your project’s domain probably doesn’t contain null pointers, but the concepts of knowing zero, maybe, one and many.

Cache configuration with WildFly, Infinispan, CDI and JCache

This post is about a specific problem I encountered using the WildFly application server in combination with the Infinispan cache module, CDI and the JCache API. If you don’t use this combination of technologies this post is probably not relevant or interesting to you, but I hope it will help someone who encounters the same problem.

The problem

After upgrading an application from WildFly 10 to WildFly 13 it became apparent that the settings for the Infinispan caches from the WildFly configuration file are no longer applied to the caches used by the application.

The cache settings in the WildFly configuration specify a cache container, several local caches and the object memory sizes and expiration lifespans of these caches:

<subsystem xmlns="urn:jboss:domain:infinispan:6.0">
  <cache-container name="myapp" default-cache="default" module="org.wildfly.clustering.web.infinispan" statistics-enabled="true">
    <local-cache name="default" statistics-enabled="true">
      <object-memory size="10000"/>
      <expiration lifespan="86400000"/>
    </local-cache>
    <local-cache name="foo" statistics-enabled="true">
      <object-memory size="10000"/>
      <expiration lifespan="600000"/>
    </local-cache>
  </cache-container>
</subsystem>

The cache manager is injected via CDI resource injection in a Config class as the default cache manager:

class Config {
    @Produces
    @Resource(lookup = "java:jboss/infinispan/container/myapp")
    private EmbeddedCacheManager defaultCacheManager;
}

The caches are used via the @CacheResult annotation from the JCache API (JSR-107):

class FooService {
    @CacheResult(cacheName = "foo")
    public List<Foo> getFoo(String query) {
        // ...
    }
}

With this setup the application worked, the service results were cached, but the cache settings from the configuration file were not applied, as could be seen by inspecting the MBeans of the caches via JConsole. Instead the caches used a default configuration with an expiration lifespan of -1 (never), even though they were assigned to the cache container “myapp” as configured.

The solution

One particular answer to a similar problem description on StackOverflow was helpful in finding the solution. Each cache must be injected once via CDI resource lookup as well:

import org.infinispan.Cache;

class Config {
    @Resource(lookup = "java:jboss/infinispan/cache/myapp/foo")
    private Cache<String, Object> fooCache;

    // ...
}

The format of the JNDI path is:

"java:jboss/infinispan/cache/${cacheContainerName}/${cacheName}"

The property itself will be unused, but the @CacheResult annotation will now use the cache with the correct configuration.

Making the backend of your React App configurable

Nowadays, the frontend and backend of a web application usually are separate parts – oftentimes implemented using different technologies – communicating with each other using HTTP or websockets. For simplicity and smaller deployments they are hostet on the same web server. There are several reasons to deploy them on different servers like load distribution, security, different environments running the same frontend with differing backends and so on.

To allow separate deployments without changing the frontend code per deployment we need to make the backend transparently configurable. Fortunately, this is relatively easy for frontend written in React and set up with create-react-app. To make this fully transparent for your frontend code we need to

Make the backend URL configurable
Replace the fetch() function to use the configured backend
Activate the setup at the start of our app

Configuring a React App

Create-react-app provides a configuration mechanism with custom environment variables using .env-files. We can simply provide different env-files for our environments where we can configure different aspects of our application. In our use case this is the backend URL.

// The base url of the backend API. Add path prefix if the API does not run at the server root.
REACT_APP_BACKEND_API_BASE_URL=http://some.other.server:5000

Inside our React App we can reference the configured values using {process.env.REACT_APP_BACKEND_API_BASE_URL}.

Making the use of our configured backend transparent

In a modern JavaScript app the main mean to communicate with the backend is the fetch()-API. To make the use of our configured backend transparent we can replace the global fetch()-function with our version like so:

// remember the original fetch-function to delegate to
const originalFetch = global.fetch;

export const applyBaseUrlToFetch = (baseUrl) =&gt; {
  // replace the global fetch() with our version where we prefix the given URL with a baseUrl
  global.fetch = (url, options) =&gt; {
    const finalUrl = baseUrl + url;
    return originalFetch(finalUrl, options);
  };
};

That way all of our fetch() calls are re-routed to the configured backend.

Activating our fetch()-customization

Now that we have all the pieces of our infrastructure in place we need to activate the changes to fetch on application startup. So we add code like below to our index.js:

// If we have a differing backend configured, replace the global fetch()
if (process.env.REACT_APP_BACKEND_API_BASE_URL !== undefined &amp;&amp; process.env.REACT_APP_BACKEND_API_BASE_URL !== '') {
  applyBaseUrlToFetch(process.env.REACT_APP_BACKEND_API_BASE_URL);
}

Now all our calls to a relative URL will be prefixed with a configurable base and that way different backends can be used with the same application code.

Caveats

The above approach works nicely if you have exactly one backend for your app and do not fetch from other sources. If you do, you may want to expose the original fetch function as something like fetchExternal() to be able to explicitly fetch from other sources.

In addition, if frontend and backend reside on different servers/sites using differring DNS-names you will have to configure CORS for your backends or your browser will refuse to make the requests!

Did Java just flip the switch?

Switch statements in Java used to be ugly, clunky and error-prone. That is about to change considerably and here’s how.

Twenty years ago, a groundbreaking book was published: Refactoring by Martin Fowler. In this book, we learnt about 72 ways to improve our code and, even more important, over 20 unique signs of bad code, so-called code smells. Among these code smells were obvious ones like “Duplicated Code” and “Long Parameter List” and more specific ones like “Temporary Field” and “Switch Statements”.

Switch is the main offender

What is wrong with a Switch Statement, you ask? Well, nearly everything. Let’s review three flaws of a classic switch statement in Java on different levels:

Syntax: The syntax of a switch is clunky at best. Whoever thought that “fall-through” should be the default behaviour and subsequently forced millions of developers to “break” their cases is responsible for so much unnecessary extra work. Think about how a “fallthrough” statement instead of a “break” could have changed the world.
Code Design: Each switch statement is an inherent complexity hog. At least if you measure classic complexity metrics like McCabe or cyclomatic complexity. Anything but the smallest switches results in complexity counts that are through the roof. And a small switch is just a syntactically bloated if/else.
Programming Paradigm: The reason Martin Fowler advocated against using switch statements is because the alternative, using polymorphism to implicitly switch over the object type, wasn’t common knowledge 20 years ago. Switch statements were the cornerstones of explicit conditional logic and were prone to repetition, leading to duplicated code – another code smell.

There are more things wrong with a classic switch statement, but the logic is clear: Take away the culmilations of explicit conditional logic and developers will adjust their approach and adopt more diverse paradigms. If you think this through, you can also argue that taking away the “else” keyword (as the Object Calisthenics do) or even the “if” statement (as advocated by the anti-if campaign) leads to even more diversity and progressive programming.

Switch in rehabilitation?

For me, a switch statement was nearly always the wrong choice for a given problem. And experienced thinkers like Martin Fowler backed my opinion, so I couldn’t be wrong – right?

In the second edition of Refactoring, published early this year, Martin Fowler changes his position towards the Switch Statement considerably. A single switch isn’t the gateway drug to imperative programming anymore. You’ll need to have “Repeated Switches” to count as a code smell. You can still use “Replace Conditional with Polymorphism”, but the enthusiasm about implicit condititonal structures like polymorphism has faded. Martin Fowler writes that today, we all know about the different ways to express conditional logic. I’m not so sure. He also writes that many languages support more sophisticated forms of switch statements. Ok, but what about the mainstream languages like Java?

My biggest problem with the classic switch statement was that it was a “single purpose” structure. It could only be used to jump to a limited number of code addresses based on a limited type of criterium. I prefer code structures that are “dual purpose” or even “multi-purpose”. When Java’s switch statement got upgraded to switch over Enums (Java 5) and Strings (Java 7), it got more powerful, but still only supported one use case: explicit branching over a condition.

Switch with dual use

In the upcoming Java 12 (yes, we’ve come a long way in terms of version numbers since Java 8), the “Java Enhancement Process” JEP 325 will be included: Switch Expressions. It is marked as a preview language feature, meaning it is ready for usage, but open for discussion – and you’ll have to enable it explicitly. In the grand scheme of things for Java, it is a stepping stone for JEP 305: Pattern Matching for instanceof that will also change the switch statement even further.

With Switch Expressions, you can use a switch statement to essentially inline a method that uses lots of explicit conditionals to map one value to another:

int numLetters = switch (day) {
    case MONDAY, FRIDAY, SUNDAY -> 6;
    case TUESDAY                -> 7;
    case THURSDAY, SATURDAY     -> 8;
    case WEDNESDAY              -> 9;
};

Your switch can now return a result. And with that improvement, it isn’t single purposed anymore. This is the moment when a switch statement isn’t the most clunky and error-prone way to solve a problem anymore, but maybe even elegant and straight to the point.

This is the moment I definitely change my opinion about the switch statement (in Java) and welcome it back into my solution toolbox. How could such an ugly duckling become such a beautiful swan? And why did this take us twenty years?

You can read more about the new switch statement in this brilliant blog post from Nicolai Parlog.

Anyways, the “else” keyword is now even more obsolete than ever.

	Anonymous on Cache configuration with WildF…
	Miq on Nested queries like N+1 in pra…
	mariuselvert on Creating functors with lambda…
	Nested queries like… on Common SQL Performance Gotchas…
	Nested queries like… on Make your users happy by not c…