Readability of Guard Clauses in Methods

A little story about two opinions on readability of methods containing if-clauses.

Browsing through the code base of one of our customers I frequently stumbled over methods that were roughtly structured like this:

void theMethod
{
  if (some_expression)
  {
    // rest of the method body
    // ...
  }
  // no more code here!
}

And most of the time I was tempted to refactor the method using a guard clause, like so:

void theMethod
{
  if (!some_expression)
  {
    return;
  }
  // rest of the method body
  // ...
}

because this is far more readable for me. When I noticed that the methods were written all by the same guy I told him about by refactoring ideas in absolute certainty that he would agree with me. It came as quite a surprise when, in fact, he didn’t agree with me, at all. Even something like this:

void theMethod
{
  if (some_expression)
  {
    // some code
    // ...
    if (another_expression)
    {
      // some more code
      // ...
    }
    // no more code here ..
  }
  // ... and here
}

was in his eyes far more readable than the refactored version with guard clauses. His rational was that guard clauses make it harder for to see the program flow through the method. And a nested if(…) structure like above was very suitable to express slightly more complicated flows.

All my talks about crappy methods and the downsides of highly indented code were not able to change his mind.

I admit that I can somewhat understand his point about the visibility of the program flow through the method.  And sure, the (nested) ifs increase indentation and the number of possible code paths but since there are no elses and no code after the if-blocks, does that really increase the overall complexity?

Well, I still would prefer smaller methods with guard clauses but as you can see, to a great extend readability lies in the eyes of the beholder.

What do you find readable?

Start with the core

What’s the most important feature you cannot live without? Start with it, you can stop anytime because after that you have a minimal yet usuable system.

If you begin your new product, start from the core functionality. Not the core of the system or architecture.
Ask yourself: What’s the most important feature you cannot live without? Just name one, only one. Build this first and only. Eventually refine or redo if it doesn’t suit your needs. Then continue with the next. You always have a minimal but useable product and can stop anytime.
We once had to develop a web based system where users can file an application which can be reviewed, changed, rated and finally accepted or declined. We started with a web page where you could download a PDF and an email address to send it to when complete. Minimal? Yes. Useless? No. All the functionality which was needed was there. All other stuff could be done via email or phone. It was readily available and useable.
Start with the core.

Verbosity is not Java’s fault

One of Java’s most heard flaws (verbosity) isn’t really tied to the language it is rooted in a more deeply source: it comes from the way we use it.

Quiz: Whats one of the most heard flaws of Java compared to other languages?

Bad Performance? That’s a long overhauled myth. Slow startup? OK, this can be improved… It’s verbosity, right? Right but wrong. Yes, it is one of the most mentioned flaws but is it really inherit to the language Java? Do you really think Closures, annotations or any other new introduced language feature will significantly reduce the clutter? Don’t get me wrong here: closures are a mighty construct and I like them a lot. But the source of the problem lies elsewhere: the APIs. What?! You will tell me Java has some great libraries. These are the ones that let Java stand out! I don’t talk about the functionality of the libraries here I mean the design of the API. Let me elaborate on this.

Example 1: HTML parsing/manipulation

Say you want to parse a HTML page and remove all infoboxes and add your link to a blog box:

        DOMFragmentParser parser = new DOMFragmentParser();
        parser.setFeature("http://xml.org/sax/features/namespaces", false); 
        parser.setFeature("http://cyberneko.org/html/features/balance-tags", false);
        parser.setFeature("http://cyberneko.org/html/features/balance-tags/document-fragment", true);
        parser.setFeature("http://cyberneko.org/html/features/scanner/ignore-specified-charset", true);
        parser.setFeature("http://cyberneko.org/html/features/balance-tags/ignore-outside-content", true);
        HTMLDocument document = new HTMLDocumentImpl();
        DocumentFragment fragment = document.createDocumentFragment();
        parser.parse(new InputSource(new StringReader(html)), fragment);
        XPathFactory factory = XPathFactory.newInstance();
        XPath xpath = factory.newXPath();
        Node infobox = xpath.evaluate("//*/div[@class='infobox']", fragment, XPathConstants.NODE);
        infobox.getParentNode().removeChild(infobox);
        Node blog = xpath.evaluate("//*[@id='blog']", fragment, XPathConstants.NODE);
        NodeList children = blog.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            node.remove(children.item(i));
        }
        blog.appendChild(/*create Elementtree*/);

What you really want to say is:

HTMLDocument document = new HTMLDocument(url);
document.at("//*/div[@class='infobox']").remove();
document.at("//*[@id='blog']").setInnerHtml("<a href='blog_url'>Blog</a>");

Much more concise, easy to read and it communicates its purpose clearly. The functionality is the same but what you need to do is vastly different.

  The library behind the API should do the heavy lifting not the API's user.

Example 2: HTTP requests

Take this example of sending a post request to an URL:

HttpClient client = new HttpClient();
PostMethod post = new PostMethod(url);
for (Entry param : params.entrySet()) {
    post.setParameter(param.key, param.value);
}
try {
    return client.executeMethod(post);
} finally {
    post.releaseConnection();
}

and compare it with:

HttpClient client = new HttpClient();
client.post(url, params);

Yes, there are cases where you want to specify additional attributes or options but mostly you just want to send some params to an URL. This is the default functionality you want to use, so why not:

  Make the easy and most used cases easy,
    the difficult ones not impossible to achieve.

Example 3: Swing’s JTable

So what happens when you designed for one purpose but people usually use it for another one?
The following code displays a JTable filled with attachments showing their name and additional actions:
(Disclaimer: this one makes heavy use of our internal frameworks)

        JTable attachmentTable = new JTable();
        TableColumnBinder<FileAttachment> tableBinding = new TableColumnBinder<FileAttachment>();
        tableBinding.addColumnBinding(new StringColumnBinding<FileAttachment>("Attachments") {
            @Override
            public String getValueFor(FileAttachment element, int row) {
                return element.getName();
            }
        });
        tableBinding.addColumnBinding(new ActionColumnBinding<FileAttachment>("Actions") {
            @Override
            public IAction<?, ?>[] getValueFor(FileAttachment element, int row) {
                return getActionsFor(element);
            }
        });
        tableBinding.bindTo(attachmentTable, this.attachments);

Now think about you had to implement this using bare Swing. You need to create a TableModel which is unfortunately based on row and column indexes instead of elements, you need to write your own renderers and editors, not talking about the different listeners which need to map the passed indexes to the corresponding element.
JTable was designed as a spreadsheet like grid but most of the time people use it as a list of items. This change in use case needs a change in the API. Now indexes are not a good reference method for a cell, you want a list of elements and a column property. When the usage pattern changes you can write a new library or component or you can:

  Evolve your API.

Designed to be used

So why is one API design better than another? The better ones are designed to be used. They have a clearly defined purpose: to get the task done in a simple way. Just that. They don’t want to satisfy a standard or a specification. They don’t need to open up a huge new world of configuration options or preference tweaks.

Call to action

So I argue that to make Java (or your language of choice) a better language and environment we have to design better APIs. Better designed APIs help an environment more than just another new language feature. Don’t jump on the next super duper language band wagon because it has X or Y or any other cool language feature. Improve your (API) design skills! It will help you in every language/environment you use and will use. Learning new languages is good to give you new viewpoints but don’t just flee to them.

Aligning the Abstraction Level with constant booleans

Constant booleans can help to maintain a single level of abstraction in one method. They are less expensive than a separate method and a big improvement over a mere comment.

If you ever have done consulting, mentoring or teaching on programming techniques I’m sure you have experienced joy as well as disappointment when your “students” either took on your advice and followed it in their day-to-day work or when they just did what you said as long as you sat next to them but forgot all about it the next day. The disappointing behavior often comes from them not fully appreciating, or not being able to fully recognize the advantages of your solution. (And as you are the mentor/consultant/teacher, the latter might also be your fault).

One example for that is the principle to operate on only one level of abstraction within a method or function. See here for a detailed explanation. I have been applying this technique more or less unconsciously already for a long time now and was reminded of it as the Single-Level-of-Abstraction-Principle in Robert C. Martin’s Clean Code.

I have been trying to put this principle in peoples minds for some time now but often with little success. Sure, they often do see the advantages of arriving at much more readable code but they often ignore it in their own code. Most of the time they just don’t see the necessity to create another method with a meaningful name or they content themselves just with putting a comment above some chunk of lower abstraction code (The resulting loud screams for a Extract-Method refactoring often remain unheard, too)

Lately, I did have fairly good success with one little sub-technique of this principle: constant booleans for if-statements. That is, instead of (C++ code):

void someMethod()
{
   if (hard_to_read_boolean_expression_using_lower_abstractions)
   {
      // do stuff
   }
}

you write:

void someMethod()
{
   const bool expressive_name = 
      hard_to_read_boolean_expression_using_lower_abstractions;
   if (expressive_name)
   {
      // do stuff
   }
}

I guess the main reason for the success of this sub-technique is that it increases readability a lot at a cost that is only a tiny bit greater than a simple comment.

True, in many cases it may be even more readable to put the whole if-statement in another method, but using a boolean constant like above is already a big improvement.

A more elegant way to equals in Java

Implementing equals and hashCode in Java is a basic part of your toolbox. Here I describe a cleaner and less error-prone way to use in your code.

— Disclaimer: I know this is pretty basic stuff but many, many programmers are doing it still wrong —
As a Java programmer you know how to implement equals and that hashCode has to be implemented as well. You use your favorite IDE to generate the necessary code, use common wisdom to help you code it by hand or use annotations. But there is a fourth way: introducing EqualsBuilder (not the apache commons one which has some drawbacks over this one) which implements the general rules for equals and hashCode:

public class EqualsBuilder {

  public static interface IComparable {
      public Object[] getValuesToCompare();
  }

  private EqualsBuilder() {
    super();
  }

  public static int getHashCode(IComparable one) {
    if (null == one) {
      return 0;
    }
    final int prime = 31;
    int result = 1;
    for (Object o : one.getValuesToCompare()) {
      result = prime * result
                + EqualsBuilder.calculateHashCode(o);
    }
    return result;
  }

  private static int calculateHashCode(Object o) {
    if (null == o) {
      return 0;
    }
    return o.hashCode();
  }

  public static boolean isEqual(IComparable one,
                                              Object two) {
    if (null == one || null == two) {
      return false;
    }
    if (one.getClass() != two.getClass()) {
      return false;
    }
    return compareTwoArrays(one.getValuesToCompare(),
              ((IComparable) two).getValuesToCompare());
  }

  private static boolean compareTwoArrays(Object arrayOne, Object arrayTwo) {
      if (Array.getLength(arrayOne) != Array.getLength(arrayTwo)) {
        return false;
      }
      for (int i = 0; i < Array.getLength(arrayOne); i++) {
        if (!EqualsBuilder.areEqual(Array.get(arrayOne, i), Array.get(arrayTwo, i))) {
          return false;
        }
      }
      return true;
  }

  private static boolean areEqual(Object objectOne, Object objectTwo) {
    if (null == objectOne) {
      return null == objectTwo;
    }
    if (null == objectTwo) {
      return false;
    }
    if (objectOne.getClass().isArray() && objectTwo.getClass().isArray()) {
        return compareTwoArrays(objectOne, objectTwo);
    }
    return objectOne.equals(objectTwo);
  }

}

The interface IComparable ensures that equals and hashCode are based on the same instance variables.
To use it your class needs to implement the interface and call the appropiate methods from EqualsBuilder:

public class MyClass implements IComparable {
  private int count;
  private String name;

  public Object[] getValuesToCompare() {
    return new Object[] {Integer.valueOf(count), name};
  }

  @Override
  public int hashCode() {
    return EqualsBuilder.getHashCode(this);
  }

  @Override
  public boolean equals(Object obj) {
    return EqualsBuilder.isEqual(this, obj);
  }
} 

Update: If you want to use isEqual directly one test should be added to the start:

  if (one == two) {
    return true;
  }

Thanks to Nyarla for this hint.

Update 2: Thanks to a hint by Alex I fixed a bug in areEqual: when an array (especially a primitive one) is passed than the equals would return a wrong result.

Update 3: The newly added compareTwoArrays method had a bug: it resulted in true if arrayTwo is bigger than arrayOne but starts the same. Thanks to Thierry for pointing that out.

Forced into switch/case – Qt’s Model/View API

During my life as a programmer I have more and more come to dislike switch/case statements. They tend to be hard to grasp and with languages like C/C++ they are often the source of hard-to-find errors. Compilers that have warnings about missing default statements or missing cases for enumerated values can help to mitigate the situation, but still, I try to avoid them whenever I can.

The same holds true for if-elseif cascades or lots of if-elses in one method. They are hard to read, hard to maintain, increase the Crap, etc.

If you share this kind of mindset I invite you implement to some custom models with Qt4’s Model/View API. The design of the Model/View classes is derived from the well-known MVC pattern which separates data (model), presentation (view) and application logic (controller). In Qt’s case, view and controller are combined, supposedly making it simpler to use.

The basic idea of Qt’s implementation of its Model/View design is that views communicate with models using so-called model indexes. Using a table as an example, a row/column pair of (3,4) would be a model index pointing to data element in row 3, column 4. When a view is to be displayed it asks the attached model for all sorts of information about the data.

There are a few model implementations for standard tasks like simple string lists (QStringListModel) or file system manipulation (QDirModel < Qt4.4, QFileSystemModel >= Qt4.4). But usually you have to roll your own. For that, you have to subclass one of the abstract model classes that suits your needs best and implement some crucial methods.

For example, model methods rowCount and columnCount are called by the view to obtain the range of data it has to display. It then uses, among others, the data method to query all the stuff it needs to display the data items. The data method has the following signature:

QVariant data ( const QModelIndex&amp; index, int role ) const

Seems easy to understand: parameter index determines the data item to display and with QVariant as return type it is possible to return a wide range of data types. Parameter role is used to query different aspects of the data items. Apart from Qt::DisplayRole, which usually triggers the model to return some text, there are quite a lot other roles. Let’s look at a few examples:

  • Qt::ToolTipRole can be used to define a tool tip about the data item
  • Qt::FontRole can be use to define specific fonts
  • Qt::BackgroundRole and Qt::ForegroundRole can be used to set corresponding colors

So the views call data repeatedly with all the different roles and your model implementation is supposed to handle those different calls correctly. Say you implement a table model with some rows and columns. The design of the data method is forcing you into something like this …

QVariant data ( const QModelIndex&amp; index, int role ) const  {
   if (!index.isValid()) {
      return QVariant();
   }

   switch (role)
   {
      case Qt::DisplayRole:
         switch (index.column())
         {
            case 0:
               // return display data for column 0
               break;
            case 1:
               // return display data for column 1
               break;
            ...
         }
         break;

      case Qt::ToolTipRole:
         switch (index.column())
         {
            case 0:
               // return tool tip data for column 0
               break;
            case 1:
               // return tool tip data for column 1
               break;
            ...
         }
         break;
      ...
   }
}

… or equivalent if-else structures. What happens here? The design of the data method forces the implementation to “switch” over role and column in one method. But nested switch/case statements? AARGH!! With our mindset outlined in the beginning this is clearly unacceptable.

So what to do? Well, to tell the truth, I’m still working on the best™ solution to that but, anyway, here is a first easy improvement: handler methods. Define handler methods for each role you want to support and store them in a map. Like so:

#include &lt;QAbstractTableModel&gt;

class MyTableModel : public QAbstractTableModel
{
  Q_OBJECT

  typedef QVariant (MyTableModel::*RoleHandler) (const QModelIndex&amp; idx) const;
  typedef std::map&lt;int, RoleHandler&gt; RoleHandlerMap;

  public:
    enum Columns {
      NAME_COLUMN = 0,
      ADDRESS_COLUMN
    };

    MyTableModel() {
      m_roleHandlerMap[Qt::DisplayRole] =
         &amp;MyTableModel::displayRoleHandler;
      m_roleHandlerMap[Qt::ToolTipRole] =
         &amp;MyTableModel::tooltipRoleHandler;
    }

    QVariant displayRoleHandler(const QModelIndex&amp; idx) const {
      switch (idx.column()) {
        case NAME_COLUMN:
          // return name data
          break;

        case ADDRESS_COLUMN:
          // return address data
          break;

        default:
          Q_ASSERT(!&quot;Invalid column&quot;);
          break;
      }
      return QVariant();
    }

    QVariant tooltipRoleHandler(const QModelIndex&amp; idx) const {
      ...
    }

    QVariant data(const QModelIndex&amp; idx, int role) const {
      // omitted: check for invalid model index

      if (m_roleHandlerMap.count(role) == 0) {
        return QVariant();
      }

      RoleHandler roleHandler =
        (*m_roleHandlerMap.find(role)).second;
      return (this-&gt;*roleHandler)(idx);
    }
  private:
    RoleHandlerMap m_roleHandlerMap;
};

The advantage of this approach is that the supported roles are very well communicated. We still have to switch over the columns, though.

I’m currently working on a better solution which splits the data calls up into more meaningful methods and kind of binds the columns to specific parts of the data items in order to get a more row-centric approach: one row = one element, columns = element attributes. I hope this will get me out of this switch/case/if/else nightmare.

What do you think about it? I mean, is it just me, or is an API that forces you into crappy code just not so well done?

How would you solve this?

We are software tailors

Our company is called Softwareschneiderei (which is German for software tailoring). This name describes our intention to write bespoken software, software that fits people perfectly. Over time different additional metaphors from the tailor’s world came around: seams/tucks which describe places in software systems where cuts can be made and testing can be done. Tailoring is a craftsmanship so an apprenticeship model and the pride in our work exists.
This describes the mentoring and bespoken software development we do. But besides that we do a lot of bug fixing, improvement of existing software which was written by others and evaluation of other people’s code. Thanks to a piece from Jason Fried (thanx Jason!) those other parts fit perfectly into our vision as software tailors: we iron/press (fix bugs, improve the code), we trim and cut (remove bottlenecks and unwanted functionality or extend the software to use other systems) and we measure (analyze, inspect and evaluate systems).

CMake Builder Plugin Reloaded

A few months ago I set out to build my first hudson plugin. It was an interesting, sometimes difficult journey which came to a good end with the CMake Builder Plugin, a build tool which can be used to build cmake projects with hudson. The feature set of this first version was somewhat limited since I applied the scratch-my-own-itch approach – which by the time meant only support for GNU Make under Linux.

As expected, it wasn’t long until feature requests and enhancement suggestions came up in the comments of my corresponding blog post. So in order to make the plugin more widely useable I used our second  Open Source Love Day to add some nice little features.

Update: I used our latest OSLD to make the plugin behave in master/slave setups. Check it out!

Let’s take a walk through the configuration of version 1.0 :

Path to cmake executable

1. As in the first version you have to set the path to the cmake executable if it’s not already in the current PATH.

2. The build configuration starts as in the first version with Source Directory, Build Directory and Install Directory.

CMake Builder Configuration Page

3. The Build Type can now be selected more conveniently by a combo box.

4. If Clean Build is checked, the Build Dir gets deleted on every build

Advanced Configuration Page

5. The advanced configuration part starts with Makefile Generator parameter which can be used to utilize the corresponding cmake feature.

6. The next two parameters Make Command and Install Command can be used if make tools other than GNU Make should be used

7. Parameter Preload Script can be used to point to a suitable cmake pre-load script file. This gets added to the cmake call as parameter of the -C switch.

8. Other CMake Arguments can be used to set arbitrary additional cmake parameters.

The cmake call will then be build like this:

/path/to/cmake  \
   -C </path/to/preload/script/if/given   \
   -G <Makefile Generator>  \
   -DCMAKE_INSTALL_PREFIX=<Install Dir> \
   -DCMAKE_BUILD_TYPE=<Build Type>  \
   <Other CMake Args>  \
   <Source Dir>

After that, the given Make and Install Commands are used to build and install the project.

With all these new configuration elements, the CMake Builder Plugin should now be applicable in nearly every project context. If it is still not useable in your particular setting, please let me know. Needless to say, feedback of any kind is always appreciated.

Blog harvest: Metaprogramming in Ruby,Hudson builds IPhone apps, Git workflow, Podcasting Equipment and Marketing

harvest64
Four blog posts:

  • Python decorators in Ruby – You can do amazing things in a language like Ruby or Lisp with a decent meta programming facility, here a language feature to annotate methods which needed a syntax change in Python is build inside of Ruby without any change to the language spec.
  • How to automate your IPhone app builds with Hudson – Another domain in which the popular CI Hudson helps: building your IPhone apps.
  • A Git workflow for agile teams – As distributed version control systems get more and more attention and are used by more teams you have to think about your utilisation of them.
  • Podcasting Equipment Guide – A bit offtopic but interesting nonetheless: if you want to do your own podcasts which equipment is right for you.

and a video:

A more elegant way to HTTP Requests in Java

The support for sending and processing HTTP requests was always very basic in the JDK. There are many, many frameworks out there for sending requests and handling or parsing the response. But IMHO two stand out: HTTPClient for sending and HTMLUnit for handling. And since HTMLUnit uses HTTPClient under the hood the two are a perfect match.

This is an example HTTP Post:

HttpClient client = new HttpClient();
PostMethod post = new PostMethod(url);
for (Entry param : params.entrySet()) {
    post.setParameter(param.key, param.value);
}
try {
    return client.executeMethod(post);
} finally {
    post.releaseConnection();
}

and HTTP Get:

WebClient webClient = new WebClient();
return (HtmlPage) webClient.getPage(url);

Accessing the returned HTML via XPath is also very straightforward:

List roomDivs=(List)page.getByXPath("//div[contains(@class, 'room')]");
for (HtmlElement div:roomDivs) {
  rooms.add(
    new Room(this,
      ((HtmlElement) div.getByXPath(".//h2/a").get(0)).getTextContent(),
      div.getId())
  );
}

One last issue remains: HTTPClient caches its cookies but HTMLUnit creates a HTTPClient on its own. But if you override HttpWebConnection and give it your HTTPClient everything works smoothly:

public class HttpClientBackedWebConnection extends HttpWebConnection {
  private HttpClient client;

  public HttpClientBackedWebConnection(WebClient webClient,
      HttpClient client) {
    super(webClient);
    this.client = client;
  }

  @Override
  protected HttpClient getHttpClient() {
    return client;
  }
}

Just set your custom webconnection on your webclient:

webClient.setWebConnection(
  new HttpClientBackedWebConnection(webClient, client)
);