Lazy Initialization/evaluation can help you performance and memory-consumption-wise

One way to improve performance when working with many objects or large data structures is lazy initialization or evaluation. The concept is to defer expensive operations to the latest moment possible – ideally never. I want to show some examples how to use lazy techniques in java and give you pointers to other languages where it is even easier and part of the core language.

One use case was a large JTable showing hundreds of domain objects consisting of meta-data and measured values. Initially our domain objects held both types of data in memory even though only part of the meta data was displayed in the table. Building the table took several seconds and we were limited to showing a few hundred entries at once. After some analysis we changed our implementation to roughly look like this:

public class DomainObject {
  private final DataParser parser;
  private final Map<String, String> header = new HashMap<>();
  private final List<Data> data = new ArrayList<>();

  public DomainObject(DataParser aParser) {
    parser = aParser;
  }

  public String getHeaderField(String name) {
    // Here we lazily parse and fill the header map
    if (header.isEmpty()) {
      header.addAll(parser.header());
    }
    return header.get(name);
  }
  public Iterable<Data> getMeasurementValues() {
    // again lazy-load and parse the data
    if (data.isEmpty()) {
      data.addAll(parser.measurements());
    }
    return data;
  }
}

This improved both the time to display the entries and the number of entries we could handle significantly. All the data loading and parsing was only done when someone wanted the details of a measurement and double-clicked an entry.

A situation where you get lazy evaluation in Java out-of-the-box is in conditional statements:

// lazy and fast because the expensive operation will only execute when needed
if (aCondition() && expensiveOperation()) { ... }

// slow order (still lazy evaluated!)
if (expensiveOperation() && aCondition()) { ... }

Persistence frameworks like Hibernate often times default to lazy-loading because database access and data transmission is quite costly in general.

Most functional languages are built around lazy evaluation of and their concept of functions as first class citizens and isolated/minimized side-effects support lazyness very well. Scala as a hybrid OO/functional language introduces the lazy keyword to simplify typical java-style lazy initialization code like above to something like this:

public class DomainObject(parser: DataParser) {
  // evaluated on first access
  private lazy val header = { parser.header() }

  def getHeaderField(name : String) : String = {
    header.get(name).getOrElse("")
  }

  // evaluated on first access
  lazy val measurementValues : Iterable[Data] = {
    parser.measurements()
  }
}

Conclusion
Lazy evaluation is nothing new or revolutionary but a very useful tool when dealing with large datasets or slow resources. There may be many situations where you can use it to improve performance or user experience using it.
The downsides are a bit of implementation cost if language support is poor (like in Java) and some cases where the application can feel more responsive with precomputed/prefetched values when the user wants to see the details.

Aspects done right: Concerns

With aspects you cannot see (without sophisticated IDE support) which class has which aspects and which aspects are woven into the class when looking at its source. Here concerns (also called mixins or traits) come to the rescue.

The idea of encapsulating cross cutting concerns struck with me from the beginning but the implementation namely the aspects lacked clarity in my opinion. With aspects you cannot see (without sophisticated IDE support) which class has which aspects and which aspects are woven into the class when looking at its source. Here concerns (also called mixins or traits) come to the rescue. I know that aspects were invented to hide away details about which code is included and where but I find it confusing and hard to trace without tool support.

Take a look at an example in Ruby:

module Versionable
  extend ActiveSupport::Concern

  included do
    attr_accessor :version
  end
end

class Document
  include Versionable
end

Now Document has a field version and is_a?(Versionable) returns true. For clients it looks like the field version is in Document itself. So for clients of this class it is the same as:

class Document
  attr_accessor :version
end

Furthermore you can easily use the versionable concern in another class. This sounds like a great implementation of the separating of concerns principle but why isn’t everyone using it (besides being a standard for the upcoming Rails 4)? Well, some people are concerned with concerns (excuse the pun). As with every powerful feature you can shoot yourself in the foot. Let’s take a look at each problem.

  • Diamond problem aka multiple inheritance
  • Ruby has no multiple inheritance. Even when you include more than one module the modules are like superclasses for the message resolve order. Every include creates a new “superclass” above the including class. So the last include takes precedence.

  • Dependencies between concerns
  • You can have dependencies between different concerns like this concern needs another concern. ActiveSupport:Concerns handles these dependencies automatically.

  • Unforeseeable results
  • One last big problem with concerns is having side effects from combining two concerns. Take for an example two concerns which add a method with the same name. Including both of them renders one concern unusable. This cannot be solved technically but I also think this problem shows an underlying, more important cause. It could be because of poor naming. Or you did not separate these two concerns enough. As always tests can help to isolate and spot the problem. Also concerns should be tested in isolation and in integration.

Summary of the Schneide Dev Brunch at 2013-01-06

If you couldn’t attend the Schneide Dev Brunch in January 2013, here are the main topics we discussed neatly summarized.

brunch64-borderedYesterday, we held another Schneide Dev Brunch. The Dev Brunch is a regular brunch on a sunday, only that all attendees want to talk about software development and various other topics. If you bring a software-related topic along with your food, everyone has something to share. The brunch had less participants this time, but didn’t lack topics. Let’s have a look at the main topics we discussed:

Sharing code between projects

The first topic emerged from our initial general chatter. What’s a reasonable and praticable approach to share code between software entities (different projects, product editions, versions, etc.). We discussed at least three different solutions that are known to us in practice:

  • Main branch with customer forks: This was the easiest approach to explain. A product has a main branch where all the new features are committed to. Everytime a customer wants his version, a new branch is created from the most current version on the main branch. The customer may require some changes and a lot of bug fixes, but all of that is done on the customer’s branch. Sometimes, a critical bug fix is merged back into the main branch, but no change from the main branch is transferred to the customer’s branch ever. Basically, the customer version of the code is “frozen” in terms of features and updates. This works well in its context because the main branch already contains the software every customer wants and no customer wants to update to a version with more features – this would be another additional branch.
  • Big blob of conditionals: This approach needs a bit more explanation. Once, there was a software product ready to be sold. Every customer had some change requests and special requirements. All these changes and special-cases were added to the original code base, using customer IDs and a whole lot of if-else statements to separate the changes from each customer. All customers always get the same code, but their unique customer ID only passes the guard clauses that are required for them. All the changes of all the other customers are deactivated at runtime. With this approach, the union of all features is always represented in the source code.
  • Project-as-an-universe: This approach defines projects as little universes without intersection. Every project stands for its own and only shares code with other projects by means of copy and paste. When a new project is started, some subset of classes of another project is chosen as a starting point and transformed to fit the requirements. There is no “master universe” or main branch for the shared classes. The same class may evolve differently (and conflicting) in different projects. This approach probably isn’t suited for a software product, but is applied to individual projects with different requirements.

We are aware of and discussed even approaches, but not with the profound knowledge of several years first-hand experience. The term OSGi was often used as a reference in the discussion. We were able to exhibit the motivation, advantages and shortcomings of each approach. It was very interesting to see that even slightly different prerequisites may lead to fundamentally different solutions.

Book (p)review: Practical API Design

In the book “Practical API Design” by Jaroslav Tunach, the founder of the NetBeans Platform and initial “architect” of its API talks about his lessons learnt when evolving a substantial API for over ten years. The book begins with a theory on values and motivations for good API design. We get a primer why APIs are needed and essential for modern software development. We learn what are the essential characteristics of a good API. The most important message here is that a good API isn’t necessarily “beautiful”. This caused a bit of discussion among us, so that the topic strayed a bit from the review characteristic. Well, that’s what the Dev Brunch is for – we aren’t a lecture session. One interesting discussion trail led us to the aestethics in music theory.
But to give a summary on the first chapters of the book: Good stuff! Jaroslav Tunach makes some statements worthy of discussion, but he definitely knows what he’s talking about. Some insights were eye-openers or at least thought-provokers for our reader. If the rest of the book holds to the quality of the first chapters, then you shouldn’t hesitate to add it to your reading queue.

Effective electronic archive

One of our participants has developed a habit to archivate most things electronically. He already blogged about his experiences:

Both blog entries hold quite a lot of useful information. We discussed some possibilities to implement different archivation strategies. Evernote was mentioned often in the discussion, diigo was named as the better delicious, Remember The Milk as a task/notification service and Google Gmail as an example to rely solely on tags. Tags were a big topic in our discussion, too. It was mentioned that Confluence has the ability to add multiple tags to an article. Thunderbird was mentioned, especially in the combination of tags and virtual folders. And a noteworthy podcast of Scott Hanselmann on the topic of “Getting Things Done” was pointed out, too.

Schneide Events 2013

We performed a short survey about different special events and workshops that may happen in 2013 in the Softwareschneiderei. If you already are registered on our Dev Brunch list, you’ll receive the invitations for all events shortly. Here is a short primer on what we’re planning:

  • Communication Through Test workshop
  • Refactoring Golf
  • API Design Fest
  • Google Gruyere Day
  • Introduction to Dwarf Fortress

Some of these events are more related to software engineering than others, but all of them try to be fun first, lessons later. Participate if you are interested!

Learning programming languages

The last main topic of the brunch was a short, rather disappointed review of the book “Seven Languages in Seven Weeks” by Bruce Tate. The best part of the book, according to our reviewer, were the interview sections with the language designers. And because he got interested in this kind of approach to a programming language, he dug up some similar content:

The Computerworld interviews are directly accessible and contain some pearls of wisdom and humour (and some slight inaccuracies). Highly recommended reading if you want to know not only about the language, but also about the context (and mindset) in which it was created.

Epilogue

As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The high number of attendees makes for an unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei. And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.

Innocent fun with net send

What happens when you bore a developer to the point when he begins to toy around with the net send tool? You’ll gain a valueable insight about the usefulness of message dialog boxes.

beamed-xmas-treeBecause it is the holiday season and most of us have a straining year of hard work behind us, let me just tell you a little story with more fun than moral in it. The story really happened, but the circumstances and details are altered to protect the innocent.

The setup

Imagine a full day of boring training workshops with a dozen developers in one room, each sitting behind a computer and trying to mimic the tiresome click orgy the instructor presents. Between the developers, there is one web-designer, clearly distinguishable by looks and questions. The workshops drags on and on, until Marvin, the protagonist of our story, loses interest in the clicking and sets out to explore the computer and its possibilities. This all happens a decade ago, when terminal servers were still new and fancy and windows was an open book for those who could read it. The computers were installed by the workshop host and used with a guest login.

The exploration

Marvin notices that the IP address of every computer was written on the computer case. It was just a matter of inconspicuous looks to gather the addresses of the neighbouring machines. The next step was to open a command shell – which was available without any tricks – and try to net send a message to his own machine. Net send was essentially a system service that listened to network messages on a specific port and displayed them as a dialog. So if you’d net send a message to a computer, it would be displayed in a message box in the middle of the screen on top of all active windows with a caption identifying the sender. The user had to acknowledge the dialog by clicking the button to be able to proceed in his original windows. In summary, net send was the perfect remote distraction tool. And it worked: Marvin was able to message himself with net send. The terminal server even disguised the real sender by sending the message with its address instead of the guest machine’s. Now Marvin could anonymously open modal message boxes with a custom message on every computer in the room, given that he knew its address. The workshop promised to be fun again.

The first reaction

After making up some witty messages, Marvin collected all his mental willpower to act indifferent while slowly typing the first message to his neighbour. It just read “Harddisk error” and was only a test drive if he was able to pull this prank without bursting out in laughter or being identified as the source. If he could message his neighbour without him noticing, he could message everybody in the room. After the net send command was complete, Marvin paused a bit and used his little finger to tap enter on the numerical block of the keyboard, to not draw attention to his keyboard pattern. As soon as the command was acknowledged, his neighbour let out a muffled groan and clicked the message box away without even reading it.

The messages

After that, the messages were longer and more sophisticated. After the first few messages, Marvin guessed the pattern in which the IP addresses were located in the room and sent messages to nearly everybody else attending the workshop. Some messages read “Virus found! Need to manually reboot the computer.”, others “Keyboard error. Press Enter to continue.” and the like. The reactions from the developers in front of the machines were always the same: A fretful sound and an acknowledging click without the slightest hesistation. Nobody rebooted or checked the keyboard. The messages were just dismissed and immediately forgotten like a temporary annoyance. Even when the message grew as long as two full sentences, the recipient just clicked it away.

The highlight

During a short recess, Marvin planned the ultimate net send attack: a message on the presenter’s computer, precisely timed to fit the workshop content. He went to the instructor and asked some question while memorizing the IP address of the machine that was connected to the beamer. If he sends a message to this computer, it would be shown on the beamer to the whole audience and the instructor. He used the remaining recess time to formulate the perfect message. The lectures began again, everybody took their seat and concentrated on the topic again. A few minutes into the workshop, Marvin hit enter and the message box appeared on the wall:

Attention! The beamer is overheating. Only a few minutes left before critical temperature level is reached and shutdown is forced to prevent damage.

Everybody stopped and gasped while they finally read a message. Probably only missing the dialog title that clearly stated that the message came from the terminal server, the sole non-developer in the room, the web-designer, asked the one and only legitimate question: “How can the beamer send this message to the computer over the VGA cable?”

Only a split-second later, a developer answered with “there is a standard for that”. Another one chimed in: “your computer also knows which monitor is attached through that cable”. A third suggested a solution: “We might just turn it off a few minutes to cool down. It’ll be okay afterwards.” Clearly out of his comfort zone, the instructor decided: “No, we just had a recess and I’m behind schedule. We’ll see how long this beamer bears with us.”

The moral of the story

Surprisingly, the beamer lasted the whole rest of the day and many days afterwards, without any further hickups. One attendee of the workshop silently laughed for about an hour and the day went by a lot faster. But the most surprising thing was that the only person that grasped the real marvel of the situation was the person with the least technical knowledge in the room. All the seasoned developers missed every clue that there was something fishy with a beamer communicating with the computer over a VGA cable and opening dialog boxes on the computer (and not just in-picture). And nobody reads the text in a dialog box ever. Especially not the title bar!

Postscriptum

Marvin wants to apologize to everybody he bothered during this workshop. It was a fun idea originating from boredom, but it turned into a fascinating techno-social experiment. He says he learnt a valueable lesson that day, even if he doesn’t remember any content of the training itself.

Thoughts about TDD

Thoughts and links about test driven development

First a disclaimer: I think tests are a hallmark for professional software development, I like to write tests before the implementation but that’s not always easy or simple (for the difference please refer to Simple made easy). I find it hard to grasp test driven development (TDD) though. The difference between test first and test driven lies in the intention: in both cases tests are written before any implementation code but in TDD the tests drive the design of your implementation.

The problem with opinions of TDD is there are mostly extreme positions: some think “TDD is the (next) holy grail” or the ones which dismissed it. Though reading between the lines there are great discussions about how to do it and what problems arise. Many people (me included) are really trying to get value from TDD. Testing should be fun.
One way in letting the tests drive the way you develop is proposed by Uncle Bob: transformation priority premise. He proposes a list of transformations which introduce new or replace existing constructs like replacing a constant by a variable or adding more logic and gives them a priority. Only if you cannot use a high priority transformation to get the test to pass you look at a transformation with a lower priority.
But how do you determine what you should test next or even which is the first test?
Taking the typical Conway’s game of life kata as an example one thing struck me: I could only get the TDD to work smoothly when I started with the data structure. But why that? Naturally I start with the algorithm (in this case the rules) and write the first test for it. But upon further inspection of the problem and deeper (domain) knowledge it seems the data structure is way more important for solving this kata. So you need to know where the journey goes along beforehand, not every step you will take but the big picture: first the data structure, then the rules in this example. Maybe you should start with the integrations or the functional tests and break them down into units.
What are your experiences using TDD? Do you use or want to use TDD?

An experiment about communication through tests

How effectively communicates our test code? We wanted to know if we were able to recreate a software from its tests alone. The experiment gained us some worthwile insights.

lrg-668-wuerfelRecently, we conducted a little experiment to determine our ability to communicate effectively by only using automatic tests. We wanted to know if the tests we write are sufficient to recreate the entire production code from them and understand the original requirements. We were inspired by a similar experiment performed by the Softwerkskammer Karlsruhe in July 2012.

The rules

We chose a “game master” and two teams of two developers each, named “Team A” and “Team B”. The game master secretly picked two coding exercises with comparable skill and effort and briefed every team to one of them. The other team shouldn’t know the original assignment beforehands, so the briefings were held in isolation. Then, the implementation phase began. The teams were instructed to write extensive tests, be it unit or integration tests, before or after the production code. The teams knew about the further utilization of the tests. After about two hours of implementation time, we stopped development and held a little recreation break. Then, the complete test code of each implementation was transferred to the other team, but all production code was kept back for comparison. So, Team A started with all tests of Team B and had to recreate the complete missing production code to fulfill the assignment of Team B without knowing exactly what it was. Team B had to do the same with the production code and assignment of Team A, using only their test code, too. After the “reengineering phase”, as we called it, we compared the solutions and discussed problems and impressions, essentially performing a retrospective on the experiment.

The assignments

The two coding exercises were taken from the Kata Catalogue and adapted to exhibit slightly different rules:

  • Compare Poker Hands: Given two hands of five poker cards, determine which hand has a higher rank and wins the round.
  • Automatic Yahtzee Player: Given five dice and our local Yahtzee rules, determine a strategy which dice should be rerolled.

There was no obligation to complete the exercise, only to develop from a reasonable starting point in a comprehensible direction. The code should be correct and compileable virtually all the time. The test coverage should be near to 100%, even if test driven development or test first wasn’t explicitely required. The emphasis of effort should be on the test code, not on the production code.

The implementation

Both teams understood the assignment immediately and had their “natural” way to develop the code. Programming language of choice was Java for both teams. The game master oscillated between the teams to answer minor questions and gather impressions. After about two hours, we decided to end the phase and stop coding with the next passing test. No team completed their assignment, but the resulting code was very similar in size and other key figures:

  • Team A: 217 lines production code, 198 lines test code. 5 production classes, 17 tests. Test coverage of 94,1%
  • Team B: 199 lines production code, 166 lines test code. 7 production classes, 17 tests. Test coverage of 94,1%

In summary, each team produced half a dozen production classes with a total of ~200 lines of code. 17 tests with a total of ~180 lines of code covered more than 90% of the production code.

The reengineering

After a short break, the teams started with all the test code of the other team, but no production code. The first step was to let the IDE create the missing classes and methods to get the tests to compile. Then, the teams chose basic unit tests to build up the initial production code base. This succeeded very quickly and turned a lot of tests to green. Both teams struggled later on when the tests (and production code) increased in complexity. Both teams introduced new classes to the codebase even when the tests didn’t suggest to do so. Both teams justified their decision with a “better code design” and “ease of implementation”. After about 90 minutes (and nearly simultaneous), both teams had implemented enough production code to turn all tests to green. Both teams were confident to understand the initial assignment and to have implemented a solution equal to the original production code base.

The examination

We gathered for the examination and found that both teams met their requirements: The recreated code bases were correct in terms of the original solution and the assignment. We have shown that communication through only test code is possible for us. But that wasn’t the deepest insight we got from the experiment. Here are a few insights we gathered during the retrospective:

  • Both teams had trouble to effectively distinguish between requirements from the assignment and implementation decisions made by the other team. The tests didn’t transport this aspect good enough. See an example below.
  • The recreated production code turned out to be slightly more precise and concise than the original code. This surprised us a bit and is a huge hint that test driven development, if applied with the “right state of mind”, might improve code quality (at least for this problem domain and these developers).
  • The classes that were introduced during the reengineering phase were present in the original code, too. They just didn’t explicitely show up in the test code.
  • The test code alone wasn’t really helpful in several cases, like:
    • Deciding if a class was/should be an Enum or a normal class
    • Figuring out the meaning of arguments with primitive values. A language with named parameter support would alleviate this problem. In Java, you might consider to use Code Squiggles if you want to prepare for this scenario.
  • The original team would greatly benefit from watching the reengineering team during their coding. The reengineering team would not benefit from interference by the original team. For a solution to this problem, see below.

The revelation

One revelation we can directly apply to our test code was how to help with the distinction between a requirement (“has to be this way”) and implementator’s choice (“incidentally is this way”). Let’s look at an example:

In the poker hands coding exercise, every card is represented by two characters, like “2D” for a two of diamonds or “AS” for an ace of spades. The encoding is straight-forward, except for the 10, it is represented by a “T” and not a “10”: “TH” is a ten of hearts. This is a requirement, the implementator cannot choose another encoding. The test for the encoding looks like this:


@Test
public void parseValueForSymbol() {
  assertEquals(Value._2, Value.forSymbol("2"));
  [...]
  assertEquals(Value._10, Value.forSymbol("T"));
  [...]
  assertEquals(Value.ACE, Value.forSymbol("A"));
}

If you write the test like this, there is a clear definition of the encoding, but not of the underlying decision for it. Let’s rewrite the test to communicate that the “T” for ten isn’t an arbitrary choice:


@Test
public void parseValueForSymbol() {
  assertEquals(Value._2, Value.forSymbol("2"));
  [...]
  assertEquals(Value.ACE, Value.forSymbol("A"));
}

@Test
public void tenIsRequiredToBeRepresentedByT() {
  assertEquals(Value._10, Value.forSymbol("T"));
}

Just by extracting this encoding to a special test case, you emphasize that you are aware of the “inconsistency”. By the test name, you state that it wasn’t your choice to encode it this way.

The improvement

We definitely want to repeat this experiment again in the future, but with some improvements. One would be that the reengineering phases should be recorded with a screencast software to be able to watch the steps in detail and listen to the discussions without the possibility to interact or influence. Both original teams had great interest in the details of the recreation process and the problems with their tests. The other improvement might be an easing on the time axis, as with the recorded implementation phases, there would be no need for a direct observation by a game master or even a concurrent performance. The tasks could be bigger and a bit more relaxed.

In short: It was fun, challenging, informative and reaffirming. A great experience!

Web apps: Security is more than you think

Security in web apps is an ever increasing important topic: in this post we take a look at injection attacks especially SQL injection, the number one OWASP security problem.

Security in web apps is an ever increasing important topic besides securing the machine or your web/application containers on which your apps run you need to deal with some security related issues in your own apps. In this article we take a look at the number one (according to OWASP)risk in web apps:

Injection attacks

Every web app takes some kind of user input (usually through web forms) and works with it. If the web app does not properly handle the user input malicious entries can lead to severe problems like stealing or losing of data. But how do you identify problems in your code? Take a look at a naive but not uncommon implementation of a SQL query:

query("select * from user_data where username='" + username + "'")

Using the input of the user directly in a query like this is devastating, examples include dropping tables or changing data. Even if your library prevents you from using more than one statement in a query you can change this query to return other users’ data.
Blacklisting special characters is not a solution since you need some of them in your input or there are methods to circumvent your blacklists.
The solution here is to proper escape your input using your libraries mechanisms (e.g. with Groovy SpringJDBC):

query("select * from user_data where username=:username", [username: username])

But even when you escape everything you need to take care what you inject in your query. In this example all data is stored with a key of username.data.

query("select * from user_data where key like :username '.%' ", [username: username])

In this case everything will be escaped correctly but what happens when your user names himself % ? He gets the data of all users.

Is SQL the only vulnerable part of your app? No, every part which interprets your input and executes it is vulnerable. Examples include shell commands or JavaScript which we will look at in a future blog post.

As the last query showed: besides using proper escaping, setting your mind for security problems is the first and foremost step to a secure app.

A small test saves the day

You think a method is too trivial to write a test for it? Think again if the method is mission-critical!

Just recently, I had to write a connection between an existing application and a new hardware unit. This is a fairly common job for our company, even considering the circumstances that I’d never even seen the hardware, let alone being able to connect to it. The hardware unit itself was rather big and it was installed in a security sensitive area with restricted access. So, I only got a specification of the protocol to use and a description of the hardware’s features.

Our common procedure to include hardware dependent modules into an application is to write two implementations of the module: One implementation is the real deal and interacts with the hardware over ethernet, USB, serial port or whatever proprietary communication device is used. This version of the module can only work as intended if the hardware is present. The other implementation acts as an emulation of the hardware, without any dependencies. If you are familiar with unit tests, think of a big test mock. The emulation version is used during development to test and run the application without requirements about the hardware. There are a lot of subtle pitfalls to consider and avoid, but on a bird-view level of abstraction, these interchangeable implementations of a module enable us to develop software with hardware dependencies without need for the actual hardware.

The first piece of code that’s used of a module is a factory/builder class that chooses between the available implementations, based on some configuration entry (or hardware availability, etc.). A typical implementation of the responsible method might look like this:


public HardwareModule createFor(ModuleConfiguration configuration) {
  if (configuration.isHardwarePresent()) {
    new RealHardwareModule();
  }
  return new EmulatedHardwareModule();
}

If the configuration object says that the hardware is present, the real implementation is used, subsequentially opening a connection to the hardware and talking the client side of the given protocol. Otherwise, the emulation is created and returned, maybe opening a debug GUI window to display certain internal states and values and providing controls to mess with the application during development.

The method itself looks very innocent and meager. There is not much going on, so what could possibly go wrong?

I’m not the most eager test-driven developer in the world, I have to admit. But I see the value of tests (and unit tests in particular) and adhere to the A-TRIP rules defined by Andy Hunt and (pragmatic) Dave Thomas:

  • Automatic
  • Thorough
  • Repeatable
  • Independent
  • Professional

For a complete definition of the rules, read the linked blog entry or, even better, buy the book. It’s small and cheap, but contains a lot of profound basic knowledge about unit testing.

The “Thorough” rule is more of a rule of thumb than a hard scientific formula for good unit tests: Always write a test if you’ve found a bug or if the code you’re writing is mission-critical. This was when my gut feeling told me that while the method above might seem trivial, it is definitely essential for the hardware module. So I wrote a test:

  @Test
  public void providesEmulationIfUnspecified() {
    HardwareModuleFactory factory = new HardwareModuleFactory();
    HardwareModule hardware = factory.createFor(configuration(""));
    assertEquals("not the hardware emulation", EmulatedHardwareModule.class, hardware.getClass());
  }

  @Test
  public void providesEmulationIfHardwareAbsent() {
    HardwareModuleFactory factory = new HardwareModuleFactory();
    HardwareModule hardware = factory.createFor(configuration("hardware.present=false"));
    assertEquals("not the hardware emulation", EmulatedHardwareModule.class, hardware.getClass());
  }

  @Test
  public void providesRealImplementationIfHardwarePresent() {
    HardwareModuleFactory factory = new HardwareModuleFactory();
    HardwareModule hardware = factory.createFor(configuration("hardware.present=true"));
    assertEquals("not the real hardware implementation", RealHardwareModule.class, hardware.getClass());
  }

To my surprise, the test immediately went red for the third test method. After double-checking the test code, I was certain that the test was correct. The test discovered a bug in the production code. And being a mostly independent unit test, it pointed to the problematic lines right away: the method implementation above. The helper method named configuration() spared in the code sample was very unlikely to contain a bug.

After a short moment of reading the code again, I corrected it (note the added return statement in line 3):


public HardwareModule createFor(ModuleConfiguration configuration) {
  if (configuration.isHardwarePresent()) {
    return new RealHardwareModule();
  }
  return new EmulatedHardwareModule();
}

This might not seem like the most disastrous bug ever, but it would have made for a nasty start when I finally would have tried the application with the real hardware. There is nothing more valueable than to be able to keep your cool “in the wild” and work on the real problems like faulty protocol specifications or unexpected/undocumented hardware behaviour. So, my gut feeling (and the Thorough rule) were right and my brain, telling me “skip this petty test” longer than I like to admit, was wrong. A small test for a small method paid off immediately and saved the day, at least for me.

Testing on .NET: Choosing NUnit over MSTest

We sometimes do smaller .NET projects for our clients even though we are mostly a Java/JVM shop. Our key infrastructure stays the same for all projects – regardless of the platform. That means the .NET projects get integrated into our existing continuous integration (CI) infrastructure based on Jenkins. This works suprisingly well even though you need a windows slave and the MSBuild plugin.

One point you should think about is which testing framework to use. MSTest is part of Visual Studio and provides nice integration into the IDE. Using it in conjunction with Jenkins is possible since there is a MSTest plugin for our favorite CI server. One downside is that you need either Visual Studio itself or the Windows SDK (500MB download, 300MB install) installed on the build server in addition to .NET. Another one is that it does not work with the “Express” editions of Visual Studio. Usually that is not a problem for companies but it raises the entry barrier for open source or other non-profit projects by requiring relatively expensive Visual Studio licences.

In our scenarios NUnit proved much lighter and friendlier in installation and usage. You can easily bundle it with your sources to improve self-containment of the project and lessen the burden on the system and tools. If you plug the NUnit tool into the external tools-section of Visual Studio (which also works with Express) the integration is acceptable, too.

Conclusion

If you are not completely on the full Microsoft stack for you project infrastructure using Visual Studio, TeamCity, Sourcesafe et al. it is worth considering choosing NUnit over MSTest because of its leaner size and looser coupling to the Mircosoft stack.

Antipatterns: Convenience Constructors

Lately I stumble a lot upon code I wrote 4 or more years ago. In the light of introducing new features the code gets tested for its quality. One antipattern I’ve found which I had used in the past but which is really hard to extend is convenience constructors.

Lately I stumble a lot upon code I wrote 4 or more years ago. In the light of introducing new features the code gets tested for its quality. One antipattern I’ve found which I had used in the past but which is really hard to extend is convenience constructors. Take a constructor for a command object for example:

    public SetProperty(String filename, String key, String value) {
        this(filename, key, value, null);
    }

    public SetProperty(String filename,
            String key, String value, String comment) {
        this(filename, ReferenceTo.key(key), value, comment);
    }

    public SetProperty(String filename,
            String sectionType, String sectionName,
            String key, String value) {
        this(filename, sectionType, sectionName, key, value, null);
    }

    public SetProperty(String filename,
            String sectionType, String sectionName,
            String key, String value, String comment) {
        this(filename, ReferenceTo.sectionAndKey(sectionType, sectionName, key), value, comment);
    }

    public SetProperty(String filename,
            AdvancedPropertyReference propertyReference,
            String value, String comment) {
        this(filename, propertyReference, value, comment);
    }

    public SetProperty(String filename,
            AdvancedPropertyReference propertyReference,
            String value, String comment) {
        super(filename);
        this.propertyReference = propertyReference;
        this.value = value;
        this.comment = comment;
    }

We need to add a new feature which enables us to append properties not just set and replace them. One way could be to extend the class. But this is overkill. Just adding a new parameter flag should suffice. But this would blow up the number of constructors because you need to include a version with and without the new parameter for each (used) constructor. Here an old friend comes to the rescue: design patterns. Looking at the GoF book shows a good solution to the problem: the builder pattern.

public class SetPropertyBuilder {
    private final String filename;
    private String sectionType;
    private String sectionName;
    private String referenceKey;
    private String value;
    private String comment;
    private boolean append;

    public SetPropertyBuilder(String filename) {
        super();
        this.filename = filename;
    }

    public SetPropertyBuilder set(String key, String newValue) {
        this.referenceKey = key;
        this.value = newValue;
        return this;
    }

    public SetPropertyBuilder append(String key, String additionalValue) {
        set(key, additionalValue);
        this.append = true;
        return this;
    }

    public SetPropertyBuilder inSection(String type, String name) {
        this.sectionType = type;
        this.sectionName = name;
        return this;
    }

    public SetProperty build() {
        AdvancedPropertyReference reference = ReferenceTo.key(this.referenceKey);
        if (this.sectionType != null && this.sectionName != null) {
            reference = ReferenceTo.sectionAndKey(this.sectionType, this.sectionName, this.referenceKey);
        }
        return new SetProperty(this.filename, reference, this.value, this.comment, this.append);
    }
}

Now we can eleminate all but one constructor from the SetProperty command. Adding a new property now yields one new method in the builder.