From ugly to pretty – Three steps is all it takes

A story about what can happen if you challenge your students to improve inferior code. With just three simple steps, the code gets beautiful.

makeupI hold lectures in software engineering for over a decade now. One major topic is testing, specifically unit tests. Other corner stones are refactorings and code readability. So whenever I have the chance to challenge my students in cross-topic aspects of software development, it’s almost always a source of insight for them and especially for me. But one golden moment holds a special place in my memory. This is the (rather elaborate, sorry) story of this moment.

During a lecture about unit tests with JUnit, my students had the task to develop tests for a bank account class. That’s about as boring as testing can be – the account was related to a customer and had a current balance. The customer can withdraw money, but only some customers can overdraw their account. To spice things up a bit, we also added the mock object framework EasyMock to the mix. While I would recommend other mock frameworks for production usage, the learning curve of EasyMock is just about right for first time exposure in a “sheep dip” fashion.

Our first test dealt with drawing money from an empty account that can be overdrawn:

@Test
public void canWithdrawOnCredit() {
  Customer customer = EasyMock.createMock(Customer.class);
  EasyMock.expect(customer.canOverdraw()).andReturn(true);
  EasyMock.replay(customer);
  Account account = new Account(customer);
  Euro required = new Euro(30);

  Euro cash = account.withdraw(required);

  assertEquals(new Euro(30), cash);
  assertEquals(new Euro(-30), account.balance());
  EasyMock.verify(customer);
}

The second test made sure that this withdrawal behaviour only works for customers with sufficient credit standing. We decided to pay out nothing (0 Euro) if the customer tries to withdraw more money than his account currently holds:

@Test
public void cannotTakeUpCredit() {
  Customer customer = EasyMock.createMock(Customer.class);
  EasyMock.expect(customer.canOverdraw()).andReturn(false);
  EasyMock.replay(customer);
  Account account = new Account(customer);
  Euro required = new Euro(30);

  Euro cash = account.withdraw(required);

  assertEquals(Euro.ZERO, cash);
  assertEquals(Euro.ZERO, account.balance());
  EasyMock.verify(customer);
}

As you can tell, a lot of copy and paste was going on in the creation of this test. Just look at the name of the local variable “required” – it’s misleading now. Right up to this point, my main topic was the usage of the mock framework, not perfect code. So I explained the five stages of normalized mock-based unit tests (initialize, train mocks, execute tested code, assert results, verify mocks) and then changed the topic by expressing my displeasure about the duplication and the inferior readability of the code (it even tries to trick you with the “required” variable!). Now it was up to my students to improve our situation (this trick works only a few times for every course before they preventively become even pickier than me). A student accepted the challenge and gave advice:

First step: Extract Method refactoring

The obvious first step was to extract the duplication in its own method and adjust the calls by their parameters. This is an easy refactoring that will almost always improve the situation. Let’s see where it got us. Here is the extracted method:

protected void performWithdrawalTestWith(
    boolean customerCanOverdraw,
    Euro amountOfWithdrawal,
    Euro expectedCash,
    Euro expectedBalance) {
  Customer customer = EasyMock.createMock(Customer.class);
  EasyMock.expect(customer.canOverdraw()).andReturn(customerCanOverdraw);
  EasyMock.replay(customer);
  Account account = new Account(customer);

  Euro cash = account.withdraw(amountOfWithdrawal);

  assertEquals(expectedCash, cash);
  assertEquals(expectedBalance, customer.balance());
  EasyMock.verify(customer);
}

And the two tests, now really concise:

@Test
public void canWithdrawOnCredit() {
  performWithdrawalTestWith(
      true,
      new Euro(30),
      new Euro(30),
      new Euro(-30));
}

 

@Test
public void cannotTakeUpCredit() {
  performWithdrawalTestWith(
      false,
      new Euro(30),
      Euro.ZERO,
      Euro.ZERO);
}

Well, that did resolve the duplication indeed. But the test methods now lacked any readability. They appeared as if somebody had extracted all the semantics out of the code. We were unhappy, but decided to interpret the current code as an intermediate step to the second refactoring:

Second step: Introduce Explaining Variable refactoring

In the second step, the task was to re-introduce the semantics back into the test methods. All parameters were nameless, so that was our angle of attack. By introducing local variables, we gave the parameters meaning again:

@Test
public void canWithdrawOnCredit() {
  boolean canOverdraw = true;
  Euro amountOfWithdrawal = new Euro(30);
  Euro payout = new Euro(30);
  Euro resultingBalance = new Euro(-30);

  performWithdrawalTestWith(
      canOverdraw,
      amountOfWithdrawal,
      payout,
      resultingBalance);
}

 

@Test
public void cannotTakeUpCredit() {
  boolean canOverdraw = false;
  Euro amountOfWithdrawal = new Euro(30);
  Euro payout = Euro.ZERO;
  Euro resultingBalance = Euro.ZERO;

  performWithdrawalTestWith(
      canOverdraw,
      amountOfWithdrawal,
      payout,
      resultingBalance);
}

That brought back the meaning to the test methods, but didn’t improve readability. The code wasn’t intentionally cryptic any more, but still far from being intuitively understandable – and that’s what really readable code should be. If even novices can read your code fluently and grasp the main concepts in the first pass, you’ve created expert code. I challenged the student to further transform the code, without any idea how to carry on myself. My student hesitated, but came up with the decisive refactoring within seconds:

Third step: Rename Variable refactoring

The third step doesn’t change the structure of the code, but its approachability. Instead of naming the local variables after their usage in the extracted method, we name them after their purpose in the test method. A first time reader won’t know about the extracted method (and preferably shouldn’t need to know), so it’s not in the best interest of the reader to foreshadow its details. Instead, we concentrate about telling the reader a coherent story:

@Test
public void canWithdrawOnCredit() {
  boolean aCustomerThatCanOverdraw = true;
  Euro heWithdraws30Euro = new Euro(30);
  Euro receivesTheFullAmount = new Euro(30);
  Euro andIsNow30EuroInTheRed = new Euro(-30);

  performWithdrawalTestWith(
      aCustomerThatCanOverdraw,
      heWithdraws30Euro,
      receivesTheFullAmount,
      andIsNow30EuroInTheRed);
}

 

@Test
public void cannotTakeUpCredit() {
  boolean aCustomerThatCannotOverdraw = false;
  Euro heTriesToWithdraw30Euro = new Euro(30);
  Euro butReceivesNothing = Euro.ZERO;
  Euro andStillHasABalanceOfZero = Euro.ZERO;

  performWithdrawalTestWith(
      aCustomerThatCannotOverdraw,
      heTriesToWithdraw30Euro,
      butReceivesNothing,
      andStillHasABalanceOfZero);
}

If the reader is able to ignore some crude verbalization and special characters, he can read the test out loud and instantly grasp its meaning. The first lines of every test method are a bit confusing, but necessary given Java’s lack of named parameters.

The result might remind you a lot of Behavior Driven Development notation and that’s probably not by chance. In a few minutes during that programming exercise, my students taught themselves to think in scenarios or stories when approaching unit tests. I couldn’t have taught it any better – instead, I got enlightened by this exercise, too.

Object Calisthenics: Change the way you think

Some time ago I spoke with my colleague about skill sharpening and training the brain to come up with new solutions. He proposed a two hour session at the weekend implementing a small game using object calisthenics.

Rules

The rules are described in The ThoughtWorks Anthology book. Here is the list for quick reference.

  1. Use only one level of indentation per method.
  2. Don’t use the else keyword.
  3. Wrap all primitives and strings.
  4. Use only one dot per line.
  5. Don’t abbreviate.
  6. Keep all entities small.
  7. Don’use any classes with more than two instance variables.
  8. Use first-class collections.
  9. Don’t use any getters/setters/properties.

Most of the rules seemed simple enough. Rules 2 and 5 are standard in Softwareschneiderei, 1, 4, 6 and 8 are stricter versions of common sense, 3 is a tedious object wrapping. The rules I was anxious about were 7 and 9. To increase the learning effect, I added an extra rule to the list that is critical in real life programming:

  1.   Write tests for your code.

It doesn’t matter whether to write test first, test after or even test driven. Only then is the code “value added”.

Experiences

The game was minesweeper. It contains a nice mix of algorithms, data structures and UI. I concentrated the efforts on the algorithmic part. My first step was to analyse and create the needed data structures.

  • The smallest unit is the cell.
  • A cell can be either hidden or revealed, have a mine or be empty.
  • The game field contains such cells in rows and columns.
  • The position of a cell in a field is defined by its coordinate that contains the x and y position.

To associate anything with coordinates the coordinates had to be comparable to each other. Rule 9 forbids exposure of internal state, so the Coordinate class got its equals() and hashCode(). Only the creator of the coordinate had the knowledge about the number of dimensions and the values of the positions. Even the tests had no access to the inner state and tested only those two methods.

Since the revealed flag concept and a mine flag concept had similar properties, I decided not to track cells but to track their flags. Through this architectural decision, I had a field with two flag containers, one for revealed cells and one for cells with mines. An additional benefit was that it was enough to put only the coordinate into the container to mark a cell as a mine.

The next step was to link the parts together and add some behaviour. Setting a mine, then revealing a cell and obtaining the number of mines also. Setting a mine and marking the cell as revealed is a simple task with the containers. Testing that the revealed cell contained the mine was more tricky. To achieve that, the reveal method got an additional parameter, a closure with a hasMine parameter.

public void reveal(final Coordinate coordinate, final CellContainerVisitor revealedCellsVisitor) {
    revealedCells.mark(coordinate);
    visit(coordinate, revealedCellsVisitor);
}

private void visit(final Coordinate coordinate, final CellContainerVisitor revealedCellsVisitor) {
    revealedCellsVisitor.visit(coordinate, hasMineAt(coordinate));
}

@Test
public void containsMines() {
    final CellContainer target = new CellContainer();
    target.placeMineAt(someCoordinate());

    final List<Coordinate> mineCells = new ArrayList<Coordinate>();
    target.reveal(someCoordinate(), (coordinate, hasMine) -> {
        if (hasMine.equals(new HasMine(true))) {
           mineCells.add(coordinate);
        }
    });

    assertThat(mineCells, hasSize(1));
    assertThat(mineCells, contains(someCoordinate()));
}

The next game rule consumed the rest of the session: calculating the number of mines in the neighborhood. The main obstacle was to compute the coordinate of the neighbour. To do this it is necessary to add an offset to a position in a coordinate without exposing its internal structure. In the end I reverted to using more closures.

Conclusion

To achieve my goal I had to reverse the order in which I normally develop business logic: Rule 9 seems to support top-down approach: The interfaces of domain objects are nearly completely dominated by the way they are used by their containers.

Most of the time in this two hour session was spent staring at the screen and to think how to write readable code and readable tests without exposing internal details of the objects. Time well spent.

Designing an API? Good luck!

An API Design Fest is a great opportunity to gather lasting insights what API design is really about. And it will remind you why there are so few non-disappointing APIs out there.

If you’ve developed software to some extent, you’ve probably used dozens if not hundreds of APIs, so called Application Programming Interfaces. In short, APIs are the visible part of a library or framework that you include into your project. In reality, the last sentence is a complete lie. Every programmer at some point got bitten by some obscure behavioural change in a library that wasn’t announced in the interface (or at least the change log of the project). There’s a lot more to developing and maintaining a proper API than keeping the interface signatures stable.

A book about API design

practicalapidesignA good book to start exploring the deeper meanings of API development is “Practical API Design” by Jaroslav Tulach, founder of the NetBeans project. Its subtitle is “Confessions of a Java Framework Architect” and it holds up to the content. There are quite some confessions to make if you develop relevant APIs for several years. In the book, a game is outlined to effectively teach API design. It’s called the API Design Fest and sounds like a lot of fun.

The API Design Fest

An API Design Fest consists of three phases:

  • In the first phase, all teams are assigned the same task. They have to develop a rather simple library with an usable API, but are informed that it will “evolve” in a way not quite clear in the future. The resulting code of this phase is kept and tagged for later inspection.
  • The second phase begins with the revelation of the additional use case for the library. Now the task is to include the new requirement into the existing API without breaking previous functionality. The resulting code is kept and tagged, too.
  • The third phase is the crucial one: The teams are provided with the results of all other teams and have to write a test that works with the implementation of the first phase, but breaks if linked to the implementation of the second phase, thus pointing out an API breach.

The team that manages to deliver an unbreakable implementation wins. Alternatively, points are assigned for every breach a team can come up with.

The event

This sounds like too much fun to pass it without trying it out. So, a few weeks ago, we held an API Design Fest at the Softwareschneiderei. The game mechanics require a prepared moderator that cannot participate and at least two teams to have something to break in the third phase. We tried to cram the whole event into one day of 8 hours, which proved to be quite exhausting.

In a short introduction to the fundamental principles of API design that can withstand requirement evolution, we summarized five rules to avoid the most common mistakes:

  •  No elegance: Most developers are obsessed with the concept of elegance. In API design, there is no such thing as beauty in the sense of elegance, only beauty in the sense of evolvability.
  •  API includes everything that an user depends on: Your API isn’t defined by you, it’s defined by your users. Everything they rely on is a fixed fact, if you like it or not. Be careful about leaky abstractions.
  •  Never expose more than you need to: Design your API for specific use cases. Support those use cases, but don’t bother to support anything else. Every additional item the user can get a hold on is essentially accidental complexity and will sabotage your evolution attempts.
  •  Make exposed classes final and their constructor private: That’s right. Lock your users out of your class hierarchies and implementations. They should only use the types you explicitly grant them.
  •  Extendable types cannot be enhanced: The danger of inheritance in API design is that you suddenly have to support the whole class contract instead of “only” the interface/protocol contract. Read about Liskov’s Substitution Principle if you need a hint why this is a major hindrance.

The introduction closed with the motto of the day: “Good judgement comes from experience. Experience comes from bad judgement.” The API Design Fest day was dedicated to bad judgement. Then, the first phase started.

The first phase

No team had problems to grasp the assignment or to find a feasible approach. But soon, eager discussions started as the team projected the breakability of their current design. It was very interesting to listen to their reasoning.

After two hours, the first phase ended with complete implementations of the simple use cases. All teams were confident to be prepared for the extensions that would happen now. But as soon as the moderator revealed the additional use cases for the API, they went quiet and anxious. Nobody saw this new requirement coming. That’s a very clever move by Jaroslav Tulach: The second assignment resembles a real world scenario in the very best manner. It’s a nightmare change for every serious implementation of the first phase.

The second phase

But the teams accepted the new assignment and went to work, expanding their implementation to their best effort. The discussions revolved around possible breaches with every attempt to change the code. The burden of an API already existing was palpable even for bystanders.

After another two hours of paranoid and frantic development, all teams had a second version of their implementation and we gathered for a retrospective.

The retrospective

In this discussion, all teams laid down arms and confessed that they had already broken their API with simple means and would accept defeat. So we called off the third phase and prolonged the discussion about our insights from the two phases. What a result: everybody was a winner that day, no losers!

Some of our insights were:

  • Users as opponents: While designing an API, you tend to think about your users as friends that you grant a wish (a valid use case). During the API Design Fest, the developers feared the other teams as “malicious” users and tried to anticipate their attack vectors. This led to the rejection of a lot of design choices simply because “they could exploit that”. To a certain degree, this attitude is probably healthy when designing a real API.
  • Enum is a dead end: Most teams used Java Enums in their implementation. Every team discovered in the second phase that Enums are a dead end in regard of design evolution. It’s probably a good choice to thin out their usage in an API context.
  • The most helpful concepts were interfaces and factories.
  • If some object needs to be handed over to the user, make it immutable.
  • Use all the modifiers! No, really. During the event, Java’s package protected modifier experienced a glorious revival. When designing an API, you need to know about all your possibilities to express restrictions.
  • Forbid everything: But despite your enhanced expressibility, it’s a safe bet to disable every use case you don’t want to support, if only to minimize the target area for other teams during an API Design Fest.

The result

The API Design Fest was a great way to learn about the most obvious problems and pitfalls of API design in the shortest possible manner. It was fun and exhausting, but also a great motivator to read the whole book (again). Thanks to Jaroslav Tulach for his great idea.

Solutions to common Java enum problems

More readable solutions to using enums with attributes for categorization or representation.

Say, you have an enum representing a state:

enum State {
  A, B, C, D;
}

And you want to know if a state is a final state. In our example C and D should be final.
An initial attempt might be to use a simple method:

public boolean isFinal() {
	return State.C == this || State.D == this;
}

When there are two states this might seem reasonable but adding more states to this condition makes it unreadable pretty fast.
So why not use the enum hierarchy?

A(false), B(false), C(true), D(true);

private boolean isFinal;

private State(boolean isFinal) {
  this.isFinal = isFinal;
}

public boolean isFinal() {
  return isFinal;
}

This was and is in some cases a good approach but also gets cumbersome if you have more than one attribute in your constructor.
Another attempt I’ve seen:

public boolean isFinal() {
        for (State finalState : State.getFinalStates()) {
            if (this == finalState) {
                return true;
            }
        }
        return false;
    }

    public static List<State> getFinalStates() {
        List<State> finalStates = new ArrayList<State>();
        finalStates.add(State.C);
        finalStates.add(State.D);
        return finalStates;
    }

This code gets one thing right: the separation of the final attribute from the states. But it can be written in a clearer way:

List<State> FINAL_STATES = Arrays.asList(C, D)

public boolean isFinal() {
	return FINAL_STATES.contains(this);
}

Another common problem with enums is constructing them via an external representation, e.g. a text.
The classic dispatch looks like this:

    public static State createFrom(String text) {
        if ("A".equals(text) || "FIRST".equals(text)) {
            return State.A;
        } else if ("B".equals(text)) {
            return State.B;
        } else if ("C".equals(text)) {
            return State.C;
        } else if ("D".equals(text) || "LAST".equals(text)) {
            return State.D;
        } else {
            throw new IllegalArgumentException("Invalid state: " + text);
        }
    }

Readers of refactoring sense a code smell here and promptly want to refactor to a dispatch using the hierarchy.

A("A", "FIRST"),
B("B"),
C("C"),
D("D", "LAST");

private List<String> representations;

private State(String... representations) {
  this.representations = Arrays.asList(representations);
}

public static State createFrom(String text) {
  for (State state : values()) {
    if (state.representations.contains(text)) {
      return state;
    }
  }
  throw new IllegalArgumentException("Invalid state: " + text);
}

Much better.

Class names with verbs enforce the Single Responsibility Principle (SRP)

When using fluent code and fluent interfaces, I noticed an increased flexibility in the code. On closer inspection, this is the effect of a well-known principle that is inherently enforced by the coding style.

I’m experimenting with fluent code for a while now. Fluent code is code that everybody can read out loud and understand immediately. I’ve blogged on this topic already and it’s not big news, but I’ve just recently had a revelation why this particular style of programming works so well in terms of code design.

The basics

I don’t expect you to read all my old blog entries on fluent code or to know anything about fluent interfaces, so I’m giving you a little introduction.

Let’s assume that you want to find all invoice documents inside a given directory tree. A fluent line of code reads like this:


Iterable<Invoice> invoices = FindLetters.ofType(
    AllInvoices.ofYear("2012")).beneath(
        Directory.at("/data/documents"));

While this is very readable, it’s also a bit unusual for a programmer without prior exposure to this style. But if you are used to it, the style works wonders. Let’s see: the implementation of the FindLetters class looks like this (don’t mind all the generic stuff going on, concentrate on the methods!):

public final class FindLetters<L extends Letter> {
  private final LetterType<L> parser;

  private FindLetters(LetterType<L> type) {
    this.parser = type;
  }

  public static <L extends Letter> FindLetters<L> ofType(LetterType<L> type) {
    return new FindLetters<L>(type);
  }

  public Iterable<L> beneath(Directory directory) {
    ...
  }

Note: If you are familiar with fluent interfaces, then you will immediately notice that this isn’t even a full-fledged one. It’s more of a (class-level) factory method and a single instance method.

If you can get used to type in what you want to do as the class name first (and forget about constructors for a while), the code completion functionality of your IDE will guide you through the rest: The only public static method available in the FindLetters class is ofType(), which happens to return an instance of FindLetters, where again the only method available is the beneath() method. One thing leads to another and you’ll end up with exactly the Iterable of Invoices you wanted to find.

To assemble all parts in the example, you’ll need to know that Invoice is a subtype of Letter and AllInvoices is a subtype of LetterType<Invoice>.

The magical part

One thing that always surprised me when programming in this style is how everything seems to find its place in a natural manner. The different parts fit together really well, especially when the fluent line of code is written first. Of course, because you’ll design your classes to make everything fitting. And that’s when I had the revelation. In hindsight, it seems rather obvious to me (a common occurrence with revelations) and you’ve probably already seen it yourself.

The revelation

It struck me that all the pieces that you assemble a fluent line of code with are small and single-purposed (other descriptions would be “focussed”, “opinionated” or “determined”). Well, if you obey the Single Responsibility Principle (SRP), every class should only have one responsibility and therefore only limited purposes. But now I know how these two things are related: You can only cram so much purpose (and responsibility) in a class named FindLetters. When the class name contains the action (verb) and the subject (noun), the purpose is very much set. The only thing that can be adjusted is the context of the action on the subject, a task where fluent interfaces excel at. The main reason to use a fluent interface is to change distinct aspects of the context of an object without losing track of the object itself.

The conclusion

If the action+subject class names enforce the Single Responsibility Principle, then it’s no wonder that the resulting code is very flexible in terms of changing requirements. The flexibility isn’t a result of the fluency or the style itself (as I initially thought), but an effect predicted and caused by the SRP. Realizing that doesn’t invalidate the other positive effects of fluent code for me, but makes it a bit less magical. Which isn’t a bad thing.

Readable Code Needs Time and Care

A few weeks ago I was about to write an acceptance test involving socket communication. Since I was only interested in a particular sequence of exchanged data, I needed to wait for the start command and ignore all information sent prior to that command. In this blog post I’d like to present the process of enhancing the readability of the tiny piece of code responsible for this task.

The first version, written without thinking much about readability looked something like the following:

private void waitForStartCommand(DataInputStream inputStream) {
  String content = inputStream.readUTF();
  while (!START_COMMAND.equals(content)) {
    content = inputStream.readUTF();
  }
}

The aspect that disturbed me most about this solution was calling inputStream.readUTF() twice (Remember: DRY). So I refactored and came up with:

private void waitForStartCommand(DataInputStream inputStream) {
  String content = null;
  do {
    content = inputStream.readUTF();
  } while (!START_COMMAND.equals(content)) {
}

In this version the need to declare and initialize a variable grants far too much meaning to an unimportant detail. So, a little refactoring resulted in the final version:

private void waitForStartCommand(DataInputStream inputStream) {
  while (startCommandIsNotReadOn(inputStream)) {
    continue;
  }
}

private boolean startCommandIsNotReadOn(DataInputStream inputStream) {
  return !START_COMMAND.equals(inputStream.readUTF());
}

This example shows pretty well how even rather simple code may need to be refactored several times in order to be highly readably and understandable. Especially code that handles more or less unimportant side aspects, should be as easily to understand as possible in order to avoid conveying the impression of being of major importance.

Readability of Boolean Expressions

Readability of boolean expressions lies in the eyes of the beholder.

Following up on various previous posts on code readability and style I want to provide two more examples today – this time under the common theme of “handling of boolean values”.

Consider this (1a):

bool someMethod()
{
  if (expression) {
    return true;
  } else {
    return false;
  }
}

Yes, there are people who consider this more readable than (1b)

bool someMethod()
{
  return (expression);
}

Another example is this (2a):

  if (someExpression() == true)
    ...

versus my preferred version (2b):

  if (someExpression())
    ...

So what could be the reason for these different viewpoints? One explanation I thought of is as follows: Let’s say you have a background in C and you are therefore used to do something like:

#define FALSE (0)
#define TRUE (!FALSE)

In other words, you may not see boolean as a type of its own, like int and double, with a well-defined value range. Instead you see it more like an enumerated type which makes it feel very naturally do a expression == true comparison.

At the same time it feels not very natural to see the result of a boolean expression as being of type bool with all the consequences – e.g. to be able to return it immediately as in the first example.

Another explanation is that 1a and 2a are as verbose as it can be. You don’t have to make any mental efforts to understand what the code does.

While these may be possible explanations, my guess is that most of you, like me,  still see 1a and 2a as unnecessary visual clutter and consider 1b and 2b as far more readable.

Separate your code domains

You can improve your code reusability by separating the technical domain code from the business domain code. This article tries to explain how to start.

When you develop software, you most likely have to think in two target domains at the same time. One domain will be the world of your stakeholder. He might talk about business rules and business processes and business everything, so lets call it the business domain. The other domain is the world you own exclusively with your colleagues, it’s the world of computers, programming languages and coding standards. Lets call it the technical domain. It’s the world where your stakeholders will never follow you.

Mixing the domains

Whenever you create source code, you probably try to solve problems in the business domain with the means of your technical domain – e.g. the programming language you’ve chosen on the hardware platform you anticipate the software to run on. Inevitably, you’ll mix parts of the business domain with parts of the technical domain. The main question is – will it blend? Most of the time, the answer is yes. Like milk in coffee, the parts of two domains will blend into an inseparable mixture. Which isn’t necessarily a bad thing – your solution works just fine.

The hard part comes when you want to reuse your code. It’s like reusing the milk in your coffee, but without the coffee. You’ve probably done it, too (reusing domain-blended code, not extracting the milk from your coffee) and it wasn’t the easy “just copy it over here and everything’s fine” reusability you’ve dreamt of.

Separating the domains

One solution for this task begins by realizing which code belongs to which domain. There isn’t a clear set of rules that you can just check and be sure, but we’ve found two rules of thumb helpful for this decision:

  • If you have a strong business domain data type model in your code (that is, you’ve modelled many classes to directly represent concepts and items from your stakeholder’s world), you can look at a line of code and scan for words from the business domain. If there aren’t any, chances are that you’ve found a line belonging to the technical domain. If you prefer to model your data structures with lists and hashmaps containing strings and integers, you’re mostly out of luck here. Hopefully, you’ve chosen explicit names for your variables, so you don’t end with a line stating map.get(key), when in fact, you’re looking up orders.getFor(orderNumber).
  • For every line of code, you can ask yourself “do I want to write it or do I have to write it?”. This question assumes that you really want to solve the problems of the business domain. Every line of code you just have to write because otherwise, the compiler, the QA department, your colleagues or, ultimately, your coder idol of choice would be disappointed is a line from the technical domain. Every line of code that would only disappoint your stakeholder if it would be missing is a line from the business domain. Most likely, everything that your business-driven tests assert is code from the business domain.

Once you categorized your lines of code into their associated domain, you can increase the reusability of your code by separating these lines of code. Effectively, you try to avoid the blending of the parts, much like in a good latte macchiato. If you achieve a clear separation of the different code parts, chances are that you have come a long way to the anticipated “copy and paste” reusability.

Example one: Local separation

Well, all theory is nice and shiny, but what about the real (coding) life? Here are two examples that show the mechanics of the separation process.

In the first example, we’re given a compressed zip file archive as an InputStream. Our task is to write the archive entries to disk, given that certain rules apply:

public void extractEntriesFrom(InputStream in) {
    ZipInputStream zipStream = new ZipInputStream(in);
    try {
         ZipEntry entry = null;
         while ((entry = zipStream.getNextEntry()) != null) {
             if (rulesApplyFor(entry)) {
                 File newFile = new File(entry.getName());
                 writeEntry(zipStream,
                      getOutputStream(basePath(), newFile));
             }
             zipStream.closeEntry();
         }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        IOHandler.close(zipStream);
    }
}

This is fairly common code, nothing to be proud of (we can argue that the method signature isn’t as explicit as it should be, the exceptions are poorly handled, etc.), but that’s not the point of this example. Try to focus your attention to the domain of each code line. Is it from the business or the technical domain? Let me refactor the example to a form where the code from both domains is separated, without changing the additional flaws of the code:

public void extractEntriesFrom(InputStream in) {
    ZipInputStream zipStream = new ZipInputStream(in);
    try {
         ZipEntry entry = null;
         while ((entry = zipStream.getNextEntry()) != null) {
             handleEntry(entry, zipStream);
             zipStream.closeEntry();
         }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        IOHandler.close(zipStream);
    }
}

protected void handleEntry(ZipEntry entry,
        ZipInputStream zipStream) throws IOException {
    if (rulesApplyFor(entry)) {
        File newFile = new File(entry.getName());
        writeEntry(zipStream,
            getOutputStream(basePath(), newFile));
    }
}

In this version of the same code, the method extractEntriesFrom(…) doesn’t know anything about rules or how to write an entry to the disk. Everything that’s left in the method is part of the technical domain – code you have to write in order to perform something useful within the business domain. The new method handleEntry(…) is nearly free of technical domain stuff. Every line in this method depends on the specific use case, given by your business domain.

Example two: Full separation

Technically, the first example only consisted of a simple refactoring (Extract Method). But by separating the code domains, we’ve done the first step of a journey towards code reusability. It begins with a simple refactoring and ends with separated classes in separated packages from two separated project parts, named something like “application” and “framework”. Even if you only find a class named “Tools” or “Utils” in your project, you’ve done intermediate steps towards the goal: Separating your technical domain code from your business domain code in order to reuse the former (because no two businesses are alike).

The next example shows a full separation in action:

WriteTo.file(target).using(new Writing() {
    @Override
    public void writeTo(PrintWriter writer) {
        writer.println("Hello world!");
        writer.println("Hello second line.");
        // more business domain code here
    }
});

Everything other than the first line (and the necessary java boilerplate) is business domain code. In the first line, only the specified target file isn’t technical. Everything related to opening the file output stream, handling exceptions, closing all resources and all the other fancy stuff you want to do when writing to a file is encapsulated in the WriteTo class. The equivalent to the handleEntry(…) method from the first example is the writeTo(…) method of the Writing interface. Everything within this method is purely business domain related code. The best thing is: you can nearly forget about the technical domain when filling out the method, as it is embedded in a reusable “code clamp” providing the proper context.

Conclusion

If you want to write reusable code, consider separating your two major code domains: the technical domain and the business domain. The first step is to be aware of the domains and distinguish between them. Your separation process then can start with simple extractions and finally lead to a purely technical framework where you “only” have to fill in the business domain code. By the way, it’s a variation of the classic “separation of concerns” principle, if you want to read more.

How I met my coding style

One of my students recently asked me where I got my coding style from. This blog post tries to answer that really good question.

One of my students recently asked a question that really stuck in my brain: “Where did you get your coding style from?”. The best part of the question was that I didn’t have an answer, until now. I care about coding style a lot and try to talk about it, too. Here are some coding style related blog posts to give you some examples:

Code squiggles were a crazy idea that really provided readability value even after the initial excitement was gone.

Readable code means that you can read the code out loud and it makes sense to (nearly) everyone.

The student wanted to know how to gather the experience to write readable, elegant or just plain crazy code. The answers he anticipated were books or “trial and error”. I’ve come to the conclusion that, while both sources provided enormous amounts of inspiration and knowledge, there’s another single vital ingredient that rounds up the mix: care.

The simple answer to the question is that I cared enough about coding style to gather some knowledge in this field. The more adequate answer is that a lot of factors helped me to improve my coding style.

Books with code examples

Early in my career, I was always looking for programming books with lots of code examples. This is a typical pattern for the novice and advanced beginner stages in the Dreyfus model of skill acquisition. I felt confident when I understood a piece of code and knew that I could produce something similar, given the proper amount of time.

It took a few years of exposure to actual real-life programming to discover that most code examples in books are rubbish. The code exists only as a “textbook example” for the given problem, it isn’t meant to be actually used outside the (often narrow) context. If you stick to copy&paste programming, your code will have the appeal of a patchwork clothing made out of rags. It might fulfill the requirement, but nothing more. You can’t learn style from programming books alone. This isn’t the fault of the books, by the way. Most programming books aren’t about style, but about programming or solving programmer’s problems.

There are a few precious exceptions from the general trend of bad code snippets. For example, Robert C. Martin’s “Clean Code” is a recent book with mostly well-done code listings. The best way to separate ugly code from nice one in books is to re-read them a few years later and try to improve the printed source code.

I encourage you to read as many books about programming as you can possibly handle. Even the bad ones will have an impact and inspire you in some way or the other. Books are like mentors, whereas good books are good mentors. Reading a programming book again after some time will show you how much knowledge you gained in the meantime and how much there’s still to be discovered.

Other people’s source code

The large amount of open source software available to be read in the original version is a gift to every programmer. You can dissect the source code of rock star programmers, scroll over the vast textual deserts of “enterprise code” or digest little gems of pure genius inside otherwise rather dull projects, all without additional cost except the time and attention you’re willing to invest.

When you reach a level of code reading when other people’s code “opens up” to you and you start to see the “deeper motives” or the “bigger picture”, whatever you call it, that’s a magical moment. Suddenly, you don’t read other people’s code, but other people’s minds as they lay out the fundamentals of their software. You’ll begin to read and understand code in larger and larger blocks and see structures that were right there before, but unbeknownst to you.

I doubt you can gain this ability by working with textbook examples, as they are restricted in scope by the very nature of limited print space and a specific topic. The one book that accomplishes something equivalent is “Growing Object-Oriented Software, Guided by Tests” by Steve Freeman and Nat Pryce. It’s a rare gem of truly holistic code examples mixed with the essence of many years of experience.

Other people

As beneficial as reading other people’s source code is, talking with them about it raises the whole experience to another level. If you can get in synch well enough to get past the initial buzzword bombing to show off your leetness, you can exchange coding experience worth many weeks in just a few minutes. And while you are talking with them, why not grab a notebook and type away your ideas?

Most programmers I’ve met, regardless of their skill level, could teach me something. And I’ve had several enlightening moments of novice programmers pointing out something or asking a question in a way that really inspired me. You’ll have to listen in order to learn, as painful or overwhelming as it may be sometimes.

Most people are very shy and insecure about their abilities. Encourage them to tell you more and show your interest in their work. Several noteworthy elements of my coding style were adopted late at night at a bar, when a fellow programmer finally had enough beer to brag about his code.

Own achievements

This one will hurt. Remember the times when you shouted out loud “what the f**k?” over some idiot’s source code? Let this idiot be yourself some months or a year ago. Read your own code. Refactor your own code. Rewrite your own code. It’s the same proceeding our brain performs when we dream at night: It rehashes old memories and experiences and lives through it again, in time lapse mode. As long as you don’t dream about programming (I’ve started to code in my dreams very early and it keeps getting more abstract over the time), the only way to rehash your old code it to live through it again by working with it again.

You’ll put many past achievements into perspective, remembering how proud you were and being embarrassed now. That’s part of the learning process and really shows your progress. Just be aware that today’s triumph will undergo the same transformation sooner or later.

Practice is a big part of gaining experience. And practicing means playing around, making errors, trying crazy new things and generally questioning everything you’ve learnt so far. Try to allocate as much practicing time as you can get (besides reading those books!) and really hone your skills. This works exceptionally well with other people, at a code camp, a code retreat, a hackathon, whatever you call it.

The right mindset

This is the secret ingredient to all the things listed above. Without the right mindset, everything in addition to your day-to-day work will appear like a chore. This doesn’t mean that you’ll have to sacrifice all your spare time to improve your coding style. It just means that you won’t mind reading a good book about programming at the weekend if you’re really enjoying it. You cannot learn this attitude from books or even blog posts, it’s a passion that you’ll have to develop on your own.

I can try to describe a big part of my passion: Nothing is good enough. There is never a “good enough”. I’m not falling into despair over it, it’s just an endless challenge for me. Tomorrow, I will be a little bit better than today. But even tomorrow, I’m not “good enough” and can still improve. Sometimes, my improvement rate is neglectable for a long time when suddenly an inspiration completely rearranges my (programming) world.

I made my last giant leaps in coding style when I deliberately avoided parts of my usual habits during programming and tried to focus on what I really wanted to do right now and how to express it in the best way possible (for me). The result was astonishing and humbling at the same time: there’s so much knowledge to gain, there’s so little I know. But yet, I keep getting better and that really makes me complete.

Looping in C++

What is “the best” way to loop over collections in C++?

One recurring discussion point in one of our customers C++ project team is the following:

What is “the best” way to loop over collections?

In a typical scenario there is a standard container like std::list, or some equivalent collection, and the task is to do something with every element in the collection. The straight forward way would be like this:

std::list<std::string> mylist;
for (std::list<std::string>::iterator iter = mylist.begin(); iter != mylist.end(); iter++)
{
   ...
}

This code is correct and readable. But my guess is that most of you instantly see at least two possible improvements:

  1. the call to mylist.end() occurs in every loop an can be expensive e.g. in case of long std::lists
  2. iter++ creates one unnecessary intermediate object on the stack

So this

for (std::list<std::string>::iterator iter = mylist.begin(), end = mylist.end(); iter != end; ++iter)
{
   ...
}

would be much better but can already be seen as a little less readable.

Using BOOST_FOREACH can save you much of this still tedious code but has one nasty pitfall when it comes to std::maps.

In some places of the code base std::for_each is used together with a function, or function object.  The downside of this is that the function/function object code is not located where the loop occurs. However, this can be made “readable enough” when the function, or function object does only one thing and has a telling name.

Looping is sometimes done to create other collections of objects for each element. What to do there? Define the new collection use a for-loop of BOOST_FOREACH like above, or use std::transform with the same downside as std::for_each?

The other day one team member suggested to use boost::lambda expressions in loops. The initial usage examples where very promising but let me tell you – readability can drop dramatically very fast if you don’t be careful. It is very easy to get carried away with boost’s lambdas. I happened that we found ourselves having spent the last hour to carve out a super crisp lambda expression that takes anybody else another hour to read.

So the initial question remains undecided and will most likely stay like that. As for everything else in programming, there doesn’t seem to be a silver bullet for this task.

How do you go about looping in C++? Do you have some kind of coding style in place? Do you use std::for_each, BOOST_FOREACH, or some other means?

Looking forward to some feedback.