What dependent types can do for you

In a way, this post is also about Test Driven Development and *Type* Driven Development. While the two share the same acronym, I always thought of them as different concepts. However, as I recently experienced, when the two are used in a dependently typed language, the transition between them becomes fluid.

While I will talk about programming in the dependently typed language Agda, not much is needed to follow what is going on – I will just walk through an exercise and explain everything along the way.

The exercise I want to use is here. It is about a submarine, its position, and commands that change that position. Examples of commands are forward 1, down 2 and up 3. These ‘values’ can be used just like that, given the following definition of the type of commands:

      data Command : Set where
        forward : Nat -> Command
        up      : Nat -> Command
        down    : Nat -> Command

Agda can be used in a very mathy way – this should really be read as saying that the type of commands is a Set and that there are three constructors (highlighted green) which take a natural number as argument and produce a command. Since function application is just juxtaposition, we can now make the following definitions:

  justSomeCommand = forward 5
  anotherOne = up 1

Now the exercise text explains how these commands are applied to the position of the submarine. Working as a software developer, I have built the habit of turning specifications like that into tests. Since I don’t know any better, I just wrote ‘tests’ in Agda, using equations to translate the exercise text – I’ll explain the syntax below:

  apply (forward 5) (pos 0 0) ≡ pos 5 0
  apply (down 5) (pos 5 0) ≡ pos 5 5
  apply (forward 8) (pos 5 5) ≡ pos 13 5

Note that the triple equal sign is different from what we used above. Roughly, this is because it denotes the proposition that some things are equal, while the normal equal sign above was used to make definitions. The code doesn’t type check as is: we haven’t defined ‘apply’, and it is not valid Agda to just write down equations like that. Let’s fix the latter problem first by turning the equations into declarations and definitions. This actually defines elements of the datatype of equality proofs – but you can accept these changes simply as boilerplate we have to add to our equations:

  example1 : apply (forward 5) (pos 0 0) ≡ pos 5 0
  example1 = refl

  example2 : apply (down 5) (pos 5 0) ≡ pos 5 5
  example2 = refl

  example3 : apply (forward 8) (pos 5 5) ≡ pos 13 5
  example3 = refl

Now, to make the examples type-check, we have to define ‘pos’ and ‘apply’. Positions can be defined analogously to commands:

  data Position : Set where
    pos : Nat -> Nat -> Position

(Here, the type of ‘pos’ just tells us that it is a function taking two natural numbers as arguments.) Now we are ready to start with ‘apply’:

  apply : Command → Position → Position
  apply = ?

So apply is a function that takes a ‘Command’ and a ‘Position’ and returns another ‘Position’. For the definition of ‘apply’ I just entered a question mark ‘?’. It is one of my favorite features of Agda that terms can be left out like this before type checking. Agda still checks everything we have given so far and will give us a lot of information about what ‘?’ could be. This is called ‘interacting with a hole’ – because, well, it is a hole in your code, and the type checker is there to tell you which things might fit into it. After type checking, the hole and what Agda tells us about it will look like this:

  apply : Command → Position → Position
  apply = ?

Goal: Command -> Position -> Position

This was type-checked with a couple of imports – see my final version of the code if you want to reproduce it. The first thing Agda tells us is the type of the goal; then there is some mumbling about constraints, with fragments that look like they have something to do with the examples from above. The latter is not information about the hole, but general information from type checking. So let’s look at the examples to see if the type checker has anything to say about them:

(Screenshot: the refl terms in the definitions of the examples are highlighted yellow.)

Something is yellow! This is Agda’s way of telling us that it does not have enough information to decide whether everything is okay. That makes a lot of sense, since we haven’t given a definition of ‘apply’ and these equations are about values computed with ‘apply’. So let us just continue to define ‘apply’ and see if the yellow vanishes. This is analogous to the stage in TDD where your tests don’t pass because your code does not yet compile.

We will use pattern matching on the given ‘Command’ and ‘Position’ to define ‘apply’ – the cases below were generated by Agda (I only changed variable names), and we now have a hole for each case:

  apply : Command → Position → Position
  apply (forward x) (pos h d) = {!!}
  apply (up x) (pos h d) = {!!}
  apply (down x) (pos h d) = {!!}

There are various ways in which Agda can use the information given by types to help us with filling these holes. First of all, we can just ask Agda to make the hole ‘smaller’ if there is a unique canonical way to do so. This will work here, since ‘Position’ has only one constructor. So we get new holes for arguments of the constructor ‘pos’ and can try to fill those.
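
If we refine the first hole this way, the case might look something like this (a sketch of the result, written with the same {!!} hole markers as above):

  apply (forward x) (pos h d) = pos {!!} {!!}

Now there is one hole for the new horizontal coordinate and one for the new depth, and we can tackle them separately.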

Let us focus on the first case and see what happens if we enter something not in line with our tests: if we ask Agda whether ‘h + d’ fits into the first of these holes, it will say no and tell us what the problem is.

While this is essentially the same kind of feedback you would get from a unit test, there are at least two important advantages to note:

  • This is feedback from the type checker, and it is combined with other things the type checker can tell you. You get a lot of feedback at once when you ask Agda whether something you wrote fits into a hole.
  • ‘refl’ is only the simplest case of the proofs you can write in Agda. More complicated ones need some training, but you can go way beyond unit tests and ‘check’ infinitely many cases – or, even better, all cases.
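
For reference, here is one way to fill the holes so that all three examples type-check – a sketch only: the examples don’t constrain the ‘up’ case, so I follow the exercise text, where up decreases the depth, and I write truncated subtraction on Nat as ∸ (the builtin Nat calls it _-_):

  apply : Command → Position → Position
  apply (forward x) (pos h d) = pos (h + x) d
  apply (up x)      (pos h d) = pos h (d ∸ x)
  apply (down x)    (pos h d) = pos h (d + x)

With a definition like this, the yellow highlighting disappears and the three examples are accepted with refl.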

If you want more, just try Agda yourself! One easy way to do that is to use Ingo Blechschmidt’s Agdapad, which lets you try Agda in your browser.

Recap of the Schneide Dev Brunch 2014-06-22

If you couldn’t attend the Schneide Dev Brunch on the 22nd of June, here is a summary of the main topics.

Yesterday, we held another Schneide Dev Brunch at last. The Dev Brunch is a regular brunch on a Sunday, only that all attendees want to talk about software development and various other topics. If you bring a software-related topic along with your food, everyone has something to share. The brunch was smaller this time, but we held the last brunch only three weeks ago. We had bright sunny weather and used our roof garden, but huddled in the shadows. There were lots of topics and chatter. As always, this recapitulation tries to highlight the main topics of the brunch, but cannot reiterate everything that was spoken. If you were there, you will probably find this list inconclusive:

Student again – from full employee to university

One of our attendees worked as a full-time software developer in the recent years and decided to study again. He told us about the practical challenges of an employee turned student:

  • The bureaucracy at universities is highly developed and not on your side. It takes days to accomplish the tiniest step towards matriculation.
  • To listen again. In the developer world, two hours of highly concentrated programming is satisfying, but two hours of concentrated listening to somebody who tells the important stuff only once is very hard. You are allowed to doze off (just like in big meetings), but it won’t do you any good.
  • Higher-level mathematics. Suddenly, all that stuff about Fourier transformation and matrices is very important again.
  • Running on a lower gear. It seems like heaven to replace a 40h work week with a 20h study week, but the irregular pace (one day no lectures, one day lectures around the clock, etc.) will take its toll.
  • Self-organization. Good developers are of course self-organized and know what to do: work on the most important issues in your issue tracker/todo list. But university will not write issues for you and you are the one to fill the todo list. We joked that a “master’s student JIRA” would actually be a good idea.

It was a very entertaining talk and we digressed lots of times. Let’s have a look at some artifacts we came across during our discussion:

  • There seems to be a growing influence of military concepts on management. One book was specifically mentioned: “Turn the ship around” by David Marquet.
  • “Bad work, good work and great work”. It’s a marketing video, but it contains a message nonetheless. One practical piece of advice is to not include “bad work knowledge” in your curriculum vitae, even if you have expertise in it. This minimizes the risk that your next job will contain a lot of “bad work” again.

The current state of JavaFX

Last year, JavaFX was aggressively marketed by Oracle as the next big thing in desktop UI. The claims and promises seem to finally be fulfilled. The combination of Java 8 and the latest JavaFX is especially joy-bringing. The JavaFX core is included in Java 8 and brings a lot of little but essential improvements. You can lay out and design your graphical interfaces with a WYSIWYG editor and store them in XML-based layout files. These layouts are loaded, combined with custom logic and bound to custom data sources. The styling is based on a slightly outdated CSS dialect, but it is very powerful and done right in comparison to styling in past toolkits like Swing or SWT.
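
A minimal sketch of that workflow, just to make it concrete (main.fxml and style.css are hypothetical file names, not something shown at the brunch):

import javafx.application.Application;
import javafx.fxml.FXMLLoader;
import javafx.scene.Parent;
import javafx.scene.Scene;
import javafx.stage.Stage;

public class MainApp extends Application {
  @Override
  public void start(Stage stage) throws Exception {
    // load the XML-based layout created with the WYSIWYG editor
    Parent root = FXMLLoader.load(getClass().getResource("main.fxml"));
    Scene scene = new Scene(root);
    // attach the CSS-based styling
    scene.getStylesheets().add(getClass().getResource("style.css").toExternalForm());
    stage.setScene(scene);
    stage.show();
  }
}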

The best thing about JavaFX is that much less GUI code is needed for more pleasant user experiences. The toolkit feels alive and the details tell that the developers care and eat their own dogfood. Integration in the Eclipse IDE is enhanced by the e(fx)clipse project.

Our attendee has hands-on experience with Swing, SWT/JFace and JavaFX. If pressed to choose the technology for a new project, he would choose JavaFX anytime now.

Efficiency killers

We also traded tales about the most efficient efficiency killers in software development that we actually observed or endured:

  • Taking away notebooks and desktop computers and replacing them with zero clients – for developers that really need their multi-cores and gigabytes.
  • Restricting every employee to one (and only one!) computer. If you happen to choose a notebook, you probably don’t need that extra monitor, do you?
  • Installing a “privilege management” software. Basically, this software works like a firewall against user inputs. You want to install a new printer driver? It will be cheaper and faster to carve your text in stone slabs.
  • Fragmenting the company networks. This is actually a very good idea. You can have a wild-west style network for developers and a “privilege managed” one for management. It gets complicated when you need to cross the canyons every time to get work done. Just imagine that your repository is behind a firewall and you need a clearance every time you want to commit/push.

Introduction to warfare

The last main topic was an overview of a self-study on warfare. And because the typical software developer won’t move whole armies around, it concentrated more on the principles and strategies of “common” warfare, which includes everyday conflicts as well as campaigns for a certain goal (e.g. establishing a technology). Three books serve as stepping stones:

  • “The Art of War” by Sun Tzu. The ancient classic book about warfare. There are probably a dozen different translations of the original Chinese text, but they will only serve as a starter. This book alone will probably not give you deep insights, but you will come back to it often once you venture deeper into the mindset of warriors and generals.
  • “Die Kunst der List” by Harro von Senger. This German book describes the political and rhetorical battle moves that are often used in everyday life. These so-called stratagems can be identified and parried if you know about them. Most of us can react to some stratagems thanks to lessons learnt, but it certainly helps to be keen about the rest of them, too.
  • “The 33 Strategies of War” by Robert Greene. This book doesn’t mess around. It is part of the “immoral series” of books about topics that don’t get discussed in this clarity often. You’ll learn much about strategies, their actual application in history (not only on battlefields, but also on movie sets, in offices and on political stages) and how to counter them. The book has a lot of content and many ideas and concepts to think about. And it contains an exhaustive list of literature to continue reading.

Is TDD dead?

We continued our discussion about the debate around David Heinemeier Hansson’s frontal attack on the hype around Test Driven Development. There were some new insights and an improved understanding of the manifold contents of the hangout discussions.

Epilogue

As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The high number of attendees makes for a unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei. And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.

How the most interesting IT debate is revealing our values as software developers

TDD is dead. Is TDD dead? A question that seems to divide our profession. What does this debate have to do with you?

TDD is dead. Is TDD dead? A question that seems to divide our profession.
On the one side: developers who write their tests first and let them drive their code. They prefer the mockist approach to testing. Code should be tested in isolation, under lab-like circumstances. Clean Code is their book. Practices and principles guide their thinking. An application should not be bound to frameworks and should have a hexagonal architecture. The GOOS book showed how it can be done.
On the other side: developers who focus on readability and clarity. They use their experience and gut to drive their decisions. Because of past experiences they test their code the classical way. They are pragmatic. Practices and principles are used when they improve the understanding of the code. Code is there to be refactored. Just like a gardener trims bushes and a writer edits his prose, they work with their code.

What are your values?

What does this debate have to do with you?

Ask yourself:
What if you could write a proof of your program at 10 or even just 5 times the cost of the implementation? It would prove that your code works correctly under all possible circumstances. Would you do it?

Or would you rather improve the existing architecture, design or clarity of your code? So that you remove technical debt and are better positioned for future changes.

Or would you write new features and improve your application for the people using it?

What are your values?

History

At the beginning of my developer life in the late 80s/early 90s, I remember the industry being focused on one goal: code reuse. Modules, components, libraries and frameworks were introduced. Then patterns came. All of that was working towards one side of the equation: low coupling.
High cohesion was neglected in pursuit of a noble goal. But what happened? The imbalance produced layer after layer, indirection after indirection, over-separation and over-abstraction. You had to deal with dependency injection (containers), configuration, class hierarchies, interfaces, event buses, callbacks, … just to understand a hello world.
Today we have more computing power and are solving more and more complex things. We think in higher abstractions. Many more people benefit from our skills and our work.
On the user-facing side, design focuses on simplicity and usability. Even complex relationships can be made understandable and manageable. A wise man once said: design is about intent.
The same holds for code: code is about intent. Intent should be the measure of the quality of our code. Not testability, not coupling: intent. If the code (and this includes the code comments) reveals its intent, you can fix bugs in it, improve it, change it, refactor it. Tests are then your safety net, ensuring you are not breaking your intent.
You might say: but this is what TDD is all about! But I think we got it all backwards. The code and its intention revealing nature is more important than the tests. The tests support. But tests should never replace or even harm the clarity of the code.
The quality of the code is important. But most important are the people using your application.
My goal is to delight the people who use my software, and my way there is writing intention-revealing software. I am not there yet and I am learning every day, but I take step after step.

What are your values?

The difference between Test First and Test Driven Development

If you tend to fall into the “one-two-everything”-trap while doing TDD, this blog post might give you some perspective on what really happens and how to avoid the trap by decomposing the problem.

The concept of Test First (“TF”: write a failing test first and make it green by writing exactly enough production code to do so) was always very appealing to me. But I’ve never experienced the guiding effect that is described for Test Driven Development (“TDD”: apply a Test First approach to a problem using baby steps, letting the tests drive the production code). This led to quite some frustration and scepticism on my side. After a lot of attempts and training sessions with experienced TDD practitioners, I concluded that while I grasped Test First and could apply it to everyday tasks, I wouldn’t be able to incorporate TDD into my process toolbox. My biggest grievance was that I couldn’t even tell why TDD failed for me.

The bad news is that TDD still lies outside my normal toolbox. The good news is that I can pinpoint a specific area where I need training in order to learn TDD properly. This blog post is the story of my revelation. I hope that you can gather some ideas for your own progress, provided you’re no TDD master yet, either.

A simple training session

In order to learn TDD, I always look for fitting problems to apply it to. While developing a repository difference tracker, the Diffibrillator, there was a neat little task to order the entries of several lists of commits into a single, chronologically ordered list. I delayed the implementation of the needed algorithm for a TDD session in a relaxed environment. My mind began to spawn background processes about possible solutions. When I finally had a chance to start my session, one solution had already crystallized in my imagination:

An elegant solution

Given several input lists of commits, now used as queues, and one result list that is initially empty, repeat the following step until no more commits are pending in any input queue: Compare the head commits of all input queues by their commit date and remove the oldest one, adding it to the result list.
I nick-named this approach the “PEZ algorithm” because each commit list acts like the old PEZ candy dispensers of my childhood, always giving out the topmost sherbet shred when asked for.

A Test First approach

Trying to break the problem down into baby-stepped unit tests, I fell into the “one-two-everything”-trap once again. See for yourself what tests I wrote:

@Test
public void emptyIteratorWhenNoBranchesGiven() throws Exception {
  Iterable<ProjectBranch> noBranches = new EmptyIterable<>();
  Iterable<Commit> commits = CombineCommits.from(noBranches).byCommitDate();
  assertThat(commits, is(emptyIterable()));
}

The first test only prepares the class’s interface, naming the methods and trying to establish a fluent coding style.

@Test
public void commitsOfBranchIfOnlyOneGiven() throws Exception {
  final Commit firstCommit = commitAt(10L);
  final Commit secondCommit = commitAt(20L);
  final ProjectBranch branch = branchWith(secondCommit, firstCommit);
  Iterable<Commit> commits = CombineCommits.from(branch).byCommitDate();
  assertThat(commits, contains(secondCommit, firstCommit));
}

The second test was the inevitable “simple and dumb” starting point for a journey led by the tests (hopefully). It didn’t lead to any meaningful production code. Obviously, a bigger test scenario was needed:

@Test
public void commitsOfSeveralBranchesInChronologicalOrder() throws Exception {
  final Commit commitA_1 = commitAt(10L);
  final Commit commitB_2 = commitAt(20L);
  final Commit commitA_3 = commitAt(30L);
  final Commit commitA_4 = commitAt(40L);
  final Commit commitB_5 = commitAt(50L);
  final Commit commitA_6 = commitAt(60L);
  final ProjectBranch branchA = branchWith(commitA_6, commitA_4, commitA_3, commitA_1);
  final ProjectBranch branchB = branchWith(commitB_5, commitB_2);
  Iterable<Commit> commits = CombineCommits.from(branchA, branchB).byCommitDate();
  assertThat(commits, contains(commitA_6, commitB_5, commitA_4, commitA_3, commitB_2, commitA_1));
}

Now we are talking! If you give the CombineCommits class two branches with intertwined commit dates, the result will be a chronologically ordered collection. The only problem with this test? It needed the complete 100 lines of algorithm code to be green again. There it is: the “one-two-everything”-trap. The first two tests are merely finger exercises that don’t assert very much. Usually the third test is the last one to be written for a long time, because it requires a lot of work on the production side of the code. After this test, the implementation is mostly complete, with 130 lines of production code and a line coverage of nearly 98%. There wasn’t much guidance from the tests; it was more of a “holding back until a test allows for the whole thing to be written”. Emotionally, the tests only hindered me from jotting down the algorithm I had already envisioned, and when I finally got permission to “show off”, I dived into the production code and only returned when the whole thing was finished. A lot of ego went into those lines, but I didn’t realize it right away.

But wait, there is a detail left out from the test above that needs to be explicitly specified: if two commits happen at the same time, there should be a defined behaviour for the combiner. I declare that the order of the input queues is used as a secondary ordering criterion:

@Test
public void decidesForFirstBranchIfCommitsAtSameDate() throws Exception {
  final Commit commitA_1 = commitAt(10L);
  final Commit commitB_2 = commitAt(10L);
  final Commit commitA_3 = commitAt(20L);
  final ProjectBranch branchA = branchWith(commitA_3, commitA_1);
  final ProjectBranch branchB = branchWith(commitB_2);
  Iterable<Commit> commits = CombineCommits.from(branchA, branchB).byCommitDate();
  assertThat(commits, contains(commitA_3, commitA_1, commitB_2));
}

This test didn’t improve the line coverage and was green right from the start, because the implementation already acted as required. There was no guidance in this test, only assurance.

And that was my session: the four unit tests cover the anticipated algorithm completely, but they didn’t provide any guidance that I could grasp. I was very disappointed, because the “one-two-everything”-trap is a well-known anti-pattern in my TDD experience, and I still fell right into it.

A second approach using TDD

I decided to remove my code again and pair with my co-worker Jens, who formulated a theory about finding the next test by only changing one facet of the problem for each new test. Sounds interesting? It is! Let’s see where it got us:

@Test
public void noBranchesResultsInEmptyTrail() throws Exception {
  CommitCombiner combiner = new CommitCombiner();
  Iterable<Commit> trail = combiner.getTrail();
  assertThat(trail, is(emptyIterable()));
}

The first test is no big surprise; it only sets “the mood”. Notice how we decided to keep the CommitCombiner class simple and plain in its interface as long as the tests don’t get cumbersome.
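
A sketch of the simplest production code that could make this first test pass (not necessarily the exact code from our session):

public class CommitCombiner {
  public Iterable<Commit> getTrail() {
    return new ArrayList<>();   // no branches yet, so the trail stays empty
  }
}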

@Test
public void emptyBranchesResultInEmptyTrail() throws Exception {
  ProjectBranch branchA = branchFor();
  CommitCombiner combiner = new CommitCombiner(branchA);
  assertThat(combiner.getTrail(), is(emptyIterable()));
}

The second test asserts only one thing more than the initial test: If the combiner is given empty commit queues (“branches”) instead of none like in the first test, it still returns an empty result collection (the commit “trail”).

With the single-facet approach, we can only change our tested scenario in one “domain dimension” and only the smallest possible amount of it. So we formulate a test that still uses one branch only, but with one commit in it:

@Test
public void branchWithCommitResultsInEqualTrail() throws Exception {
  Commit commitA1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  CommitCombiner combiner = new CommitCombiner(branchA);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1));
}

With this test, there was the first meaningful appearance of production code. We kept it very simple and trusted our future tests to guide the way to a more complex version.
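
At this point the production code might look something like the following deliberately naive sketch: remember the branches and, for now, hand out the commits of the first one unchanged.

public class CommitCombiner {
  private final ProjectBranch[] branches;

  public CommitCombiner(ProjectBranch... branches) {
    this.branches = branches;
  }

  public Iterable<Commit> getTrail() {
    List<Commit> trail = new ArrayList<>();
    if (this.branches.length > 0) {
      CollectionUtil.addAll(trail, this.branches[0].commits());
    }
    return trail;
  }
}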

The next test introduces the central piece of domain knowledge to the production code, just by changing the number of commits on the only given branch from “one” to “many” (three):

@Test
public void branchWithCommitsAreReturnedInOrder() throws Exception {
  Commit commitA1 = commitAt(10L);
  Commit commitA2 = commitAt(20L);
  Commit commitA3 = commitAt(30L);
  ProjectBranch branchA = branchFor(commitA3, commitA2, commitA1);
  CommitCombiner combiner = new CommitCombiner(branchA);
  assertThat(combiner.getTrail(), Matchers.contains(commitA3, commitA2, commitA1));
}

Notice how this requires the production code to come up with the notion of comparable commit dates that needs to be ordered. We haven’t even introduced a second branch into the scenario yet but are already asserting that the topmost mission critical functionality works: commit ordering.

Now we need to advance to another requirement: The ability to combine branches. But whatever we develop in the future, it can never break the most important aspect of our implementation.

@Test
public void twoBranchesWithOnlyOneCommit() throws Exception {
  Commit commitA1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor();
  CommitCombiner combiner = new CommitCombiner(branchA, branchB);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1));
}

You might say that we knew about this behaviour of the production code before, when we added the test named “branchWithCommitResultsInEqualTrail”, but it really is the assurance that things don’t change just because the number of branches changes.

Our production code did not yet need to advance as far as we could already anticipate it eventually must, so another test dealing with multiple branches is needed:

@Test
public void allBranchesAreUsed() throws Exception {
  Commit commitA1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor();
  CommitCombiner combiner = new CommitCombiner(branchB, branchA);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1));
}

Note that the only thing that’s different is the order in which the branches are given to the CommitCombiner. With this simple test, there need to be some important improvements in the production code. Try it for yourself to see the effect!

Finally, it is time to formulate a test that brings the two facets of our algorithm together. We tested the facets separately for so long now that this test feels like the first “real” test, asserting a “real” use case:

@Test
public void twoBranchesWithOneCommitEach() throws Exception {
  Commit commitA1 = commitAt(10L);
  Commit commitB1 = commitAt(20L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor(commitB1);
  CommitCombiner combiner = new CommitCombiner(branchA, branchB);
  assertThat(combiner.getTrail(), Matchers.contains(commitB1, commitA1));
}

If you compare this “full” test case to the third test case in my first approach, you’ll see that it lacks all the mingled complexity of the first try. The test can be clear and concise in its scenario because it can rely on the assurances of the previous tests. The third test in the first approach couldn’t rely on any meaningful single-faceted “support” test. That’s the main difference! This was my error in the first approach: trying to cram more than one new facet into the next test, even putting all required facets in there at once. No wonder the production code needed “everything” when the test required it. No wonder there was no guidance from the tests when I wanted to reach all my goals at once. Decomposing the problem at hand into independent “features” or facets is the most essential step to learn in order to advance from Test First to Test Driven Development. Finding a suitable “dramatic composition” for the tests is another important ability, but it can only be applied after the decomposition is done.

But wait, there is a fourth test in my first approach whose scenario needs to be covered here, too:

@Test
public void twoBranchesWithCommitsAtSameTime() throws Exception {
  Commit commitA1 = commitAt(10L);
  Commit commitB1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor(commitB1);
  CommitCombiner combiner = new CommitCombiner(branchA, branchB);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1, commitB1));
}

Thankfully, the implementation already provided this feature. We are done! And at this moment, my ego showed up again: “That implementation is an insult to my developer honour!” I shouted. Keep in mind that I had just thrown away a beautiful 130-line piece of algorithm for this alternate implementation:

public class CommitCombiner {
  private final ProjectBranch[] branches;

  public CommitCombiner(ProjectBranch... branches) {
    this.branches = branches;
  }

  public Iterable<Commit> getTrail() {
    final List<Commit> result = new ArrayList<>();
    for (ProjectBranch each : this.branches) {
      CollectionUtil.addAll(result, each.commits());
    }
    return sortedWithBranchOrderPreserved(result);
  }

  private Iterable<Commit> sortedWithBranchOrderPreserved(List<Commit> result) {
    Collections.sort(result, antichronologically());
    return result;
  }

  private <D extends Dated> Comparator<D> antichronologically() {
    return new Comparator<D>() {
      @Override
      public int compare(D o1, D o2) {
        return o2.getDate().compareTo(o1.getDate());
      }
    };
  }
}

The final and complete second implementation, guided by the tests, is merely six lines of active code plus some boilerplate! Well, what did I expect? TDD doesn’t lead to particularly elegant solutions; it leads to the simplest thing that could possibly work and assures you that it will work within the realm of your specification. There’s no place for the programmer’s ego between these lines, and that’s a good thing.

Conclusion

Thank you for reading this far! I learnt an important lesson that day (thank you, Jens!). And being able to pinpoint the main hindrance on my way to fully embracing TDD enabled me to further improve my skills even on my own. It felt like opening an ever-closed door for the first time. I hope you’ve extracted some insights from this write-up, too. Feel free to share them!

Does refactoring turn the unit tests of TDD into integration tests?

We really value automated tests and do experiments regarding test driven development (TDD) and tests in general from time to time. In the retrospective of our latest experiment this question struck me: does refactoring turn the unit tests of TDD into integration tests over time?

Let me elaborate on this a bit. When you start out with your tests, you have some unit of functionality – usually a class – in mind. As you add test after test, your implementation slowly fleshes out. You repeat the TDD cycle “write a failing test – make the test pass – refactor” as you add features. The refactoring step is crucial in the whole process because it keeps the code clean and evolvable. But this step is also the cause of my observation: as you add new features, you may extract new classes while refactoring, to obey the single responsibility principle (SRP) and keep your design sane. It is very easy to forget or just ignore refactoring the tests. They still pass. You still have the same code coverage. But your tests now test the combination of several units. And what’s worse: you have units without direct tests.
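
A tiny illustration of the effect (all class names here are hypothetical):

// Before refactoring: one class, one unit test (InvoiceCalculatorTest).
class InvoiceCalculator {
  long totalCents(Iterable<LineItem> items) {
    long sum = 0;
    for (LineItem item : items) {
      sum += item.priceCents() * item.quantity();
    }
    return sum > 100_00 ? sum * 95 / 100 : sum;   // volume discount
  }
}

// After refactoring, the discount rule is extracted into its own class ...
class DiscountPolicy {
  long apply(long sumCents) {
    return sumCents > 100_00 ? sumCents * 95 / 100 : sumCents;
  }
}
// ... and InvoiceCalculator delegates to it. InvoiceCalculatorTest still passes
// unchanged, but it now exercises InvoiceCalculator and DiscountPolicy together,
// and DiscountPolicy has no direct test of its own.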

This happened even in relatively small experiments on “communication through tests”, where the reconstructing team could sometimes only guess that some class existed and either went on without it or created the class out of necessity. The problem is that there are no obvious and clear indicators that your unit tests are not real unit tests anymore.

Conclusion

I neither have a solution nor am I completely sure how big the problem is in practice. It may help to state the TDD cycle more explicitly, like “write a failing test – make the test pass – refactor implementation and tests”, although that is no 100% remedy. One could implement a simple, checkstyle-like tool which lists all units without an associated test class – a rough sketch follows below. I will keep an eye on the phenomenon and try to analyse it further. I would love to hear your views and experiences on the matter.
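
Such a check could be as simple as comparing file names across the source trees (assuming a Maven-style layout and the *Test naming convention – both assumptions of mine):

import java.io.IOException;
import java.nio.file.*;
import java.util.stream.Stream;

public class UntestedUnitsReport {
  public static void main(String[] args) throws IOException {
    Path main = Paths.get("src/main/java");
    Path test = Paths.get("src/test/java");
    try (Stream<Path> sources = Files.walk(main)) {
      sources.filter(path -> path.toString().endsWith(".java"))
             .map(main::relativize)
             .filter(relative -> !Files.exists(
                 test.resolve(relative.toString().replace(".java", "Test.java"))))
             .forEach(relative -> System.out.println("No test class for " + relative));
    }
  }
}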

TDD: avoid getting stuck or what’s the next test?

One central point of practicing TDD is to determine what is the next test. Choosing the wrong path can lead you into the infamous impasse

One central point of practicing TDD is to determine what the next test is. Choosing the wrong path can lead you into the infamous impasse: to make the next test pass, you need to take not baby steps but giant ones. Some time ago Uncle Bob introduced a principle called the transformation priority premise. To make a test pass, you need to change the implementation. These changes are transformations. There are at least the following transformations (taken from his blog post):

  • ({}->nil) no code at all -> code that employs nil
  • (nil->constant)
  • (constant->constant+) a simple constant to a more complex constant
  • (constant->scalar) replacing a constant with a variable or an argument
  • (statement->statements) adding more unconditional statements.
  • (unconditional->if) splitting the execution path
  • (scalar->array)
  • (array->container)
  • (statement->recursion)
  • (if->while)
  • (expression->function) replacing an expression with a function or algorithm
  • (variable->assignment) replacing the value of a variable.

To determine what the next test should be, you look at the possible next tests and the changes in the implementation necessary to make each of them pass. The required transformations should be as high up the list as possible. If you always choose the test whose transformations are highest in the list, you avoid getting stuck – the impasse.
This seems to work but I think this is pretty complicated and expensive. Shouldn’t there be an easier way?
Let’s take a look at his case study: the word wrap kata. Word wrap is a function which takes two parameters: a string, and a column number. It returns the string, but with line breaks inserted at just the right places to make sure that no line is longer than the column number. You try to break lines at word boundaries.
The first three tests (nil, empty string and one word which is shorter than the wrap position) are obvious and easy but the next test can lead to an impasse:

@Test
public void twoWordsLongerThanLimitShouldWrap() throws Exception {
  assertThat(wrap("word word", 6), is("word\nword"));
}

With the transformation priority premise you can “calculate” that this is the wrong test and that another one is simpler, meaning it needs transformations that are higher up the list. But let me introduce another concept: the facets or dimensions of tests.
Each test in a TDD session tests another facet of your problem. And only one more. What a facet is, is determined by the problem domain. So you need some domain knowledge, but you usually need that to solve the problem anyway. Back to the word wrap example: what is a facet? The first test tests the nil input; it changes one facet. The empty input test changes another facet. Then comes one word shorter than the wrap position (one facet changed again), and the fourth test uses two words longer than the wrap position. See it? The fourth test introduces changes in two facets: one word to two words, and shorter to longer. So what can you do instead? Just change one facet. According to this, the next test would be to use one word longer than the wrap position (facet: longer), which is the proposed solution – a possible version of that test is sketched below. Or you could use two words shorter than the wrap position (facet: word count), but this test would just pass without modifications to the implementation code. So facets of the word wrap kata could be: word count, shorter/longer than the limit, number of breaks, break position.
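
A possible version of that single-facet next test, in the style of the test above (the expected break position is my assumption, not taken from the original kata write-up):

@Test
public void oneWordLongerThanLimitShouldBeBroken() throws Exception {
  assertThat(wrap("longword", 4), is("long\nword"));
}
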
I know this is a very informal way of finding the next tests. It leans on your experience and domain knowledge. But I think it is less expensive than the transformations. And even better it can be combined with the transformation priority premise to check and verify your decisions.
What are your experiences with getting stuck in TDD? Do you think the proposed facets of TDD could be of help? Is it too informal? Too vague?

TDD myths: the problems

I take a look at some (in my experience) problems/misconceptions with TDD:
100% code coverage is enough, Debugging is not needed, Design for testability, You are faster than without tests

100% code coverage is enough

Code coverage seems to be a bad indicator for the quality of the tests. Take the following code as an example:

public void testEmptySum() {
  assertEquals(0, sum());
}

public void testSumOfMultipleNumbers() {
  assertEquals(5, sum(2, 3));
}

Now take a look at the implementation:

public int sum(int...numbers) {
  if (numbers.length == 0) {
    return 0;
  }
  return 5;
}

Baby steps in TDD could lead you to this implementation. It has 100% code coverage and all tests are green. But the implementation isn’t finished at all. Our experiment where we investigated how much tests communicate the intent of the code showed flaws in metrics like code coverage.
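
One additional test immediately exposes the gap that the coverage number hides (a sketch):

public void testSumOfOtherNumbers() {
  assertEquals(7, sum(3, 4));   // fails: the implementation above always returns 5
}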

Debugging is not needed

One promise of TDD, or of tests in general, is that you can neglect debugging – even abandon it. In my experience, when a test goes red (especially an integration test), you sometimes need to fire up the debugger. The debugger helps you step through code and see the actual state of the system at that time. Tests treat code as a black box: an input results in an output. But what happens in between? How much do you want to couple your tests to your actual implementation steps? Do we need tests to cover this aspect of software development? Maybe something along the lines of what is shown in Inventing on Principle, where the computer shows you the intermediate steps your code takes, could replace debugging – but tests alone cannot do it.

Design for testability

A noble goal. But are tests your primary client? No. Other code is. Design for maintainability would be better. You will need to change your code, fix it, introduce new features, etc. Don’t get me wrong: You need tests and you need testability. But how much code do you write specifically for your tests? How much flexibility do you introduce because of your tests? What patterns do you use just because your tests need them? It’s like YAGNI for code exposure for tests. Code specifically written only for tests couples your code to your tests. Only things that need to be coupled should be. Is the choice of the underlying data structure important? Couple it, test it. If it isn’t, don’t expose it, don’t write a getter. Don’t break the information hiding principle if you don’t need to. If you couple your tests too much to your code every little change breaks your tests. This hinders maintenance. The important and difficult design question is: what is important. Test this.

You are faster than without tests

Some TDD practitioners claim that they are faster with TDD than without tests, because the bugs and problems in your code will overwhelm you after a certain time. So above a certain level of complexity you go faster with TDD. But where is this level? In my experience, writing code without tests is 3x-4x faster than with TDD – for small applications. There are entire communities where many applications are written without or with only a few tests. I wouldn’t write a large application without tests, but at least my feeling is that in many cases I go much slower. Cases where I feel faster are specification-heavy: parsing or writing formats, designing an algorithm or implementing a scientific formula. So the call is open on this one. What are your experiences? Do you feel slowed down by TDD?

TDD myths

Some TDD myths and what is true about them.

TDD, Test first or test immediately after are all the same

No. All methods result in having tests in the end. But especially in the TDD case your mindset is completely different. First, the tests drive the design of your code. You construct your system piece by piece, unit test by unit test. All code you write must have a test first, and you use the tests to describe and reason about the external interface of your units. In TDD the tests represent the future clients using your code. In practice this leads to small(er) units.

In TDD the tests’ (main) objective is to prevent regression

No. Tests help immensely when you break the same code twice. But even more so, tests help to structure your code and make it maintainable. When using TDD you tend to reduce your code and its flexibility, because you need to write a test for every piece of functionality first. So over-designing or implementing things you don’t need (breaking YAGNI or KISS) bites you doubly: in the code and in the tests. Also, wrong design decisions like choosing an inappropriate data structure or representation hit you twice as hard. TDD emphasizes bad design decisions.

Test code is the same as production code

No. Test code should adhere to a similar quality level as production code. But you won’t write tests for your tests. Also, conditionals and loops are a very bad idea in tests and should be avoided. Take the following example:

public void testSomething() {
  for (MyEnum value : MyEnum.values()) {
    assertEquals(expected, doSomething(value));
  }
}

If you forgot an enum value, the test just passes. Even if your enum has no values at all, it still passes. Conditionals have the same problem: you introduce another path through your test which can be avoided or never taken. You could secure the other path with an assert, but in some cases this is a hint that you broke another principle: the single responsibility of tests.
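
One way around the loop is to spell the cases out (or use a parameterized test), so that a forgotten or empty enum cannot pass silently – a sketch with placeholder expected values:

public void testSomethingForA() {
  assertEquals(expectedForA, doSomething(MyEnum.A));
}

public void testSomethingForB() {
  assertEquals(expectedForB, doSomething(MyEnum.B));
}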

DRY is harmful in tests

No. DRY (don’t repeat yourself) aims to reduce or eliminate duplication in logic. But often DRY is understood as removing code duplication. This is not the same! Code duplication can be essential in tests. You need all of the essential information in the test. This code should not be extracted or abstracted elsewhere. These code lines which may seem similar are not coupled logically. When you change one test, the other test is not affected.

TDD is hard

No and yes. For me, learning TDD is like learning a new language. It certainly needs time. But if you do it often and repeatedly, you learn more every time you use it. It’s a way of reasoning about a system, a way of thinking, a paradigm. When I started with TDD I thought it was impossible or unreasonable to use in cases other than those where strong specs exist, like parsing a format. But over time I value the driving part of TDD more and more. You can get into a TDD flow. TDD gives you a very good feeling of security when you refactor. It forces you to think beforehand about the intended use of your code. Which is good. It changes my way of seeing my code, one step at a time. Some things are still hard: acceptance tests are unreasonably expensive. Just testing one thing needs discipline. So does not jumping ahead of the tests and implementing too much code. Finding the next unit of testing can be difficult; getting stuck can be frustrating. Just like learning a new language, I think it is worth it.

Thoughts about TDD

Thoughts and links about test driven development

First a disclaimer: I think tests are a hallmark of professional software development. I like to write tests before the implementation, but that’s not always easy or simple (for the difference, please refer to Simple Made Easy). I find it hard to grasp test driven development (TDD), though. The difference between test first and test driven lies in the intention: in both cases tests are written before any implementation code, but in TDD the tests drive the design of your implementation.

The problem with opinions on TDD is that there are mostly extreme positions: some think “TDD is the (next) holy grail”, others dismiss it altogether. Though, reading between the lines, there are great discussions about how to do it and what problems arise. Many people (me included) are really trying to get value from TDD. Testing should be fun.
One way of letting the tests drive the way you develop is proposed by Uncle Bob: the transformation priority premise. He proposes a list of transformations which introduce new constructs or replace existing ones – like replacing a constant with a variable or adding more logic – and gives them a priority. Only if you cannot use a high-priority transformation to get the test to pass do you look at a transformation with a lower priority.
But how do you determine what you should test next or even which is the first test?
Taking the typical Conway’s Game of Life kata as an example, one thing struck me: I could only get TDD to work smoothly when I started with the data structure. But why is that? Naturally I would start with the algorithm (in this case the rules) and write the first test for it. But upon further inspection of the problem, and with deeper (domain) knowledge, it seems the data structure is way more important for solving this kata. So you need to know beforehand where the journey goes – not every step you will take, but the big picture: first the data structure, then the rules, in this example. Maybe you should start with the integration or functional tests and break them down into units.
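
A first data-structure test in that spirit might look like this (Grid and its methods are hypothetical names, not from a concrete kata session):

@Test
public void newGridHasOnlyDeadCells() throws Exception {
  Grid grid = new Grid(3, 3);
  assertThat(grid.isAlive(1, 1), is(false));
}
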
What are your experiences using TDD? Do you use or want to use TDD?

Summary of the Schneide Dev Brunch at 2012-05-27

If you couldn’t attend the Schneide Dev Brunch in May 2012, here are the main topics we discussed neatly summarized.

Yesterday, we held another Schneide Dev Brunch on our roof garden. The Dev Brunch is a regular brunch on a Sunday, only that all attendees want to talk about software development and various other topics. If you bring a software-related topic along with your food, everyone has something to share.

We had to do another introductory round because there were new participants with new and very interesting topics. This brunch was very well attended and rich in information. Let’s have a look at the main topics we discussed:

Agile wording (especially SCRUM)

This was just a quick overview of the common agile vocabulary and what ordinary people associate with the terms. A few examples are “scrum”, “sprint” and “master”. We agreed that some terms are misleading without deeper knowledge of the agile context.

Book: “Please Understand Me”

If you are interested in the Myers-Briggs classification of personality types (keywords: are you INTJ, ESTP or INFP?), this is the book to get. It uses a variation of the personality test to classify and explain yourself, your motives and personal traits. And if you happen to know the personality type of somebody else, it might open your eyes to the miscommunication that will likely occur sooner or later. Just don’t go overboard with it; it’s just a model of the most apparent personality characteristics. The German translation of the book is called “Versteh mich bitte” and has some flaws in the form of typing and layout errors. If you can overlook them, it might be the missing piece of insight (or empathy) you need to get through to somebody you know.

TV series: “Dollhouse”

As most of us are science fiction buffs and hold a special place in our hearts for the series “Firefly”, the TV series “Dollhouse” by Joss Whedon should be a no-brainer to be interested in. This time it lasted two seasons, and it brings up numerous important questions about programmability that every software developer should have a personal answer for. Just a recommendation if you want to adopt another series with a limited episode count.

Wolfpack Programming

A new concept of collaborative programming is “wolfpack programming” (refer to pages 21-26). It depends on a shared (web-based) editor that several developers use at once to develop code for the same tasks. The idea is that the team organizes itself like a pack of wolves hunting deer. Some alpha wolves lead groups of developers to a specific task and the hunt begins. Some wolves/developers are running/programming while the others supervise the situation and get involved when convenient. The whole code is “huntable”, so it sounds like a very chaotic experience. There are some tools and reports of experiments with wolfpack programming in Smalltalk. An interesting idea and maybe the next step beyond pair programming. Some more information about the editor can be found on their homepage and in this paper.

Book: “Durchstarten mit Scala”

Sorry for the German title, but the book in this review is a German introductory book about Scala. It’s not very big (around 200 pages) but covers a lot of topics in short, with a list of links and reading recommendations for deeper coverage. If you are a German developer and used to a modern object-oriented language, this book will keep its promise to kickstart you with Scala. Everything can be read and understood easily, with only a few topics that are more challenging than the pages devoted to them in the book. The topics range from build to test and other additional frameworks and tools, not just core Scala. This book got a recommendation for being concise, profound and understandable (as long as you can understand German).

Free Worktime Rule

This was a short report about employers that pay their developers a fixed salary, but don’t define the workload that should happen in return. Neither the work time nor the work content is specified or bounded. While this sounds great at first (two hours of work a week with full pay, anybody?), we came to the conclusion that peer pressure and intrinsic motivation will likely create a dangerous environment for eager developers. Most of us developers really want to work and need boundaries to not burn out in a short time. But an interesting thought nevertheless.

Experimental Eclipse Plugin: “Code_Readability”

This was the highlight of the Dev Brunch. One attendee presented his (early-stage) plugin for Eclipse to reformat source code in a naturally readable manner. The effect is intriguing and very promising. We voted vehemently for early publication of the source code on GitHub (or whatever hosting platform seems suitable). If the plugin becomes available, we will provide you with a link. The plugin has its roots in the “Three refactorings to grace” article from the last Dev Brunch.

Light Table IDE

A short description of the new IDE concept named “Light Table”. While the idea itself isn’t new at all, the implementation is very inspirational. In short, Light Table lets you write code and evaluates it on the fly, creating a full feedback loop in milliseconds. The effects on your programming habits are… well, see and try it for yourself; it’s definitely worth a look.

Inventing on Principles

Light Table and other cool projects are closely linked to Bret Victor, the speaker in the mind-blowing talk “Inventing on Principles”. While the talk is nearly an hour of playtime, you won’t regret listening. The first half of the talk is devoted to several demo projects Bret made to illustrate his way of solving problems and building things. They are worth a talk alone. But in the second half of the talk, Bret explains the philosophy behind his motivation and approach. He provides several examples of people who had a mission and kept implementing it. This is very valuable and inspiring stuff, it kept most of us on the edge of our seats in awe. Don’t miss this talk!

Albatros book page reminder (and Leselotte)

If you haven’t upgraded your reading experience to e-book readers yet, you might want to look at these little feature upgrades for conventional books. The Albatros bookmark is a page-remembering indexer that updates itself without your intervention. We could test it on a book and it works. You might want to consider it especially for your travelling literature. This brought us to another feature that classic dead-tree books are lacking: self-sustained positioning. And there’s a solution for that, too: the “Leselotte” is a German implementation of the bean bag concept for a flexible book stand. It got a recommendation from an attendee, too.

Bullshit-Meter

If you ever wondered what you just read: it might have been bullshit. To test a text for its content of empty phrases, filler and hot air, you can use the blabla-meter for German or English text. Just don’t make the mistake of examining the last apidoc comments you hopefully have written. It might crush what little motivation you have left to write another one.

Review on Soplets

In one of the last talks at the Java User Group Karlsruhe, there was a presentation of “Soplets”, a new concept for programming in Java. One of our attendees summarized the talk and the concept for us. You might want to check out Soplets on your own, but we weren’t convinced by the approach. There are many practical problems with the solution that aren’t addressed yet.

Review on TDD code camp

One of our attendees led a code camp with students, targeting Test Driven Development as the basic ruleset for programming. The camp rules closely resembled the rules of Code Retreats by Corey Haines and had Conway’s Game of Life as the programming task, too. With only rudimentary knowledge of TDD and Test First, the students needed only four iterations to come up with really surprising and successful approaches. It was a great experience, but it showed clearly how traditional approaches like “structured object-oriented analysis” stand in the way of TDD. Example: before any test was written to help guide the way, most students decided on the complete type structure of the software and didn’t deviate from this decision even when the tests told them to.

Report of Grails meetup

Earlier last week, the first informal Grails User Group Karlsruhe meeting was held. It started on a hot late evening some distance out of town in a nice restaurant. The founding members got to know each other and exchanged basic information about their settings. The next meeting is planned with presentations. We are looking forward to what this promising user group will become.

Epilogue

This Dev Brunch brought a lot of fun, new information and inspiration. As always, it had a lot more content than listed here; this summary is just a best-of. We are looking forward to the next Dev Brunch at the Softwareschneiderei. And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.