Should I test this?

Writing software is hard, and writing correct software is even harder. So everything that helps you write better or more correct software should be used to your advantage. But does every test help? And does every piece of code need to be tested automatically? How do I decide what to test, and how?

Given a typical web CRUD application, take a look at the following piece of functionality:
We have a model class Element which has a Type type:

class Element {
  ...
  Type type
  ...
}

The view contains a select tag which lets you choose a type:

...
<g:select name="filterByTypeId" from="${types}" value="${filterByType?.id}">
...

And finally in the controller we filter the list of shown elements via the selected type:

...
Type filterByType = Type.get(params['filterByTypeId'])
return [elements: filterByType ? Element.findAllByType(filterByType) : Element.list(), types: Type.list(), filterByType: filterByType]
...

Now ask yourself: would you write an automated test for this? A functional / acceptance test or some unit / integration tests? Would you really test this automatically or just by hand? And how do you decide?

Dogma

According to TDD you should test everything; no code exists without a test (written first). If you really live by TDD the choice is already made: you test this code. But is this pragmatic? Efficient? Productive? And what about the aspects you forgot to test? The order of the types, for example: the user wanted them listed lexicographically, by priority or numbered. And what if this part changes and your test is so tightly coupled that you need to change it, too? There are some TDD enthusiasts out there, but if you are more pragmatic there are other criteria to help you decide.

Cost

Look at the code in question and ask: how much effort is it to create the test(s)? And to run them? If the feedback cycle is too long, you lose track of it. I need a test for the controller; this is the easy part. Then I need to test that the view passes the correct parameter and accepts and shows the correct list.
I could also write an acceptance test, but this seems like a big gun for a small bird. In our case, how easy, difficult or costly it is to write tests for our filter depends heavily on the framework. What do you have to mock or simulate? You also have to take the hidden costs into account: how much does it cost to maintain this test? When the requirement changes? When there are more filter criteria? Or when an element can have more than one type?

Value

Another question you can ask is: what is the value for the customer? How much does he need it to work? What is the cost of an error? What happens when the code in question does not work? The value for the customer is not only determined by the functionality it provides. Software can be seen as giving your users capabilities, enabling them. A capability is implemented by two things: implementation (your functionality) and affordance (the UI). The value is determined by both parts, so you can hardly judge the value of the functionality alone. What if you need to change the UI (in our case the select tag) to increase the value? How does this affect your tests? Does the user reach his goal if the functionality part is broken? What if the code is correct but slow? Or the UI isn’t visible on your user’s screen?

Personal / Team profile

You could decide whether and what to test by looking at your past: your personal or team mistakes, typical problems and bugs you have produced, habits you have. You could test more when the (business or technical) domain or the underlying technology is new to you. You could write only a few tests when you know the area you work in, but more when it is unknown and you need to explore it. You could write more tests if you work in a dynamic language and fewer in a static one. Or vice versa.

Area / Type of code

You can write tests for every bug you find to prevent regressions. You could write tests only for algorithms or data structures, for certain core parts, or for interaction with other systems. Or only for (public) interfaces. The area or type of code can help you decide whether to test or not.

Visibility

You could also look at how easy it is to spot a bug when manually invoking the code. Do you or your user see the bug immediately? Is it hidden? In our case you should easily see when the list is not filtered or filtered by the wrong criteria. But what if it is just a rounding error, or an error where cause and effect are separated by time or location?

Conclusion

Do you have or use additional criteria? How do you decide? I have to admit that I didn’t and wouldn’t test the above code, because I can easily spot problems in it and try out by hand whether it works (visibility). If the code grows more complex and I cannot easily see the problem (again: visibility), or the value (or cost of an error) for the customer is high, I would write one.

How to avoid premature optimization

Three simple rules to develop by if you really want to avoid falling into the trap of premature performance optimization.

A common quote attributed to Donald E. Knuth of TeX fame is “premature optimization is the root of all evil”. While this might sound a bit harsh, it holds a lot of truth.

Performance as an asset

If you consider software performance as an asset, you can determine its characteristics and derive your decisions about whether to work on it from them. For example, you will discover that while good performance is paramount, there is a certain threshold beyond which further optimizations are worthless from the asset’s point of view. If you happen to develop a game, you only need to draw as many frames as the monitor can display. If you process sensor data in real time, there is no need for a prolonged pause between data packets, because computers don’t grow tired.
If you treat performance as an asset, you can also assign a worth to every optimization you want to make and contrast it with the cost of the work you expect to invest. This divides the possible optimizations into a group of lucrative and a (probably larger) group of unprofitable investments.

Simple rules

Treating performance as an asset gives you the mental tools to make profound decisions about when and what to optimize. But there are also three simple rules you can apply if you don’t want to write a business plan every time you think “if I just change this line, the code will run much smoother”.

First rule: Don’t

The first rule of performance optimization, with a tendency to avoid premature optimization, is to simply not care. You ask yourself if a LinkedList is faster than an ArrayList for a given use case? The short (and ignorant) answer is: both will be fast enough. Is it better to explicitly set all references to null after usage? Why bother when the garbage collector won’t slow you down anyway. Following this rule, you deliberately act dumber than you are, with the goal of delaying action.

There is a disclaimer, though. There are two different kinds of performance optimization: the first one was referenced in the examples above and deals with actual, but rather local code changes. The second and more important type of performance consideration deals with complexity theory (the one with big O notation) and isn’t measured in milliseconds, but in scalability. You don’t want to be ignorant of the latter type: if you implement an exponential or even factorial algorithm, it will ruin your runtime behaviour regardless of any optimization of the former type. You can be ignorant of “real performance tuning”, but you should always be aware of the complexity category your algorithm lives in.
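
To make the distinction concrete, here is a small Java sketch (my example, not from the original text): both methods find duplicates, and no amount of local tuning of the first one will beat the change of complexity class in the second one.

import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class ComplexityCategories {

    // O(n^2): seen.contains() scans the whole list for every element.
    public static List<Integer> duplicatesQuadratic(final List<Integer> input) {
        final List<Integer> seen = new ArrayList<Integer>();
        final List<Integer> duplicates = new ArrayList<Integer>();
        for (final Integer number : input) {
            if (seen.contains(number)) {
                duplicates.add(number);
            }
            seen.add(number);
        }
        return duplicates;
    }

    // O(n): a HashSet answers add/contains in amortized constant time.
    // This is a change of complexity category, not a micro-tuning.
    public static List<Integer> duplicatesLinear(final List<Integer> input) {
        final Set<Integer> seen = new HashSet<Integer>();
        final List<Integer> duplicates = new ArrayList<Integer>();
        for (final Integer number : input) {
            if (!seen.add(number)) {
                duplicates.add(number);
            }
        }
        return duplicates;
    }
}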

Second rule: Not Yet

There will be a moment when you clearly see an opportunity to improve the runtime performance of your code with just this very small (and very clever) modification. This is when you are ready to break the first rule. Now you should adhere to the second rule: if the cost is as marginal as you say and the gain is profound, go for it – but not now. Performance tuning isn’t a time-limited sale that you are only offered right now or never. You can make the same change and reap the same advantages next week or next month. You doubt that you will remember the details? Write an issue or insert a code comment about it. You probably have another task on your to-do list that is more important than speeding up the functionality at hand.

The goal of the first rule was to delay action, and that’s the goal of the second rule, too. You’ve probably guessed it already: you avoid premature optimization best by not optimizing at all, or at least not optimizing too early. You need to be sure about the value of an optimization before you implement it. As a result of the second rule, your code will be enriched with noted possibilities for performance improvement. And if you actually need to improve your performance, you can orient yourself along these possibilities or discover new ones then. You want to invest in the tuning business as late as possible, for it is highly speculative.

Third rule: Measure

If you cannot hold on to the first two rules, for example when a real performance issue is reported, you need to take action. But as you are going to invest work into performance optimization, you might as well invest it efficiently. In most applications, a 90/10 rule is in effect, stating that 90 percent of the runtime is spent in just 10 percent of the code. If you don’t know exactly where your performance bottleneck is, find it using a profiler and remember the 90/10 rule. It’s neither efficient nor effective to improve the 90 percent of your code that doesn’t matter in regard to performance.

If you have identified the piece of code that most likely slows your application down, you should remember the second part of the third rule: never make performance optimizations without a meaningful benchmark that you can run beforehand and afterwards. All too often, the clever performance trick you remember from long ago is actually hurting your performance now. A meaningful benchmark will tell you whether you actually improved things. To make a benchmark “meaningful”, you really need to read up on benchmarking on your target platform. In Java, for example, you need to know about proper warm-up of the VM and perform enough cycles to keep one-time effects out of your numbers. If you’ve written such a benchmark, keep it! Try to fully automate it and let it be the cornerstone of your growing performance test suite. There might come the day when this test/benchmark tells you that your formerly clever optimization is now obsolete due to internal platform changes.
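
On the JVM, a benchmark harness saves you from reinventing the warm-up and measurement logic. A minimal sketch using JMH (my tool choice for this example, the text doesn’t prescribe one) might look like this:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

// Minimal JMH sketch: the harness performs VM warm-up and measurement
// iterations for us, keeping one-time effects out of the numbers.
@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@Warmup(iterations = 5)
@Measurement(iterations = 10)
@Fork(1)
public class ConcatenationBenchmark {

    @Param({"10", "1000"})
    public int repetitions;

    // returning the result prevents dead-code elimination
    @Benchmark
    public String plusConcatenation() {
        String result = "";
        for (int i = 0; i < repetitions; i++) {
            result += i;
        }
        return result;
    }

    @Benchmark
    public String builderConcatenation() {
        final StringBuilder result = new StringBuilder();
        for (int i = 0; i < repetitions; i++) {
            result.append(i);
        }
        return result.toString();
    }
}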

Conclusion

If you follow these three simple rules, you won’t automatically write high performance software. But you will spend your valuable time fixing real performance issues instead of tinkering with your code to no effect. You definitely won’t optimize prematurely, and you will steer clear of this “root of all evil”.

Object Calisthenics: Change the way you think

Some time ago I spoke with my colleague about skill sharpening and training the brain to come up with new solutions. He proposed a two-hour session on the weekend implementing a small game using object calisthenics.

Rules

The rules are described in The ThoughtWorks Anthology book. Here is the list for quick reference.

  1. Use only one level of indentation per method.
  2. Don’t use the else keyword.
  3. Wrap all primitives and strings.
  4. Use only one dot per line.
  5. Don’t abbreviate.
  6. Keep all entities small.
  7. Don’t use any classes with more than two instance variables.
  8. Use first-class collections.
  9. Don’t use any getters/setters/properties.

Most of the rules seemed simple enough. Rules 2 and 5 are standard at the Softwareschneiderei; 1, 4, 6 and 8 are stricter versions of common sense; rule 3 means tedious object wrapping. The rules I was anxious about were 7 and 9. To increase the learning effect, I added an extra rule to the list that is critical in real life programming:

  10. Write tests for your code.

It doesn’t matter whether you write your tests first, after the code or even test driven. Only then does the code count as “value added”.

Experiences

The game was Minesweeper. It contains a nice mix of algorithms, data structures and UI. I concentrated my efforts on the algorithmic part. My first step was to analyse the problem and create the needed data structures.

  • The smallest unit is the cell.
  • A cell can be either hidden or revealed, have a mine or be empty.
  • The game field contains such cells in rows and columns.
  • The position of a cell in a field is defined by its coordinate that contains the x and y position.

To associate anything with coordinates, the coordinates had to be comparable to each other. Rule 9 forbids exposure of internal state, so the Coordinate class got its equals() and hashCode(). Only the creator of the coordinate had the knowledge about the number of dimensions and the values of the positions. Even the tests had no access to the inner state and tested only those two methods.
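
A sketch of what such a Coordinate might look like (the original class is not shown here, so the details are assumptions): the positions stay private, and only equals() and hashCode() are offered to the outside.

public final class Coordinate {

    // rule 9: no getters, the positions never leave the class
    private final int x;
    private final int y;

    public Coordinate(final int x, final int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public boolean equals(final Object other) {
        if (!(other instanceof Coordinate)) {
            return false;
        }
        final Coordinate coordinate = (Coordinate) other;
        return x == coordinate.x && y == coordinate.y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y;
    }
}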

Since the revealed flag concept and a mine flag concept had similar properties, I decided not to track cells but to track their flags. Through this architectural decision, I had a field with two flag containers, one for revealed cells and one for cells with mines. An additional benefit was that it was enough to put only the coordinate into the container to mark a cell as a mine.
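
One such flag container might look like the following sketch (the mark() name is taken from the reveal() code below, the rest is my assumption). It is also an example of rule 8, a first-class collection:

import java.util.HashSet;
import java.util.Set;

public class FlagContainer {

    // rule 8: the collection is wrapped in its own class
    private final Set<Coordinate> markedCells = new HashSet<Coordinate>();

    public void mark(final Coordinate coordinate) {
        markedCells.add(coordinate);
    }

    public boolean isMarkedAt(final Coordinate coordinate) {
        return markedCells.contains(coordinate);
    }
}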

The next step was to link the parts together and add some behaviour: setting a mine, revealing a cell and obtaining the number of mines. Setting a mine and marking the cell as revealed are simple tasks with the containers. Testing that the revealed cell contained the mine was trickier. To achieve that, the reveal method got an additional parameter, a closure with a hasMine parameter.

public void reveal(final Coordinate coordinate, final CellContainerVisitor revealedCellsVisitor) {
    revealedCells.mark(coordinate);
    visit(coordinate, revealedCellsVisitor);
}

private void visit(final Coordinate coordinate, final CellContainerVisitor revealedCellsVisitor) {
    revealedCellsVisitor.visit(coordinate, hasMineAt(coordinate));
}

@Test
public void containsMines() {
    final CellContainer target = new CellContainer();
    target.placeMineAt(someCoordinate());

    final List<Coordinate> mineCells = new ArrayList<Coordinate>();
    target.reveal(someCoordinate(), (coordinate, hasMine) -> {
        if (hasMine.equals(new HasMine(true))) {
           mineCells.add(coordinate);
        }
    });

    assertThat(mineCells, hasSize(1));
    assertThat(mineCells, contains(someCoordinate()));
}
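
The HasMine type in this test is rule 3 (wrap all primitives) at work. Its class isn’t shown here; a sketch that honors rule 9 by offering comparison via equals() instead of a getter could look like this:

public final class HasMine {

    // rule 3: the boolean is wrapped; rule 9: there is no getter for it
    private final boolean hasMine;

    public HasMine(final boolean hasMine) {
        this.hasMine = hasMine;
    }

    @Override
    public boolean equals(final Object other) {
        if (!(other instanceof HasMine)) {
            return false;
        }
        return hasMine == ((HasMine) other).hasMine;
    }

    @Override
    public int hashCode() {
        return Boolean.valueOf(hasMine).hashCode();
    }
}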

The next game rule consumed the rest of the session: calculating the number of mines in the neighbourhood. The main obstacle was computing the coordinates of the neighbours: it is necessary to add an offset to a position of a coordinate without exposing its internal structure. In the end I resorted to using more closures.
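
A sketch of what this closure-based neighbour computation might look like (names and details are my assumptions, the original code is not shown): Coordinate iterates its own neighbours and hands each one to a visitor, so x and y never leave the class.

public interface CoordinateVisitor {
    void visit(Coordinate neighbour);
}

// inside Coordinate, where x and y are accessible:
public void visitNeighbours(final CoordinateVisitor visitor) {
    for (int offsetX = -1; offsetX <= 1; offsetX++) {
        for (int offsetY = -1; offsetY <= 1; offsetY++) {
            final boolean isCenter = offsetX == 0 && offsetY == 0;
            if (!isCenter) {
                visitor.visit(new Coordinate(x + offsetX, y + offsetY));
            }
        }
    }
}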

Conclusion

To achieve my goal I had to reverse the order in which I normally develop business logic. Rule 9 seems to support a top-down approach: the interfaces of domain objects are almost completely dominated by the way they are used by their containers.

Most of the time in this two-hour session was spent staring at the screen, thinking about how to write readable code and readable tests without exposing internal details of the objects. Time well spent.

Guide to better Unit Tests: Focused Tests

Every now and then we stumble over unit tests with a lot of setup and numerous checked aspects. These tests easily become a maintenance nightmare. While J.B. Rainsberger advocates getting rid of integration tests in his somewhat lengthy but very insightful talk at Agile 2009, he gives some advice I would like to use as a guide to better unit tests. His goal is basic correctness, achieved by means of what he aptly calls focused tests: tests that check exactly one interesting behaviour.

The proposed way to write these focused tests is to look at three different topics for each unit under test:

  1. Interactions (Do I ask my collaborators the right questions?)
  2. Do I handle all answers correctly?
  3. Do I answer questions correctly?

Conventional unit testing emphasizes the third topic, which works fine for leaf classes that do not need collaborators. Usually, your programming world is not that simple, so you need mocking and stubbing to check all these aspects without turning your unit test into a large integration test that is slow to run and potentially difficult to maintain.
I will try to show you the approach using a simple and admittedly a bit contrived example. Hopefully, it illustrates Rainsberger’s technique well enough. Assume the IllustrationController below is our unit under test:

public class IllustrationController {
    private final PermissionService permissionService;
    private final IllustrationAction action;

    public IllustrationController(PermissionService permissionService, IllustrationAction action) {
        this.permissionService = permissionService;
        this.action = action;
    }

    /**
     * @return true, if the action was executed, false otherwise
     */
    public boolean performIfAllowed(Role r) {
        if (!permissionService.allowed(r)) {
            return false;
        }
        this.action.execute();
        return true;
    }
}

It has two collaborators: PermissionService and IllustrationAction. The first thing to check is:

Do I ask my collaborators the right questions?

In this case this is quite simple to answer, as we only have a few cases: do we pass the right role to the PermissionService? This results in tests like the one below:

@Test
public void asksForPermissionWithCorrectRole() throws Exception {
    PermissionService ps = mock(PermissionService.class);
    IllustrationAction action = mock(IllustrationAction.class);

    IllustrationController ic = new IllustrationController(ps, action);
    ic.performIfAllowed(Role.User);
    // this question needs a test in PermissionService
    verify(ps, atLeastOnce()).allowed(Role.User);
    ic.performIfAllowed(Role.Admin);
    // this question needs a test in PermissionService
    verify(ps, atLeastOnce()).allowed(Role.Admin);
}

Do I handle all answers correctly?

In our example only the PermissionService provides two different answers, so we can easily test that:

@Test
public void interactsWithActionBecausePermitted() {
    PermissionService ps = mock(PermissionService.class);
    IllustrationAction action = mock(IllustrationAction.class);
    // there has to be a case when PermissionService returns true, so write a test for it!
    when(ps.allowed(any(Role.class))).thenReturn(true);

    IllustrationController ic = new IllustrationController(ps, action);
    ic.performIfAllowed(Role.Admin);

    verify(ps, atLeastOnce()).allowed(any(Role.class));
    verify(action, times(1)).execute();
}

@Test
public void noActionInteractionBecauseForbidden() {
    PermissionService ps = mock(PermissionService.class);
    IllustrationAction action = mock(IllustrationAction.class);
    // there has to be a case when PermissionService returns false, so write a test for it!
    when(ps.allowed(any(Role.class))).thenReturn(false);

    IllustrationController ic = new IllustrationController(ps, action);
    ic.performIfAllowed(Role.User);

    verify(ps, atLeastOnce()).allowed(any(Role.class));
    verify(action, never()).execute();
}

Note here that answers are not only return values but also exceptions. If our action may throw exceptions on execution that we can handle, we have to test that too!
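
Our IllustrationController doesn’t catch anything, so a focused test could at least pin down that an action’s exception propagates to the caller. A sketch (the failing action is my assumption, not part of the original example):

@Test
public void propagatesExceptionThrownByAction() {
    PermissionService ps = mock(PermissionService.class);
    IllustrationAction action = mock(IllustrationAction.class);
    when(ps.allowed(any(Role.class))).thenReturn(true);
    // assumed failure mode: the action blows up at runtime
    doThrow(new IllegalStateException()).when(action).execute();

    IllustrationController ic = new IllustrationController(ps, action);
    try {
        ic.performIfAllowed(Role.Admin);
        fail("Expected the exception of the action to propagate.");
    } catch (IllegalStateException expected) {
        // documented behaviour: the controller does not swallow it
    }
}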

Do I answer questions correctly?

Our controller answers the question whether the operation was performed or not by returning a boolean from its performIfAllowed() method, so let’s check that:

@Test
public void handlesForbiddenExecution() throws Exception {
    PermissionService ps = mock(PermissionService.class);
    IllustrationAction action = mock(IllustrationAction.class);
    when(ps.allowed(any(Role.class))).thenReturn(false);

    IllustrationController ic = new IllustrationController(ps, action);
    assertFalse("Perform returned success even though it was forbidden.", ic.performIfAllowed(Role.User));
}

@Test
public void handlesSuccessfulExecution() throws Exception {
    PermissionService ps = mock(PermissionService.class);
    IllustrationAction action = mock(IllustrationAction.class);
    when(ps.allowed(any(Role.class))).thenReturn(true);

    IllustrationController ic = new IllustrationController(ps, action);
    assertTrue("Perform returned failure even though it was allowed.", ic.performIfAllowed(Role.Admin));
}

Conclusion
What we are doing here is essentially splitting different aspects of interesting behaviour into their own tests. The first two questions define the contract between our unit under test and its collaborators. For every question we ask, and therefore stub using our mocking framework, there has to be a test that verifies that this question is answered like we expect. If we handle all the answers correctly, our interaction is deemed correct, too. And finally, if our class implements its contract correctly by answering the third question, our clients also know what to expect and can rely on us.

Because each test focuses on only one aspect, it tends to be simple and should only break if that aspect changes. In many cases this kind of test can make your integration tests obsolete, like Rainsberger states. I think there are cases in modern frameworks like Grails where you do not want to mock all the framework magic, because it is too easy to make wrong assumptions about the behaviour of the framework. So imho integration tests provide some additional value there, because the behaviour of the platform stays part of the tests without being tested explicitly.

Automatic deployment of (Grails) applications

What was your most embarrassing moment in your career as a software engineer? Mine was when I deployed an application to production and it didn’t even start. Stop using manual deployment and learn how to automate your (Grails) deployment.

Early in my career, deploying an application usually involved a fair number of manual steps: logging in to a remote server via ssh and executing various commands. After a while, repetitive steps were bundled into shell scripts. But mistakes still happened. That’s normal. The solution is to automate as much as we can. So here are the steps to automatic deployment happiness.

Build

One of the oldest requirements for software development, mentioned in The Joel Test, is that you can build your app in one step. With Grails that’s easy: just create a build file (we use Apache Ant here, but others will do) in which you call grails clean, grails test and then grails war:

<project name="my_project" default="test" basedir=".">
  <property name="grails" value="${grails.home}/bin/grails"/>
  
  <target name="-call-grails">
    <chmod file="${grails}" perm="u+x"/>
    <exec dir="${basedir}" executable="${grails}" failonerror="true">
      <arg value="${grails.task}"/><arg value="${grails.file.path}"/>
      <env key="GRAILS_HOME" value="${grails.home}"/>
    </exec>
  </target>
  
  <target name="-call-grails-without-filepath">
    <chmod file="${grails}" perm="u+x"/>
    <exec dir="${basedir}" executable="${grails}" failonerror="true">
      <arg value="${grails.task}"/><env key="GRAILS_HOME" value="${grails.home}"/>
    </exec>
  </target>

  <target name="clean" description="--> Cleans a Grails application">
    <antcall target="-call-grails-without-filepath">
      <param name="grails.task" value="clean"/>
    </antcall>
  </target>
  
  <target name="test" description="--> Run a Grails applications tests">
    <chmod file="${grails}" perm="u+x"/>
    <exec dir="${basedir}" executable="${grails}" failonerror="true">
      <arg value="test-app"/>
      <arg value="-echoOut"/>
      <arg value="-echoErr"/>
      <arg value="unit:"/>
      <arg value="integration:"/>
      <env key="GRAILS_HOME" value="${grails.home}"/>
    </exec>
  </target>

  <target name="war" description="--> Creates a WAR of a Grails application">
    <property name="build.for" value="production"/>
    <property name="build.war" value="${artifact.name}"/>
    <chmod file="${grails}" perm="u+x"/>
    <exec dir="${basedir}" executable="${grails}" failonerror="true">
      <arg value="-Dgrails.env=${build.for}"/><arg value="war"/><arg value="${target.directory}/${build.war}"/>
      <env key="GRAILS_HOME" value="${grails.home}"/>
    </exec>
  </target>
  
</project>

Here we call Grails via the shell script, but you can also use the Grails Ant task and generate a starting build file with

grails integrate-with --ant

and modify it accordingly.

Note that we specify the environment for building the war because we want to build two wars: one for production and one for our functional tests. The environment for the functional tests mimics the deployment environment as closely as possible, but in practice you have small differences, like having no database cluster or no SMTP server.
Now we can put all this into our continuous integration tool Jenkins, and every time a checkin is made our Grails application is built.

Test

Unit and integration tests are already run when building and packaging. But we also have functional tests which deploy to a local Tomcat and test against it. Here we fetch the test war of the last successful build from our CI:

<target name="functional-test" description="--> Run functional tests">
  <mkdir dir="${target.base.directory}"/>
  <antcall target="-fetch-file">
    <param name="fetch.from" value="${jenkins.base.url}/job/${jenkins.job.name}/lastSuccessfulBuild/artifact/_artifacts/${test.artifact.name}"/>
    <param name="fetch.to" value="${target.base.directory}/${test.artifact.name}"/>
  </antcall>
  <antcall target="-run-tomcat">
    <param name="tomcat.command.option" value="stop"/>
  </antcall>
  <copy file="${target.base.directory}/${test.artifact.name}" tofile="${tomcat.webapp.dir}/${artifact.name}"/>
  <antcall target="-run-tomcat">
    <param name="tomcat.command.option" value="start"/>
  </antcall>
  <chmod file="${grails}" perm="u+x"/>
  <exec dir="${basedir}" executable="${grails}" failonerror="true">
    <arg value="-Dselenium.url=http://localhost:8080/${product.name}/"/>
    <arg value="test-app"/>
    <arg value="-functional"/>
    <arg value="-baseUrl=http://localhost:8080/${product.name}/"/>
    <env key="GRAILS_HOME" value="${grails.home}"/>
  </exec>
</target>

Stopping and starting Tomcat and deploying our application war in between fixes the PermGen space errors which are thrown after a few hot deployments. The baseUrl and selenium.url parameters tell the functional test plugin to run against an externally running Tomcat. When you omit them, the functional tests start Tomcat and the Grails application themselves, inside their own process.

Release

Now all tests have passed and you are ready to deploy. So you fetch the last build … but wait! What happens if you have to redeploy and new builds have happened in the CI in between? To prevent this we introduce a step before deployment: a release. This step just copies the artifacts from the last build and gives them the correct version. It also fetches the list of issues fixed in this version from our issue tracker (Jira) as a PDF. This list can be sent to the customer after a successful deployment.
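
Such a release target can be quite small. A sketch reusing the -fetch-file helper from the functional test target above (release.directory and release.version are assumed properties, not from the original build file):

<target name="release" description="--> Fetches the last build and versions it">
  <mkdir dir="${release.directory}"/>
  <antcall target="-fetch-file">
    <param name="fetch.from" value="${jenkins.base.url}/job/${jenkins.job.name}/lastSuccessfulBuild/artifact/_artifacts/${artifact.name}"/>
    <param name="fetch.to" value="${release.directory}/${product.name}-${release.version}.war"/>
  </antcall>
</target>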

Deploy

After releasing we can now deploy. This means fetching the war from the release job on our CI server and copying it to the target server. Then the procedure is similar to the functional test one, with some slight but important differences. First, we make a backup of the old war in case anything goes wrong and we have to roll back. Second, we also copy the context.xml file which Tomcat needs for the JNDI configuration. Note that we don’t need to copy over local data files like PDF reports or search indexes which were produced by our application: these lie outside our web application root.

<target name="deploy">
  <antcall target="-fetch-artifacts"/>

  <scp file="${production.war}" todir="${target.server.username}@${target.server}:${target.server.dir}" trust="true"/>
  <scp file="${target.server}/context.xml" todir="${target.server.username}@${target.server}:${target.server.dir}/${production.config}" trust="true"/>

  <antcall target="-run-tomcat-remotely"><param name="tomcat.command.option" value="stop"/></antcall>

  <antcall target="-copy-file-remotely">
    <param name="remote.file" value="${tomcat.webapps.dir}/${production.war}"/>
    <param name="remote.tofile" value="${tomcat.webapps.dir}/${production.war}.bak"/>
  </antcall>
  <antcall target="-copy-file-remotely">
    <param name="remote.file" value="${target.server.dir}/${production.war}"/>
    <param name="remote.tofile" value="${tomcat.webapps.dir}/${production.war}"/>
  </antcall>
  <antcall target="-copy-file-remotely">
    <param name="remote.file" value="${target.server.dir}/${production.config}"/>
    <param name="remote.tofile" value="${tomcat.conf.dir}/Catalina/localhost/${production.config}"/>
  </antcall>

  <antcall target="-run-tomcat-remotely"><param name="tomcat.command.option" value="start"/></antcall>
</target>

Different Environments: Staging and Production

If you look closely at the deployment script, you notice that it uses the context.xml file from a directory named after the target server. In practice you have multiple deployment targets, not just one. At the very least you have what we call a staging server. This server is used for testing the deployment and the deployed application before interrupting or corrupting the production system. It can even be used to publish a pre-release version for the customer to try. We use a separate job in our CI server for this. We separate the configurations needed for the different environments into directories named after the target server. What you shouldn’t do is include all those configurations in your development configuration. You don’t want to corrupt a production application when using the staging one, when your tests run, or even while you are developing. So keep the configurations needed for the deployment environments separate from development and from each other.

Celebrate

Now you can deploy over and over again with just one click. This is something to celebrate. No more headaches, no more bitten fingernails. But nevertheless you should take care when you access a production system, even if it happens automatically. Something you didn’t foresee in your process could go wrong, or you could make a mistake when you try out the application via the browser. Since we need to be aware of this responsibility, everybody who interacts with a production system has to wear one of our cowboy hats. This is a conscious step to remind oneself to take care, and it also reminds everybody else not to disturb someone interacting with a production system. So don’t mess with the cowboy!

You probably forget too much, too soon and way too definitely

Your stored data is probably worth a lot. To rule out accidental removal, you should forbid the delete operation for your application. Here’s why and how you can implement it.

If you felt spoken to by the blog title, you might relax a bit: I didn’t mean you, but your application. And I don’t suggest that your application forgets things, but rather that it removes them deliberately. My point is that it shouldn’t be able to do so.

A disaster waiting to happen

Try to imagine a child that is given a pair of sharp scissors to play with by its parents. It runs around the house, scissors in hand, cutting away things here and there. Inevitably, as mandated by Murphy’s law, it will stumble and fall, probably hurting itself in the process. This scenario is a disaster waiting to happen. It is a perfect analogy for your application as long as it is able to perform the delete operation.

A safe environment

Now imagine an application that is forbidden to delete data: the database user used by the program is not allowed to issue the SQL DELETE command. This is like the child in the analogy before, but with the scissors taken away. It will still leave a mess behind while playing that needs an adult to clean up periodically, and it will still fall down, but it won’t stab itself. If you can run your application in such a restricted environment, you can guarantee “no-vanish” data safety: no data element that is stored will ever disappear.

Data safety

In case you wonder, this isn’t the maximum data safety level you can (and perhaps should) have. There is still the danger of accidental alteration, where existing data is replaced or overwritten by other data. To achieve the highest “no-loss” data safety level, you need to have a journaling database system that tracks every change ever made in an ever-growing transaction log. If that sounds just like a version control system to you, it’s probably because it essentially will be such a thing.
But “no-vanish” data safety is the first and most important step on the data safety ladder. And it’s easy to accomplish if you incorporate it into your system right from the start, implementing everything around the concept that no deletion will occur on behalf of the system.

Why data safety?

But why should you attempt to adhere to such a restriction? The short answer is: because the data in your system is worth it. We recently determined the immediate monetary worth of primary data entries in one of our systems and found out that every entry is worth several thousand euros. And we have several hundred entries in this system alone. So accidentally deleting two or three entries in this system is equivalent to wrecking your car. Who wouldn’t buy a car when the manufacturer guarantees that wreckages cannot happen, by design? That’s what data safety tries to achieve: Giving a guarantee that no matter how badly the developers wreck their code, the data will not be affected (at least to a degree, depending on the safety level).

No deletion

The best way to give a guarantee and hold onto it is to eliminate the root cause of all risks. In our digital world, this is surprisingly easy to accomplish in theory: If you don’t want to lose data (by accidental removal), prohibit usage of the delete operation on the lowest layer (probably the database). If your developers still try to delete things, they only get some kind of runtime error and their application will likely crash, but the data remains intact.

Implementation using a RDBMS

If you are using a relational database system, you should be aware that “no-vanish” safety comes with a cost. Every time you fetch a list of something from your database, you need to add the constraint that the result must only contain “non-obsolete” entries. Every row in your main tables will adopt some sort of “obsolete” column, housing a boolean flag that indicates that this row was marked as deleted by the application. Remember, you cannot delete a row, you can only mark it deleted using your own mechanisms.
The tables in your database holding “derived data”, like join tables or data only referenced by foreign keys, don’t necessarily need the obsolete flag, but will clutter up over time. That’s not a problem as long as nobody thinks that all entries in these tables are “living data”. The entries that are referenced by obsolete entries in the main tables will simply be forgotten because they are inaccessible by normal means of data retrieval.
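
In plain JDBC terms, the main-table mechanics might look like this sketch (table and column names are made up):

import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import javax.sql.DataSource;

// "No-vanish" persistence sketch: deletion only sets the obsolete flag,
// every retrieval filters on it. Table and column names are hypothetical.
public class MessageRepository {

    private final DataSource dataSource;

    public MessageRepository(final DataSource dataSource) {
        this.dataSource = dataSource;
    }

    // the application "deletes" by marking; the row itself stays
    public void markObsolete(final long messageId) throws SQLException {
        try (Connection connection = dataSource.getConnection();
             PreparedStatement update = connection.prepareStatement(
                     "UPDATE MESSAGES SET OBSOLETE = 1 WHERE ID = ?")) {
            update.setLong(1, messageId);
            update.executeUpdate();
        }
    }

    // every query carries the "non-obsolete" constraint
    public void printLivingMessages() throws SQLException {
        try (Connection connection = dataSource.getConnection();
             PreparedStatement query = connection.prepareStatement(
                     "SELECT ID, CONTENT FROM MESSAGES WHERE OBSOLETE = 0");
             ResultSet rows = query.executeQuery()) {
            while (rows.next()) {
                System.out.println(rows.getLong("ID") + ": " + rows.getString("CONTENT"));
            }
        }
    }
}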

The dedicated hitman

Using this approach, your database size will grow constantly and never shrink. There will come the moment when you really want to compact the whole thing and weed out the unused data. Don’t give the delete right back to your application’s database user! You basically don’t trust your application with this (remember the child analogy). You want this job done by a professional. You need a dedicated hitman. Create a second database user that doesn’t have the right to alter the data, but can delete it. Now run a separate job (another application) on your database within this new context and remove everything you want removed. The key here is to separate your normal application and the removal task as far as possible to prevent accidental usage. If you think you’ve heard this concept somewhere before: it basically boils down to a “garbage collector”. The term “hitman” is just a dramatization of the bleak reality, trying to remind you to be very careful about what data you really want to assassinate.

Implementation using a graph database

If this seems like a lot of effort to you, perhaps the approach used by graph databases suits you better. In a graph database, you group your data using “graph connections” or “edges”. If you have a node “persons” in your database, every person node in the database will be associated to this node using such a connection. If you want to remove a person node (without throwing the data away), you remove the edge to the “persons” node and add a new edge to the “deleted_persons” node. You can probably see how this will be easier to handle in your application code than ensuring that the obsolete flag is considered everywhere.

Conclusion

This isn’t a bashing of relational database systems, and it isn’t a praise of graph databases. In fact, the concept of “no deletion” is agnostic towards your actual persistence technology. It’s a requirement to ensure some basic data safety level and a great way to guarantee to your customer that his business assets are safe with your system.

If you have thoughts on this topic, don’t hesitate to share them!

Test your migrations

Do you trust your database migrations?

An evolving project that changes its persistent data structure can require a transformation of already existing content into the new form. To achieve this in our Grails projects, we use the Grails database migration plugin. This plugin allows us to apply changesets to the database and keep track of its current state.

The syntax of the DSL for Groovy database migrations is easy to read. This can trick you into the assumption that everything that looks good, compiles and runs without errors is OK. Of course it is not. Here is an example:

changeSet(author: 'vasili', id: 'copies messages to archive') {
  grailsChange {
    change {
      sql.eachRow("SELECT MESSAGES.ID, MESSAGES.CONTENT, "
                + "MESSAGES.DATE_SENT FROM MESSAGES WHERE "
                + "MESSAGES.DATE_SENT > to_date('2011-01-01 00:00', "
                + "'YYYY-MM-DD HH24:MI:SS')") { row ->
        sql.execute("INSERT INTO MESSAGES_ARCHIVE(ID, CONTENT, DATE_SENT) "
                  + "VALUES(${row.id}, ${row.content}, ${row.date_sent})")
      }
    }
    change {
      sql.eachRow("SELECT MESSAGES.ID, MESSAGES.CONTENT, "
                + "MESSAGES.DATE_SENT FROM MESSAGES WHERE "
                + "MESSAGES.DATE_SENT > to_date('2012-01-01 00:00', "
                + "'YYYY-MM-DD HH24:MI:SS')") { row ->
        sql.execute("INSERT INTO MESSAGES_ARCHIVE(ID, CONTENT, DATE_SENT) "
                + "VALUES(${row.id}, ${row.content}, ${row.date_sent})")
      }
    }
  }
}

Here you see two change closures that differ only by the year in the SQL where clause. What do you think will happen to your database when this migration is applied? The answer is: only changes from the year 2012 will be found in the destination table. The assumption that, because one change closure in the grailsChange block works, two changes in it will work too is, while compilable and runnable, wrong. Looking at the documentation you will see that it shows only one change block in the example code. When you divide the migration into multiple parts, each of them working on its own change, everything will work as expected.
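
Divided into separate changesets, the migration from above might look like this sketch (the changeset ids are made up):

changeSet(author: 'vasili', id: 'copies 2011 messages to archive') {
  grailsChange {
    change {
      sql.eachRow("SELECT MESSAGES.ID, MESSAGES.CONTENT, "
                + "MESSAGES.DATE_SENT FROM MESSAGES WHERE "
                + "MESSAGES.DATE_SENT > to_date('2011-01-01 00:00', "
                + "'YYYY-MM-DD HH24:MI:SS')") { row ->
        sql.execute("INSERT INTO MESSAGES_ARCHIVE(ID, CONTENT, DATE_SENT) "
                  + "VALUES(${row.id}, ${row.content}, ${row.date_sent})")
      }
    }
  }
}

changeSet(author: 'vasili', id: 'copies 2012 messages to archive') {
  grailsChange {
    change {
      sql.eachRow("SELECT MESSAGES.ID, MESSAGES.CONTENT, "
                + "MESSAGES.DATE_SENT FROM MESSAGES WHERE "
                + "MESSAGES.DATE_SENT > to_date('2012-01-01 00:00', "
                + "'YYYY-MM-DD HH24:MI:SS')") { row ->
        sql.execute("INSERT INTO MESSAGES_ARCHIVE(ID, CONTENT, DATE_SENT) "
                  + "VALUES(${row.id}, ${row.content}, ${row.date_sent})")
      }
    }
  }
}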

Currently there is no safety net like unit tests for database migrations. Every assumption you make must be tested manually with some dummy test data.

TDD myths: the problems

I take a look at some (in my experience) problems and misconceptions around TDD: 100% code coverage is enough, debugging is not needed, design for testability, you are faster than without tests.

100% code coverage is enough

Code coverage seems to be a bad indicator for the quality of the tests. Take the following code as an example:

public void testEmptySum() {
  assertEquals(0, sum());
}

public void testSumOfMultipleNumbers() {
  assertEquals(5, sum(2, 3));
}

Now take a look at the implementation:

public int sum(int...numbers) {
  if (numbers.length == 0) {
    return 0;
  }
  return 5;
}

Baby steps in TDD could lead you to this implementation. It has 100% code coverage and all tests are green. But the implementation isn’t finished at all. Our experiment in which we investigated how much tests communicate the intent of the code showed flaws in metrics like code coverage.

Debugging is not needed

One promise of TDD, or of tests in general, is that you can neglect debugging, even abandon it. In my experience, when a test goes red (especially an integration test) you sometimes need to fire up the debugger. The debugger helps you step through the code and see the actual state of the system at that moment. Tests treat code as a black box: an input results in an output. But what happens in between? How much do you want to couple your tests to your actual implementation steps? Do we need tests to cover this aspect of software development? Maybe something along the lines of what is shown in Inventing on Principle, where the computer shows you the immediate steps your code takes, could replace debugging, but tests alone cannot do it.

Design for testability

A noble goal. But are tests your primary client? No, other code is. Design for maintainability would be better. You will need to change your code, fix it, introduce new features, etc. Don’t get me wrong: you need tests and you need testability. But how much code do you write specifically for your tests? How much flexibility do you introduce because of your tests? Which patterns do you use just because your tests need them? It’s like YAGNI for code exposure to tests. Code written specifically for tests couples your code to your tests. Only things that need to be coupled should be. Is the choice of the underlying data structure important? Couple it, test it. If it isn’t, don’t expose it, don’t write a getter. Don’t break the information hiding principle if you don’t need to. If you couple your tests too tightly to your code, every little change breaks your tests. This hinders maintenance. The important and difficult design question is: what is important? Test that.

You are faster than without tests

Some TDD practitioners claim that they are faster with TDD than without tests, because the bugs and problems in your code will overwhelm you after a certain time. So above a certain level of complexity you go faster with TDD. But where is this level? In my experience, writing code without tests is 3x-4x faster than with TDD, for small applications. There are entire communities where many applications are written without tests, or with only a few. I wouldn’t write a large application without tests, but at least my feeling is that in many cases I go much slower. Cases where I feel faster are specification heavy: parsing or writing formats, designing an algorithm or implementing a scientific formula. So the call is open on this one. What are your experiences? Do you feel slowed down by TDD?

Thoughts about TDD

Thoughts and links about test driven development

First a disclaimer: I think tests are a hallmark of professional software development, and I like to write tests before the implementation, but that’s not always easy or simple (for the difference please refer to Simple Made Easy). I find it hard to grasp test driven development (TDD), though. The difference between test first and test driven lies in the intention: in both cases tests are written before any implementation code, but in TDD the tests drive the design of your implementation.

The problem with opinions on TDD is that there are mostly extreme positions: some think “TDD is the (next) holy grail”, others dismiss it outright. Though reading between the lines, there are great discussions about how to do it and what problems arise. Many people (me included) are really trying to get value from TDD. Testing should be fun.
One way of letting the tests drive the way you develop is proposed by Uncle Bob: the transformation priority premise. He proposes a list of transformations which introduce new constructs or replace existing ones, like replacing a constant by a variable or adding more logic, and assigns them priorities. Only if you cannot use a high priority transformation to get the test to pass do you look at a transformation with a lower priority.
But how do you determine what you should test next, or even which test comes first?
Taking the typical Conway’s Game of Life kata as an example, one thing struck me: I could only get TDD to work smoothly when I started with the data structure. But why? Naturally I would start with the algorithm (in this case the rules) and write the first test for it. But upon further inspection of the problem, and with deeper (domain) knowledge, it seems the data structure is way more important for solving this kata. So you need to know beforehand where the journey goes; not every step you will take, but the big picture: in this example, first the data structure, then the rules. Maybe you should start with the integration or functional tests and break them down into units.
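
A sketch of what “data structure first” could mean for the very first tests (all names are my assumptions, not from a concrete kata solution):

@Test
public void freshGenerationHasNoLivingCells() {
    Generation generation = new Generation(3, 3);

    assertFalse(generation.isAliveAt(1, 1));
}

@Test
public void remembersSeededCells() {
    Generation generation = new Generation(3, 3);
    generation.seedAliveAt(1, 1);

    assertTrue(generation.isAliveAt(1, 1));
}
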
What are your experiences using TDD? Do you use or want to use TDD?

Clean Code Developer at your fingertips

You’ve probably already heard about the Clean Code Developer initiative. We’re donating a full spectrum of mousepad designs for your educational support.

We are participants in the Clean Code Developer (CCD) movement. This initiative provides a way to perpetually learn, train, reflect and act on the most important topics of today’s software development by formulating a value system and a learning path. The learning path is subdivided into different grades, associated with colors. Every Clean Code Developer progresses continually through the grades, focussing on the principles and practices of the current grade.

If you want a tongue-in-cheek explanation of the Clean Code Developer in one sentence: it’s a sight-seeing tour to the most prominent topics every professional software developer should know. But unlike your usual tourist rip-off, you can just stay seated and enjoy another round without ever paying anything except attention.

Visualize it

An important aspect of learning and deliberate practice is proper visualization. We invest a lot of work in our workplace, our software and the interaction with our customers to make things visible. When we reflected on our Clean Code Developer practice, we knew that it lacked visualization.

The proposed equipment for a Clean Code Developer is a desktop background picture, a mousepad with an image of all grades at once, and some rubber wristbands in the colors of the grades. The wristbands serve as a reminder and a self-assessment tool. The desktop background picture is nice, but only visible when we aren’t performing actual work. This led us to concentrate on the mousepad.

Duplicate if necessary

The mousepad is the most prominent “advertising” space on the typical work desk. We want to advertise the content of our current Clean Code Developer grade to ourselves. The combination of these two thoughts is not one mousepad, but one for every grade. Imagine six mousepads in the colors of the grades, displaying your currently most important topics right under your fingertips.

We liked the idea so much that we worked on it. The result is a collection of mousepads for every Clean Code Developer to enjoy.

Iterative design

It took us several full cycles of planning, design, layout and proof-reading to have the first version of mousepads produced. It took only a few hours of real-world testing to start the second iteration to further improve the design. Right now, we are on the third iteration. The first iteration had the five colored CCD grades printed on real mousepads. The second iteration added the mousepad for the white grade and a little stand-up display for the initial black grade. The third iteration incorporated the official Clean Code Developer logo, the website URL and improved some details.

Here are some promotional photos of the five first-iteration mousepads:

As you can see, we chose to print ultra-slim mousepads to test whether it’s feasible to use them stacked all at once (it isn’t, your mileage may vary) or to use them even if you aren’t used to mousepads at all (it depends, really). You might want to print the images onto the type of mousepad you like best.

Do it yourself

Yes, you’ve read it right. We are donating the mousepad images back to the community. You can download everything right here:

All documents are bare of any company logo or other advertising and free for your constructive usage. There is only one catch, really: the documents are in German. This might not be apparent at first because we really like the original English technical terms, but some content might need translation for non-German speakers. If you are interested in producing an all-English version, drop us a line.

Acknowledgements

These mousepads wouldn’t exist without the help and inspiration of many co-workers. First of all, the founders of the Clean Code Developer movement, Ralf Westphal and Stefan Lieser, provided all the content of the mousepads. Without their groundbreaking work, we probably wouldn’t have thought of this. The design and production is owed to Hannegret Lindner from the Hannafaktur, a small graphic design agency. We admire her endurance with our iterative approach. And finally, the initial inspiration sparked in a creative discussion with Eric Wolf and Benjamin Trautwein from ABAS Software AG.

It’s your turn now

We are very curious about your story, photo or action shot with the mousepads (or the little stand-up display). You can also just share your thoughts about the whole idea or submit an improvement. We’d love to hear from you.