Continuous Learning – Page 7

Snowflakes are a bad sign

Snowflake servers are brittle and expensive. Treating hardware like cattle instead of pets is one strategy to overcome the snowflake syndrome. Here are some strategies to foster this mindset.

First, allow me a bad joke: If you enter your server room and find real snowflakes, it might be a sign that your air conditioning is over-ambitious. But even if you just enter your server room, you probably see some snowflakes, but in the metaphorical sense.

Snowflake servers

Snowflakes are servers with an unique layout. I cannot say it better than Martin Fowler two years ago in his Bliki posting SnowflakeServer, but I’m trying to add some insights and more current tools. The term probably originates in the motto that everybody is a “precious unique snowflake”. This holds true for humans and animals, but not for machines. Let’s examine how a snowflake is born. Imagine that in the beginning, all servers are the same: standard hardware, a default operating system and nothing more. You pick one server to host a special application and adjust the hardware accordingly. Now you already have an hardware snowflake – not the worst thing, but you better document your rationale behind the adjustment in an accessible way – a wiki page specifically for that server perhaps. Because sooner or later, that machine will fail (or become hopelessly obsolete) and needs to be replaced – with adequate hardware. Without your documentation, you’ll have to remember why the old machine had that specific layout – and if it was sufficient. I’ve seen the “ancient server” anti-pattern much too often: A dusted machine, buzzing like an asthmatic pensioner in the last corner of the server room, and nobody was allowed near. Because there are no spare parts (VESA local bus isn’t supported anymore), if one part fails, the whole system is doomed – operating system and software included. Entire organizations rely on the readiness for duty of one hardware assembly – and almost always a crude one.

Server as cattle

The ancient server happens more likely when you treat your servers like pets. This is the crucial mental switch you’ll have to make: servers are cattle, not pets. They have numbers, not names. They can be monitored, upgraded and fostered, but at the end of the day, they serve a clearly defined business case and deserve no emotional investment of the owner. If a pet gets hurt, you take it to the veterinary and cure it. If cattle gets sick, you call the veterinary to make sure it’s not contagious and then replace the affected individuals – to cure them would be more expensive. Pets live as long as they can, cattle has a date of expiry. And our cattle (servers) really isn’t sentient, so stop treating it like pets.

Strategies to run a ranch

Our current answer to make the transition from pet zoo to cattle ranch without significantly increasing the amount of metal in our server room can be boiled down to three strategies:

Virtualize the logical machines. Instead of working on “real metal machines”, more and more of our services run inside virtual machines. This allows for a clearer separation of concerns (one duty per machine) and keeps the emotional commitment towards the machine low. Currently, we use VirtualBox and Docker for this task. Both are easy to set up and fulfill their task well.
Remove the names from real metal machines. We really number our real machines now. Giving clever names to virtual machines is still possible, but not necessary: they are probably only accessed using DNS aliases that specify their use, like “projectX-database” or “projectY-webserver”. We even choose the computer cases for our machines accordingly to separate the pets (unique cases) from cattle (uniform cases).
Specify the machine. The virtualized hardware must be described and explained (e.g. why this particular machine needs twice the normal RAM ration). Currently, we use Vagrant to specify the hardware and operating system of our virtual machines. The specifications are stored in a version controlled repository, so there is a place where most of our server infrastructure is described in a deployable fashion. Even more, all necessary third-party software products are specified, too. Imagine a todo list of what to install and prepare, like the one you’ve handed over to your admin in the past, but automatically executable. We currently use Ansible for our configuration management because it has very low requirements for the target platform itself and has a low learning curve.

Applying these three strategies, every (logical) machine in our server room should be reproduceable. They are still individuals, specifically tailored for their jobs, but completely specified and virtualized. The real metal machines only run the bare minimum of software necessary to host the logical machines. None of the machines promote emotional attachment – they are tools for their job.

Data is snow

One important insight is that persistent data will turn your machine into a snowflake over time (we use the term as a verb: “data will snowflake your machine”). You will become emotionally and financially attached to this data – otherwise, there is no need to persist it in the first place. We don’t have a panacea here yet. You probably want to use a database and a sophisticated backup strategy here. Just make sure that the presence of precious data on it doesn’t obscure your stance towards the machine. You want to keep the data and still be able to throw the machine away.

Don’t stop at machines

We are software developers, so we cannot deny that the concept of snowflaking is very helpful for our own projects, too. Every dependency that we can bring with us during deployment (called “self-containment” or “batteries included” in our slang) is one less thing of “snowflaking” the target machine. Every piece of infrastructure (real, virtualized or purely conceptual) we implicitly rely on (like valid certificates, SSH keys or passwords and database locations) will snowflake the target machine and should be treated accordingly: documented, specified and automated. If you hot-fix a production server, it’s definitely a huge snowflaking action that needs to be at least carefully documented. You can’t avoid snowflaking completely, but strive to mimize the manual amount of it and then sanitize the automated part.

Snowflaking is a concept

We’ve found the term of “snowflaking” very useful to transport the necessity and value in documenting, specifying and automating everything that doesn’t happen on a developer machine (and even there, the build process is fully automated). Snowflaked enviroments tend to be expensive in maintainance and brittle in operations. The effort to mitigate the effects of snowflaking pays off very soon and is highly reuseable. But even more powerful is the change in the mindset as soon as the concept of “snowflaking” is understood. It’s a short term for a broad range of strategies and values/beliefs. It’s a powerful and scalable concept.

We’d love to hear your experiences

You’ve probably experimented with various tools and concepts to manage your servers, too. What were your experiences and insights? Add a comment below, we are looking forward to your input.

A coder’s manifesto

A personal manifesto about what I value in developing software and what I think needs to change how we develop software.

Remember the Agile manifesto? What was the most important principle behind it?

User value first

Our highest priority is to satisfy the customer
through early and continuous delivery
of valuable software. – The agile manifesto

If it doesn’t benefit the user it should not be done. This can be a new feature, a better user interface, a clearer wording, better performance, robustness, … Not all improvements have immediate value but they must have a value at some time. If you look at the craftsman project priorities timely delivery has the most user value for me. (Personal footnote: I think this is where the software craftsmanship movement sets the wrong focus: quality is not the most important thing, user value is). But how do you know what the user might need?

Communication is key

Communication in all parts of software development is a must. You need to talk to your users, your fellow developers and other project partners. What often is neglected is the communication part of the code and the documentation. The primary measure of good code is how well it communicates its intent. How clear it is. Many measure code quality with things like testability, low coupling, coverage, metrics. But clarity and communication are by far the most important. If you can understand what the code does and how and why, you are able to change it. I personally believe that not only the statements but also the formatting of the code and the individual expression, the style, is important. Just as in novels and poetry, the typography emphasizes the meaning, the formatting can do this for the code.
Sometimes the why of decisions cannot be expressed in code this is where documentation comes in. If you have a need to document the what or the how, refactor your code. The importance of communication cannot be understated. Often talking with the user first can save you days and weeks of coding. How?

KISS and YAGNI

Simple and easy. As computer scientist we are trained to solve a problem correctly in all its glory. Every corner case is handled and secured by a test. I think to be more successful as a developer we have to unlearn this. (don’t get me wrong: there are places and software where we need this but the majority of software doesn’t need this) How often did I implement a complete version of a solution to later see that only 80% of it was used. The remaining 20% occur once in a lifetime but needed the most of the development time. If we constraint the problem we can save magnitudes of development time and can invest in more user value. And in…

Developer happiness

Some developers trade development productivity for runtime performance. But we don’t want to code in assembler or C. I happily trade runtime performance for productivity. Productivity means happiness. I am happy when I get things done. If I need to improve the performance I can focus on the parts which need it (remember YAGNI?). Programming languages, frameworks and platforms are not just tools, they are and form our ways of thinking. Tools we use have to help us reaching our goals, so…

Testing is important but just a tool

Testing gives me confidence to change code without breaking things. It helps me to avoid regression. Tests are a great tool. But only a means to an end. The program code is more important than tests, so tests should not force me to make compromises in clarity or communication level of the code. Ideally testing environments should be easy to set up and execute. One thing the recent TDD debate showed me is that we need to focus on the goals we have and the problems we want to solve with our tools. The tools we use should take a front row seat. If they need more focus or effort than the benefit they bring something is wrong. We need to constantly assess if the tools help us to reach our goals. So we have to…

Reflect

The last principle of the agile manifesto is to reflect regularly. What can be done better? How to improve? What did we learn? Often this sounds like a chore. Many times the reflection is omitted. Clean code tries to prevent this with daily reflection. But the set of practices and principles is questionable and carved into stone. So what can be done to make reflection more attractive? One of my suggestions is to keep a

Developer handbook

Reading Small Talk Best Practice Patterns (by Kent Beck) I could not help but have a feeling that this is like a personal developer handbook. In it Kent touches many important aspects of programming like composition of methods, naming and formatting. On top of that he describes his personal experiences and opinions in many of the patterns. I found this really valuable. I think starting with some of these patterns and my own experiences it would be helpful to record them in a personal developer handbook. I can use this book to remember past solutions and reflect on them. I can add my experiences in different projects and contexts to these. The goal is to get better at writing clear code and improve the communication of my code. These records could also help me to make my common habits, patterns, mistakes and approaches to problems explicit and learn from them. This is a personal book but it could also be a team effort or help others. The last part, the conclusion, of the TDD debate has a special insight for me: I should not lean on the masters of software development to advance our field and my development skills in particular. I need to find my own way and many things I take for granted I have to…

Re-think

Kent told an anecdote how he discovered TDD and then it struck me: he had a goal (or a need) in mind and had to find his way. In hindsight it sounds easy but it takes courage and persistence to push through. I think many more problems currently exist in developing software and often we took them for a given, we adjusted us, we accepted them. We cannot imagine a solution for these problems because like the elephant with the rope we never experienced a time without them. But this needs to change. We have to rethink the unsolved or inadequately solved problems and ignore some conventions and habits we formed as developers and as an industry. We have to rethink some of our approaches and assumptions. I believe we can find new ways which improve how we develop software and for this we need to rethink.

Developer power tools: Big O = big impact

Scalability and performance are often used interchangeable but they are very different. The big O notation helps in talking about scalability.

Scalability and performance are often used interchangeable but they are very different. The resource consumption of a program is its performance. So performance looks at a specific part of a program, say the user activating an action and tries to fix the input and circumstances.
Scalability defines how the resource consumption of a program changes when additional resources are added or the input size increases. So a program can have good performance (it runs fast) but when the load increases it can behavely badly (a bad scalability). Take bubble sort for example. Since the algorithm has low overhead it runs fast with small arrays but when the arrays get bigger the runtime gets worse. It doesn’t scale.
There are many ways to improve scalability here we look at one particular: reducing algorithmic complexity. The time complexity of an algorithm is measured with the big T notation. The T notation describes the time behavior of the algorithm when the input size changes. Say you have an algorithm that takes n objects and needs 4 times as long with an overhead of 2. This would be:

T(n) = 4n + 2

Often the correct numbers are not necessary, you just want to have a magnitude. This is where the big O notation comes into play. It describes the asymptotical upper bound or just the behaviour of the fastest growing part. In the above case this would be:

T(n) = 4n + 2 = O(n)

We call such an algorithm linear because the resource consumption grows linear with the increase in input size. Why does this matter? Let’s look at an example.

Say we have a list of numbers and want to know the highest. How long would this take? A straight forward implementation would look like this:

int max = Integer.MIN_VALUE;
for (int number : list) {
  if (number > max) {
    max = number;
  }
}

So we need if the size of the number list is n we need n compare operations. So our algorithm is linear. What if we have two lists of length n? We just our algorithm twice and compare both maximums so it would be

T(n) = 2n + 1 = O(n)

also linear.
Why does this all matter? If our algorithm runs quadratic, so O(n^2) and our input size is n = 1000, we need a multitude of 1 million operations. If it is linear it just needs a multitude of 1000 operations, much less. Reducing the complexity of an algorithm can be business critical or the difference between getting an instant result or losing a customer because he waited too long.
Time complexity isn’t the only complexity you can measure other resources like space can also be measured with big O. I hope this clears some questions and if you have further ones please feel free and leave a comment.

Gigapixel images in pure Java

What you can do when you hit an ancient limitation of Java while working with gigapixel sized images.

Not long ago, I read this nice little blog entry about the basic properties and usages of Java arrays. It’s a long time since I last used an array in Java myself, because my programming style evolved to heavily leverage the power of collections (and Iterables in particular, the Java 5 poor man’s substitute for Java 8 Streams). But I immediately noticed that one important fact was missing from the array blog entry:

The maximum length of an array in Java is Integer.MAX_VALUE or ((2^32)-1), aka 2.147.483.647

This is indirectly specified in the Java language specification, chapter 10.4 Array Access:

Arrays must be indexed by int values.

This little fact crossed my path when writing a little tool in pure Java that operated on large numbers of large images, combining them to a gigantic image. The customer used the tool to create images that had a size of about 100 MB, but took several hours to print because the decompression tax kicked in. One day, he reported a strange bug:

“Oh, a negative array size, what a strange bug to appear in a tested application” was my first thought. Only after reading the stacktrace more carefully did it dawn on me: The array size wasn’t negative, it was just bigger than Integer.MAX_VALUE and got wrapped around into the negative numbers. And sure enough, 72350 times 44914 is a respectable 3.249.527.900 pixels, more than 1,5 times as much as an array in Java can hold. This image was right in the multi-gigapixel range where all kinds of technical obstacles appear. The maximum length of an array in Java was mine.

Trying to stay pure

One cornerstone of the tool was being lightweight. It shouldn’t carry around unnecessary luggage and weighted around 200 kB when the bug appeared – enough to just copy it into the data directories instead of pulling the directories into the program. But when I examined the root cause of the problem at hand, I found the frustrating truth that Java’s built-in imaging library also relies on one cornerstone: all data is stored in one array. And this array can only hold around 2G entries of data.

My approach was to “partition” the full image into smaller parts that only stored a fraction of the overall pixels. To hide this fact from the ImageIO that ultimatively writes all the data into one file, my PartitionedImage implements RenderedImage and has to translate every call into a series of appropriate subcalls to the partition images. Before we look at some code, let me show you the limitations of this approach:

Greedy JPEGs, credulous PNGs

In the RenderedImage interface, there are two methods that can be used to obtain pixel data:

Raster getData(): Returns the image as one large tile (for tile based images this will require fetching the whole image and copying the image data over).
Raster getData(Rectangle rect): Computes and returns an arbitrary region of the RenderedImage.

If an image writer calls the first method, my code is screwed. There is no mentally sane way to construct a Raster instance without colliding with the array length limitation. Unfortunately, the JPEG writer does just that: He gets greedy and demands all the pixels at once. I found it easier to avoid the JPEG format and therefore trade disk space for pragmatism.

The PNG writer uses the getData(Rectangle) method to obtain the pixel data. It calls the whole image line by line: the region has always the full width of the image, but is only one pixel in height. So I guess my tool will write a lot of large PNG images in the future.

Our partitions should adapt to this behaviour by always retaining the full width of the original image and only allowing enough height that the amount of pixels per partition doesn’t exceed Integer.MAX_VALUE.

The remaining trick is to implement an AdjustingRaster that knows the original Raster of the partition and translates the row asked by the PNG writer to the according row in the partition image. The AdjustingRaster needs to know about the vertical offset. The only pitfall here is that the vertical offset has to be zero while the AdjustingRaster gets written to and needs to be set once it switches into read mode.

Slow, but working

By composing a gigapixel image from several partitions (sometimes called tiles) you can circumnavigate the frustrating limitation of Java’s arrays (I mean, it’s 2014 and 64-bit systems are somewhat prevailing now. No need to stick to 32-bit limits without a good reason). The result isn’t overwhelmingly fast, but I suspect that’s caused by the PNG image writer more than by our indirections. And we shouldn’t forget that it’s a lot of pixels to write after all.

Conclusion

Sometimes when you explore bigger and bigger use cases, you hit some arbitrary limitation. And some are fundamental ones. In our case here, we’ve reached the limit of Java arrays and got stuck because the image library in Java never heard of real gigapixel imaging and coupled itself hard to the array limit. By introducing another indirection layer on top of the image library implementation and using composition to emulate a bigger image than we actually could create, we can convince non-sceptical image writers to save all those pixels for us and even manipulate the image beforehand.

What was your approach for gigapixel image processing? How did it work out in the long run? Share your story in the comments, please.

Introducing ng-quiz – a JavaScript / Angular quiz: #1 Letter Crush

Learning a new language or framework can be quite daunting. It helps me when I have small tasks which motivate me to explore my chosen field further and in a fun and goal driven way. So in the spirit of the Ruby quiz I introduce ng-quiz, a quiz for learning Angularjs. Every month I post a small task which should be solved in a short time and helps you to learn and deepen your Angularjs and JavaScript skills. It will touch different areas and explores other JavaScript libraries. You can post your solutions (as a link to a jsfiddle or github) as a comment. Two weeks later I will update the post with the solutions.

ng-quiz #1: Letter Crush

In the first quiz you will implement a simple game named ‘Letter Crush’. In ‘Letter Crush’ a player tries to find letters forming a word in a random letter ‘soup’. The board consists of nxn quadratic cells containing a random letter.

A	G	H	M	I
T	C	O	U	E
F	E	P	R	Q
K	D	S	A	N
V	L	F	T	Y

The player can enter a word. This word must be in an english dictionary and needs to be found without overlapping. If both criteria are satisified, the word is removed from the board, the letters above the empty cells fall down and the top most empty cells are filled with new random letters.

Example

A	G	H	M	I
T	C	O	U	E
F	E	P	R	Q
K	D	S	A	N
V	L	F	T	Y

If the player enters the word ‘test’ (case insensitive) and since it is a real word the letters are removed from the board.

A	G	H	M	I
 	C	O	U	E
F	 	P	R	Q
K	D	 	A	N
V	L	F	 	Y

Now the letters above fall down.

 	 	 	 	I
A	G	H	M	E
F	C	O	U	Q
K	D	P	R	N
V	L	F	A	Y

The top most empty cells are filled again.

I	W	S	T	I
A	G	H	M	E
F	C	O	U	Q
K	D	P	R	N
V	L	F	A	Y

Now the next turn begins and the player could enter RUN, ACORN, MOURN, THORN or GO or others.

Letter Generation

To not fill the board with garbage letters, the new letters need to be generated taking the relative frequency of letters in the english language.

A	B	C	D	E	F	G
8.167	1.492	2.782	4.253	12.702	2.228	2.015
H	I	J	K	L	M	N
6.094	6.966	0.153	0.772	4.025	2.406	6.749
O	P	Q	R	S	T	U
7.507	1.929	0.095	5.987	6.327	9.056	2.758
V	W	X	Y	Z
0.978	2.360	0.150	1.974	0.074

Words

Entering a word could be done via the mouse or the keyboard. To form a valid word the letters need to be direct or diagonal adjacent. Every letter can be used only once.

Score

The player starts with a score of 0. A wrong input reduces the score by 1. A valid word is scored by its length and according to the following fibonacci sequence:

Length	1	2	3	4	5	6	7	…
Score	1	1	2	3	5	8	13	…

Optional fun: special fields

The above version of ‘Letter Crush’ is playable but if the task is too small for you or you want to explore it further, you can implement some special fields.

Duplication – lower case letter

A duplication field consists of a lower case letter and doubles the score of the word. It has a frequency of 5 percent.

Wildcard – ?

A wild card can be used as any letter and is represented as a question mark. It should be generated with 0.5 percent.

Extension – +

An extension field just lengthen a word if it is adjacent to the last letter. A plus sign marks the field and is generated with a relative frequency of 1 percent.

Bomb – *

An asterisk shows a bomb and is produced every 0.25 percent. A bomb detonates when the chosen word is adjacent to it. Every letter which is not part of the word but adjacent to the bomb is also removed.

Example:

B	M	A	S
O	R	B	*
N	B	T	B
E	U	O	L

The player chooses the word bomb and the bomb detonates. So after the removal but before the filling it looks like:


	R		
N	B		
E	U	O	L

P.S. if you are new in JavaScript and coming from a Java background you might find our JavaScript for Java developers post helpful.

JavaScript for Java developers

Although JavaScript and Java sound and look similar they are very different in their details and philosophies. Here I try to compare the two languages regardless of their libraries and frameworks. The goal is that you as a Java developer get an understanding of what JavaScript is and how it differs from Java. One hint: you can use jsfiddle.net to try out some of the snippets here or any JavaScript.
Note: right now this document discusses JavaScript 1.4, if enough interest is there I try to update it to a newer version (preferable ES5).

Primitives

Java – char, boolean, byte, short, int, long, float, double
JavaScript – none

Primitives are elements of the language which aren’t objects and therefore have no methods defined on them. JavaScript has no primitives.

Immutable types

Java – String (16bit), Character, Boolean, Byte, Short, Integer, Long, Float, Double, BigDecimal, BigInteger
JavaScript – String (16bit), Number(double, 64bit floating point), Boolean, RegExp

The next special kind of object are immutable objects, objects which represent values and cannot be changed.
JavaScript has four value objects: String (16bit like in Java), Number (64bit floating point like a double in Java), Boolean (like in Java) and RegExp (similar to Java). Java differences the number types further and introduces a Character.
Strings in JavaScript can be in single or double quotes and the sign to escape is the backslash (‘\’) just like in Java.
A regexp can be created via new RegExp or with ‘/’ like:

/a*/

Arrays

Java – special
JavaScript – normal object

Another base type in every language is the array. In Java the array is treated as a special kind of object it has a length property and is the only object which has the bracket ‘[]’ operator. In Java you create and access an array in the following way:

// creation
String[] empty = new String[2]; // an empty array with length 2
String[] array = new String[] {"1", "2"};

// read
empty[0]; // => null
empty[5]; // => ArrayIndexOutOfBoundsException

// write
empty[0] = "Test"; // empty is now ["Test", null]
empty[2] = "Test";  // => ArrayIndexOutOfBoundsException

JavaScript handles creation and access in a different way:

// creation
var empty = new Array(2); // an empty array with length 2
var array = ["1", "2"];

// read
empty[0]; // => undefined
empty[5]; // => undefined

// write
empty[0] = "Test"; // empty is now ["Test", undefined]
empty[2] = "Test"; // empty is now ["Test", undefined, "Test"]

The reason for the strange patterns is that an array in JavaScript is just an object with the indexes as properties and reading an undefined property returns undefined whereas setting an undefined property creates the property on the object. More on this under objects.

Comments

Java – // and /**/
JavaScript – // and /**/

Both languages allow the line ‘//’ and the block ‘/* */’ comments whereas the line comment is preferred in JavaScript because commenting out a regular expression can lead to syntax errors:

/a*/

Commenting out this regular expression with the block comment would result in

/* /a*/ */

which is a syntax error.

Boolean Truth

Java – true: true, false: false
JavaScript – false: false, null, undefined, ”, 0, NaN, true: all other values

Another stumbling block for Java developers is the handling of expressions in a boolean context. JavaScript not just treats false as false but also defines null, undefined, the empty string, 0, NaN as falsy values. All other values are evaluated to true.

Literals

Java – “, ‘, numbers, booleans
JavaScript – “, ‘, [], {}, /, numbers, booleans

Literals are a short hand for constructing objects inside the language. Java only supports string, number and boolean creation with literals everything else needs a new operator. In JavaScript you can create strings, numbers, booleans, arrays, objects and regular expressions:

"A string";
'Another string';
var number = 5;
var whatif = true;
var array = [];
var object = {};
var regexp = /a*b+/;

Operators

Java – postfix (expr++ expr–), unary (++expr –expr +expr -expr ~ !), multiplicative (* / %), additive (+ -), shift (<> >>>), relational ( = instanceof), equality (== !=), bitwise AND (&), bitwise exclusive OR (^),, bitwise inclusive OR (|), logical AND (&&), logical OR (||), ternary (?:), assignment (= += -= *= /= %= &= ^= |= <>= >>>=)
JavaScript – object creation (new), function call (()), increment/decrement (++ –), unary (+expr -expr ~ !), typeof, void, delete, multiplicative (* / %), additive (+ -), shift (<> >>>), relational ( = in instanceof), equality (== != === !==), bitwise AND (&), bitwise exclusive OR (^),, bitwise inclusive OR (|), logical AND (&&), logical OR (||), ternary (?:), assignment (= += -= *= /= %= &= ^= |= <>= >>>=)

Java and JavaScript have many operators in common. JavaScript has some additional ones. ‘void’ is an operator to return undefined and rarely useful. ‘delete’ removes properties from objects and hence also elements from arrays. ‘in’ tests for a property of an object but does not work for literal strings and numbers.

var string = "A string";
"length" in string // => error
var another = new String('Another string');
"length" in another // => true

The unary operators ‘+’ and ‘-‘ try to convert their operands to numbers and if the conversion fails they return NaN:

+'5' // => 5
-'2' // => 2
-'a' // => NaN

Typeof returns the type of its operand as a string. Beware the difference between literal creation and creation via new for numbers and strings.

typeof undefined // => "undefined"
typeof null // => "object"
typeof true // => "boolean"
typeof 5 // => "number"
typeof new Number(5) // => "object"
typeof 'a' // => "string"
typeof new String('a') // => "object"
typeof document // => Implementation-dependent
typeof function() {} // => "function"
typeof {} // => "object"
typeof [] // => "object"

All host environment specific objects like window or the html elements in a browser have implementation dependent return values.
Note that for an array it also returns “object” if you need to distinguish an array you must dig deeper.

Object.prototype.toString.call([]) // => "[object Array]"

The two pairs of equality operators (== != and === !==) behave differently. The shorter ones ‘==’ and ‘!=’ use type coercion which produces strange results and breaks transitivity:

'' == '0' // => false
0 == '' // => true
0 == '0' // => true

‘===’ and ‘!==’ works as expected if both operands are of the same type and have the same value they are true. The same value means either they are the same object or if they are a literal string, a literal number or a literal boolean have the same value regardless of length or precision.

5 === 5 // => true
5 === 5.0 // => true
'a' === "a" // => true
5 === '5' // => false
[5] === [5] // => false
new Number(5) === new Number(5) // => false
var a = new Number(5);
a === a  // => true
false === false // => true

Declaration

Java – type
JavaScript – var

Since JavaScript is a dynamically typed language you do not specify types when declaring parameters, fields or local variables you just use var:

var a = new Number(5);

Scope

Java – block
JavaScript – function

Scope is a common pitfall in JavaScript. Scope defines the code area in which a variable is valid and defined. Java has block scope which means a variable is defined and valid inside any block.

int a = 2;
int b = 1;
if (a > b) {
	int number = 5;
}
// no number defined here

JavaScript on the other hand has function scope which can lead to some confusion for developers coming from block scoped languages.

var f = function() {
  var a = 2;
  var b = 1;
  if (a > b) {
	var number = 5;
  }
  alert(number); // number is valid here
};
// but not here

One thing to remember is that closures have a reference not a copy of their variables from an outer scope.

for (var i = 0; i < 3; i++) {
  setTimeout(function() {
    i; // => always 3
  }, 200);
}

How can you fix this? You need to add a wrapper function and pass the values you need.

for (var i = 0; i < 3; i++) {
  (function(i) {
    setTimeout(function() {
      i; // => 0, 1, 2
    }, 200);
  })(i);
}

Statements

Java – conditional (switch, if/else), loop (while, do/while, for), branch (return, break, continue), exception (throw, try/catch/finally)
JavaScript – conditional (switch (uses ===), if/else), loop (while, do/while, for, for in (beware of protoype chain)), branch (break, continue, return), exception (throw, try/catch/finally), with

The statements which can be used in Java and JavaScript are largely the same but since JavaScript is dynamically typed you can use them with any types. See the section about boolean truth for the statements which need an expression to evaluate to false or true. Switch uses the ‘===’ operator to match the cases and has the same fall through pitfall like Java. ‘For in’ iterates over the names of all properties of an object including those which are inherited via the prototype chain. ‘With’ can be used to shorten the access to objects.

with (object) {
  a = b
}

The problem here is you don’t know from looking at the code if a and/or b is a property of object or a global variable. Because of this ambiguity ‘with’ should be avoided

Object creation

Java – new
JavaScript – new or functional creation / module pattern

In Java you just declare your class

public class Person {
  private final String name;
  
  public Person(String name) {
    this.name = name;
  }
  
  public String getName() {
    return this.name;
  }
}

and instantiate it via new.

Person john = new Person("John");

In JavaScript there is no class keyword but you can create objects via ‘{}’ or ‘new’. Let’s take a look at the functional approach first. The so called module pattern supports encapsulation (read: private members).

var person = function(name) {
  var private_name = name;
  return {
    get_name: function() {
      return private_name;
    }
  };
};

Now person holds a reference to a factory method and calling it will create a new person.

var john = person('John');

Another more classical and familiar way is to use ‘new’.

var Person = function(name) {
  this.name = name;
};

Person.prototype.get_name = function() {
  return this.name;
};

var john = new Person('John');

But what happens when we leave out the new?

var john = Person('John'); // bad idea!

Now this is bound to window (the global context) and a name property is defined on window but we can avoid this:

var Person = function(name) {
  if (!(this instanceof Person)) {
    return new Person(name);
  }
  this.name = name;
};

Now you can call Person with or without new and both behave the same. If you don’t want to repeat this for every class you can use the following pattern (adapted from John Resig to make it ES5 strict compatible).

// adapted from makeClass - By John Resig (MIT Licensed) - http://ejohn.org/blog/simple-class-instantiation/
var makeClass = function() {
  var internal = false;
  var create = function(args) {
    if (this instanceof create) {
      if (typeof this.init == "function") {
        this.init.apply(this, internal ? args : arguments);
      }
    } else {
      internal = true;
      return new create(arguments);
    }
  };
  return create;
};

This creates a function which can create classes. You can use it similar to the classical pattern.

var Person = makeClass();
Person.prototype.init = function(name) {
  this.name = name;
};
Person.prototype.get_name = function() {
  return this.name;
};

var john = new Person('John');

But name is now a public member of Person what if we want it to be private? If we take another look at the functional pattern above we can use the same mechanism.

var Person = function(name) {
  if (!(this instanceof Person)) {
    return new Person(name);
  }
  var private_name = name;
  this.get_name = function() {
    return private_name;
  };
  this.set_name = function(new_name) {
    private_name = new_name;
  };
};

Now name is also a private member of the Person class. Using makeClass you can achieve it in the following way.

var Person = makeClass();
Person.prototype.init = function(name) {
  var private_name = name;
  this.get_name = function() {
    return private_name;
  };
};

var john = new Person('John');

Encapsulation

Java – visibility modifiers (public, package, protected, private)
JavaScript – public or private (via closures)

As we have seen in the previous section we can have private variables and also methods via the encapsulation of a closure. All other variables and members are public.

Accessing properties

Java – .
JavaScript – . or []

Besides the dot you can also use an object like a hash.

var a = {b: 1};
a.b = 3;
a['b'] = 5;

Accessing non existing properties

Java – prevented by the compiler
JavaScript – get returns undefined, set creates

In Java accessing a property or method of an object which does not exists is prevented by the compiler. In JavaScript the following compiles and runs fine.

var a = {};
a.b;
a.b = 5;

When you access non existing members of an object you get undefined in return. Setting the non existing property creates it on the object.

Invocation and this

Java – method
JavaScript – method, function, constructor, apply

JavaScript knows four kinds of invocations: method, function, constructor and apply. A function on an object is called method and calling it will bound this to the object.

var john = {
  name: "John",
  get_name: function() {
    return this.name; // => this is bound to john
  }
};
john.get_name(); // => John

But there is a potential pitfall: it doesn’t matter which method you call but how! This problem can be worked around with the apply/call pattern below.

var john = {
  name: "John",
  get_name: function() {
    return this.name; // => this is bound to the global context
  }
};
var fn = john.get_name;
fn(); // => NOT John

A function which is not a property of an object is just a function and this is bound to the global context (in a browser the global context is the window object).

var get_name = function() {
  return this.name; // this is bound to the global context
};
get_name();

Calling a function with ‘new’ constructs a new object and bounds this to it.

var Person = function(name) {
  this.name = name; // => this is bound to john
};
var john = new Person("John");
john.name; // => John

JavaScript is a functional language (some call it even Lisp in C’s clothing) and therefore functions have methods, too. ‘Apply’ and ‘call’ are both methods to call a function with binding ‘this’ explicit.

var john = {
  name: "John"
};
var get_name = function() {
  return this.name; // this is bound to the john
};
get_name.apply(john); // => John
get_name.call(john); // => John

The difference between ‘apply’ and ‘call’ is just how they take their additional parameters: ‘apply’ needs an array whereas ‘call’ takes them explicitly.

var john = {
  name: "John"
};
var set_name = function(name) {
  this.name = name; // this is bound to the john
};
set_name.apply(john, ["Jack"]); // => Jack
set_name.call(john, "John"); // => John

Variable arguments

Java – …
JavaScript – arguments

In Java you can use variable argument lists via ‘…’. In JavaScript you do not need to declare them. All parameters of a function call are available via arguments regardless of what parameters are declared.

var sum = function() {
  var result = 0;
  for (var i = 0; i < arguments.length; i++) {
    result += arguments[i];
  }
  return result;
};
sum(1); // => 1
sum(1, 2); // => 3

Also arguments looks like an array it isn’t one and if you need an array of arguments you can use slice to convert it.

var array = Array().slice.call(arguments);

Inheritance

Java – extends, implements
JavaScript – prototype chain

Java can easily inherit types or implementation via implements or extends. JavaScript has no classes and uses another approach called the prototype chain. If you want to create a new object User which inherits from Person you use the prototype attribute.

var Person = function(name) {
  this.name = name;
};

var User = function(username) {
  Person.call(this, username); // emulating call to super
  this.username = username;
};

User.prototype = new Person();

var john = new User('John');
john.name; // => John
john.username; // => John

If I left something out or got something wrong please leave a comment. Also if you think a topic discussed here should be explored in more depth feel free to comment.

The difference between Test First and Test Driven Development

If you tend to fall into the “one-two-everything”-trap while doing TDD, this blog post might give you some perspective on what really happens and how to avoid the trap by decomposing the problem.

The concept of Test First (“TF”, write a failing test first and make it green by writing exactly enough production code to do so) was always very appealing to me. But I’ve never experienced the guiding effect that is described for Test Driven Development (“TDD”, apply a Test First approach to a problem using baby steps, letting the tests drive the production code). This lead to quite some frustration and scepticism on my side. After a lot of attempts and training sessions with experienced TDD practioners, I concluded that while I grasped Test First and could apply it to everyday tasks, I wouldn’t be able to incorporate TDD into my process toolbox. My biggest grievance was that I couldn’t even tell why TDD failed for me.

The bad news is that TDD still lies outside my normal toolbox. The good news is that I can pinpoint a specific area where I need training in order to learn TDD properly. This blog post is the story about my revelation. I hope that you can gather some ideas for your own progress, implied that you’re no TDD master, too.

A simple training session

In order to learn TDD, I always look for fitting problems to apply it to. While developing a repository difference tracker, the Diffibrillator, there was a neat little task to order the entries of several lists of commits into a single, chronologically ordered list. I delayed the implementation of the needed algorithm for a TDD session in a relaxed environment. My mind began to spawn background processes about possible solutions. When I finally had a chance to start my session, one solution had already crystallized in my imagination:

An elegant solution

Given several input lists of commits, now used as queues, and one result list that is initially empty, repeat the following step until no more commits are pending in any input queue: Compare the head commits of all input queues by their commit date and remove the oldest one, adding it to the result list.
I nick-named this approach the “PEZ algorithm” because each commit list acts like the old PEZ candy dispensers of my childhood, always giving out the topmost sherbet shred when asked for.

A Test First approach

Trying to break the problem down into baby-stepped unit tests, I fell into the “one-two-everything”-trap once again. See for yourself what tests I wrote:

@Test
public void emptyIteratorWhenNoBranchesGiven() throws Exception {
  Iterable<ProjectBranch> noBranches = new EmptyIterable<>();
  Iterable<Commit> commits = CombineCommits.from(noBranches).byCommitDate();
  assertThat(commits, is(emptyIterable()));
}

The first test only prepares the classes’ interface, naming the methods and trying to establish a fluent coding style.

@Test
public void commitsOfBranchIfOnlyOneGiven() throws Exception {
  final Commit firstCommit = commitAt(10L);
  final Commit secondCommit = commitAt(20L);
  final ProjectBranch branch = branchWith(secondCommit, firstCommit);
  Iterable<Commit> commits = CombineCommits.from(branch).byCommitDate();
  assertThat(commits, contains(secondCommit, firstCommit));
}

The second test was the inevitable “simple and dumb” starting point for a journey led by the tests (hopefully). It didn’t lead to any meaningful production code. Obviously, a bigger test scenario was needed:

@Test
public void commitsOfSeveralBranchesInChronologicalOrder() throws Exception {
  final Commit commitA_1 = commitAt(10L);
  final Commit commitB_2 = commitAt(20L);
  final Commit commitA_3 = commitAt(30L);
  final Commit commitA_4 = commitAt(40L);
  final Commit commitB_5 = commitAt(50L);
  final Commit commitA_6 = commitAt(60L);
  final ProjectBranch branchA = branchWith(commitA_6, commitA_4, commitA_3, commitA_1);
  final ProjectBranch branchB = branchWith(commitB_5, commitB_2);
  Iterable<Commit> commits = CombineCommits.from(branchA, branchB).byCommitDate();
  assertThat(commits, contains(commitA_6, commitB_5, commitA_4, commitA_3, commitB_2, commitA_1));
}

Now we are talking! If you give the CombineCommits class two branches with intertwined commit dates, the result will be a chronologically ordered collection. The only problem with this test? It needed the complete 100 lines of algorithm code to be green again. There it is: the “one-two-everything”-trap. The first two tests are merely finger exercises that don’t assert very much. Usually the third test is the last one to be written for a long time, because it requires a lot of work on the production side of code. After this test, the implementation is mostly completed, with 130 lines of production code and a line coverage of nearly 98%. There wasn’t much guidance from the tests, it was more of a “holding back until a test allows for the whole thing to be written”. Emotionally, the tests only hindered me from jotting down the algorithm I already envisioned and when I finally got permission to “show off”, I dived into the production code and only returned when the whole thing was finished. A lot of ego filled in the lines, but I didn’t realize it right away.

But wait, there is a detail left out from the test above that needs to be explicitely specified: If two commmits happen at the same time, there should be a defined behaviour for the combiner. I declare that the order of the input queues is used as a secondary ordering criterium:

@Test
public void decidesForFirstBranchIfCommitsAtSameDate() throws Exception {
  final Commit commitA_1 = commitAt(10L);
  final Commit commitB_2 = commitAt(10L);
  final Commit commitA_3 = commitAt(20L);
  final ProjectBranch branchA = branchWith(commitA_3, commitA_1);
  final ProjectBranch branchB = branchWith(commitB_2);
  Iterable<Commit> commits = CombineCommits.from(branchA, branchB).byCommitDate();
  assertThat(commits, contains(commitA_3, commitA_1, commitB_2));
}

This test didn’t improve the line coverage and was green right from the start, because the implementation already acted as required. There was no guidance in this test, only assurance.

And that was my session: The four unit tests cover the anticipated algorithm completely, but didn’t provide any guidance that I could grasp. I was very disappointed, because the “one-two-everything”-trap is a well-known anti-pattern for my TDD experiences and I still fell right into it.

A second approach using TDD

I decided to remove my code again and pair with my co-worker Jens, who formulated a theory about finding the next test by only changing one facet of the problem for each new test. Sounds interesting? It is! Let’s see where it got us:

@Test
public void noBranchesResultsInEmptyTrail() throws Exception {
  CommitCombiner combiner = new CommitCombiner();
  Iterable<Commit> trail = combiner.getTrail();
  assertThat(trail, is(emptyIterable()));
}

The first test starts as no big surprise, it only sets “the mood”. Notice how we decided to keep the CommitCombiner class simple and plain in its interface as long as the tests don’t get cumbersome.

@Test
public void emptyBranchesResultInEmptyTrail() throws Exception {
  ProjectBranch branchA = branchFor();
  CommitCombiner combiner = new CommitCombiner(branchA);
  assertThat(combiner.getTrail(), is(emptyIterable()));
}

The second test asserts only one thing more than the initial test: If the combiner is given empty commit queues (“branches”) instead of none like in the first test, it still returns an empty result collection (the commit “trail”).

With the single-facet approach, we can only change our tested scenario in one “domain dimension” and only the smallest possible amount of it. So we formulate a test that still uses one branch only, but with one commit in it:

@Test
public void branchWithCommitResultsInEqualTrail() throws Exception {
  Commit commitA1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  CommitCombiner combiner = new CommitCombiner(branchA);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1));
}

With this test, there was the first meaningful appearance of production code. We kept it very simple and trusted our future tests to guide the way to a more complex version.

The next test introduces the central piece of domain knowledge to the production code, just by changing the amount of commits on the only given branch from “one” to “many” (three):

@Test
public void branchWithCommitsAreReturnedInOrder() throws Exception {
  Commit commitA1 = commitAt(10L);
  Commit commitA2 = commitAt(20L);
  Commit commitA3 = commitAt(30L);
  ProjectBranch branchA = branchFor(commitA3, commitA2, commitA1);
  CommitCombiner combiner = new CommitCombiner(branchA);
  assertThat(combiner.getTrail(), Matchers.contains(commitA3, commitA2, commitA1));
}

Notice how this requires the production code to come up with the notion of comparable commit dates that needs to be ordered. We haven’t even introduced a second branch into the scenario yet but are already asserting that the topmost mission critical functionality works: commit ordering.

Now we need to advance to another requirement: The ability to combine branches. But whatever we develop in the future, it can never break the most important aspect of our implementation.

@Test
public void twoBranchesWithOnlyOneCommit() throws Exception {
  Commit commitA1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor();
  CommitCombiner combiner = new CommitCombiner(branchA, branchB);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1));
}

You might say that we knew about this behaviour of the production code before, when we added the test named “branchWithCommitResultsInEqualTrail”, but it really is the assurance that things don’t change just because the amount of branches changes.

Our production code had no need to advance as far as we could already anticipate, so there is the need for another test dealing with multiple branches:

@Test
public void allBranchesAreUsed() throws Exception {
  Commit commitA1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor();
  CommitCombiner combiner = new CommitCombiner(branchB, branchA);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1));
}

Note that the only thing that’s different is the order in which the branches are given to the CommitCombiner. With this simple test, there needs to be some important improvements in the production code. Try it for yourself to see the effect!

Finally, it is time to formulate a test that brings the two facets of our algorithm together. We tested the facets separately for so long now that this test feels like the first “real” test, asserting a “real” use case:

@Test
public void twoBranchesWithOneCommitEach() throws Exception {
  Commit commitA1 = commitAt(10L);
  Commit commitB1 = commitAt(20L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor(commitB1);
  CommitCombiner combiner = new CommitCombiner(branchA, branchB);
  assertThat(combiner.getTrail(), Matchers.contains(commitB1, commitA1));
}

If you compare this “full” test case to the third test case in my first approach, you’ll see that it lacks all the mingled complexity of the first try. The test can be clear and concise in its scenario because it can rely on the assurances of the previous tests. The third test in the first approach couldn’t rely on any meaningful single-faceted “support” test. That’s the main difference! This is my error in the first approach: Trying to cramp more than one new facet in the next test, even putting all required facets in there at once. No wonder that the production code needed “everything” when the test requires it. No wonder there’s no guidance from the tests when I wanted to reach all my goals at once. Decomposing the problem at hand into independent “features” or facets is the most essential step to learn in order to advance from Test First to Test Driven Development. Finding a suitable “dramatic composition” for the tests is another important ability, but it can only be applied after the decomposition is done.

But wait, there is a fourth test in my first approach that needs to be tested here, too:

@Test
public void twoBranchesWithCommitsAtSameTime() throws Exception {
  Commit commitA1 = commitAt(10L);
  Commit commitB1 = commitAt(10L);
  ProjectBranch branchA = branchFor(commitA1);
  ProjectBranch branchB = branchFor(commitB1);
  CommitCombiner combiner = new CommitCombiner(branchA, branchB);
  assertThat(combiner.getTrail(), Matchers.contains(commitA1, commitB1));
}

Thankfully, the implementation already provided this feature. We are done! And in this moment, my ego showed up again: “That implementation is an insult to my developer honour!” I shouted. Keep in mind that I just threw away a beautiful 130-lines piece of algorithm for this alternate implementation:

public class CommitCombiner {
  private final ProjectBranch[] branches;

  public CommitCombiner(ProjectBranch... branches) {
    this.branches = branches;
  }

  public Iterable<Commit> getTrail() {
    final List<Commit> result = new ArrayList<>();
    for (ProjectBranch each : this.branches) {
      CollectionUtil.addAll(result, each.commits());
    }
    return sortedWithBranchOrderPreserved(result);
  }

  private Iterable<Commit> sortedWithBranchOrderPreserved(List<Commit> result) {
    Collections.sort(result, antichronologically());
    return result;
  }

  private <D extends Dated> Comparator<D> antichronologically() {
    return new Comparator<D>() {
      @Override
      public int compare(D o1, D o2) {
        return o2.getDate().compareTo(o1.getDate());
      }
    };
  }
}

The final and complete second implementation, guided to by the tests, is merely six lines of active code with some boiler-plate! Well, what did I expect? TDD doesn’t lead to particularly elegant solutions, it leads to the simplest thing that could possibly work and assures you that it will work in the realm of your specification. There’s no place for the programmer’s ego between these lines and that’s a good thing.

Conclusion

Thank you for reading until here! I’ve learnt an important lesson that day (thank you, Jens!). And being able to pinpoint the main hindrance on my way to fully embracing TDD enabled me to further improve my skills even on my own. It felt like opening an ever-closed door for the first time. I hope you’ve extracted some insights from this write-up, too. Feel free to share them!

TDD myths: the problems

I take a look at some (in my experience) problems/misconceptions with TDD:
100% code coverage is enough, Debugging is not needed, Design for testability, You are faster than without tests

100% code coverage is enough

Code coverage seems to be a bad indicator for the quality of the tests. Take the following code as an example:

public void testEmptySum() {
  assertEquals(0, sum());
}

public void testSumOfMultipleNumbers() {
  assertEquals(5, sum(2, 3));
}

Now take a look at the implementation:

public int sum(int...numbers) {
  if (numbers.length == 0) {
    return 0;
  }
  return 5;
}

Baby steps in TDD could lead you to this implementation. It has 100% code coverage and all tests are green. But the implementation isn’t finished at all. Our experiment where we investigated how much tests communicate the intend of the code showed flaws in metrics like code coverage.

Debugging is not needed

One promise of TDD or tests in general is that you can neglect debugging. Even abandon it. In my experience when a test goes red (especially an integration test) you sometimes need to fire up the debugger. The debugger helps you to step through code and see the actual state of the system at that time. Tests treat code as a black box, an input results in an output. But what happens in between? How much do you want to couple your tests to your actual implementation steps? Do we need the tests to cover this aspect of software development? Maybe something along the lines as shown in Inventing on principle where the computer shows you the immediate steps your code takes could replace debugging but tests alone cannot do it.

Design for testability

A noble goal. But are tests your primary client? No. Other code is. Design for maintainability would be better. You will need to change your code, fix it, introduce new features, etc. Don’t get me wrong: You need tests and you need testability. But how much code do you write specifically for your tests? How much flexibility do you introduce because of your tests? What patterns do you use just because your tests need them? It’s like YAGNI for code exposure for tests. Code specifically written only for tests couples your code to your tests. Only things that need to be coupled should be. Is the choice of the underlying data structure important? Couple it, test it. If it isn’t, don’t expose it, don’t write a getter. Don’t break the information hiding principle if you don’t need to. If you couple your tests too much to your code every little change breaks your tests. This hinders maintenance. The important and difficult design question is: what is important. Test this.

You are faster than without tests

Some TDD practitioners claim that they are faster with TDD than without tests because the bugs and problems in your code will overwhelm you after a certain time. So with a certain level of complexity you are going faster with TDD. But where is this level? In my experience writing code without tests is 3x-4x faster than with TDD. For small applications. There are entire communities where many applications are written without or with only a few tests. But I wouldn’t write a large application without tests but at least my feeling is that in many cases I go much slower. Cases where I feel faster are specification heavy. Like parsing or writing formats, designing an algorithm or implementing a scientific formula. So the call is open on this one. What are your experiences? Do you feel slowed down by TDD?

Communication through Tests – a larger experiment

We evaluated our ability to communicate through tests in a two-day experiment and gathered some interesting results.

For us, automated tests are the hallmark of professional software development. That doesn’t mean that we buy into every testing fad that comes along or consider ourselves testing experts just because we write some tests alongside our code. We put our money where our mouth is and evaluate our abilities in writing effective tests.

One way to measure the effectiveness of tests is to try to “communicate through tests”. One developer/team writes code and tests for a given specification. Another team picks up the tests only and tries to recreate the production code and infer the specification. The only communication between the two teams happens through the tests.

We performed a small experiment with two teams and one day for both phases and blogged about it. The results of this evaluation was that unit tests are a good medium to transport specification details. But we got a hint that problems might be bigger when the code was less arithmetic and more complex. As most of our development tasks are rather complex and driven by business rules instead of clean mathematical algorithms, we wanted to inspect further.

Our larger experiment

So we organized a bigger experiment with a broader scope. Instead of two teams, we had three teams. We ran the phases for eight instead of two hours, essentially increasing the resulting code size by a factor of 3. The assignments weren’t static, but versioned – and the team only knew the rules of the current version. When a team would reach a certain milestone, more rules would be revealed, partly contradicting the previous ruleset. This should emulate changing customer requirements. And to provide the ability to retrospect on the reconstruction phase, we recorded this phase with a screencast software (we used the commercial product Debut Video Capture), capturing both inputs and conversation by using headsets for every developer.

The first part of this experiment happened in late January of 2013, where all teams had one day to produce production and test code. This was a day of loud buzz in our development department. The second part for the reconstruction phase was scheduled for the middle of February 2013. We had to be a bit more quiet this time to increase the audio recording quality, but the developers were humming nonetheless.

Here are some numbers of what was produced in the first session:

Team 1: 400 lines of production code, 530 lines of test code. 8 production classes, 54 tests. Test coverage of 90.6%.
Team 2: 576 lines of production code, 655 lines of test code. 17 production classes, 59 tests. Test coverage of 98.2%.
Team 3: 442 lines of production code, 429 lines of test code. 18 production classes, 37 tests. Test coverage of 97.0%.

The reconstruction phase was finished in less than five hours, partly because we stuck very close to the actual tests with little guesswork. When the tests didn’t enforce a functionality, it wasn’t implemented to reveal the holes in the test coverage. This reduced the amount of production code that had to be written. On the flipside, every team got lost once on the way, loosing the better part of an hour without noticeable progress.

The results

After all the talk about the event itself, let’s have a look at our results of the experiment:

The recording of the reconstruction phase was a huge gain in understanding the detailed problems. We even discussed recording the construction phase too to capture the original design decisions.
Every decision on unclear terms from the original team lead to “blurry” tests that didn’t guide the reconstruction team as good as the “razor-sharp” tests did.
You could definitely tell the TDD tests from the “test first” tests or even the tests written “immediately after”. More on this aspect later, but this was our biggest overall take-away: The quality of the tests in terms of being a specification differed greatly. This wasn’t bound to teams – as soon as a team lost the TDD “drive”, the tests lost guidance power.
Test coverage (in terms of line coverage or conditional coverage) means nothing. You can have 100% test coverage and still suffer from severe plot holes in your tests. Blurry tests tend to increase the coverage, but not the accountability of tests.
In general, we were surprised how little guidance and coverage most tests offered. The assignments included some obvious “testing problems” like dealing with randomness and every team dealt with them deliberately. Still, these were the major pain points during the reconstruction phase. This result puts our first small experiment a bit into perspective. What works well with small code bases might be disproportionally harder to achieve when the code size scales up. So while TDD/tests might work sufficiently easy on a small task, it needs more attention for a larger task.

The biggest problem

When talking about “plot holes” from the tests, let me give you a detailed example of what I mean. The more useless tests suffered from a lack of triangulation. In geometry, triangulation is the process of determining the location of a point by measuring several angles to it from known points. When writing tests, triangulation is the effort to “pinpoint” or specify the implementation with a set of different inputs and required outputs. You specify enough different tests of the same functionality to require it being “real” instead of a dummy implementation. Let’s look at this test:

@Test
public void parsesUserInput() {
  assertThat(new InputParser().parse("1 3 5"), hasItems(1, 3, 5));
}

Well, the test tells us that we need to convert a given string into a bunch of integers. It specifies the necessary class and method for this task, but gives us great freedom in the actual implementation. This makes the test green:

public Iterable<Integer> parse(String input) {
  return Arrays.asList(1, 3, 5);
}

As far as the tests are concerned, this is a concise and correct implementation of the required functionality. And while it is obvious in our example that this will never be sufficient, it oftentimes isn’t so obvious when the problem domain isn’t as familiar as parsing strings to numbers. But to complete my explanation of test triangulation, let’s consider a more elaborate implementation of this test that needs a lot more work on the implementation side (especially when developed in accordance with the Transformation Priority Premise by Uncle Bob and without obvious duplication):

@Test
public void parsesUserInput() {
  assertThat(new InputParser().parse("1 3 5"), hasItems(1, 3, 5));
  assertThat(new InputParser().parse("1 2"), hasItems(1, 2));
  assertThat(new InputParser().parse("1 2 3 4 5"), hasItems(1, 2, 3, 4, 5));
  assertThat(new InputParser().parse("1 4 5 3 2"), hasItems(1, 2, 3, 4, 5));
  assertThat(new InputParser().parse("5 4"), hasItems(4, 5));
  assertThat(new InputParser().parse("5 3"), hasItems(3, 5));
}

Maybe not all assertions are required and maybe they should live in different tests giving more hints in their names, but you get the idea: Making this test green is way “harder” than the initial test. Writing properly triangulated tests is one of the immediate benefits of Test Driven Development (TDD), as for example outlined nicely by Ray Sinnema on his blog entry about test-driving a code kata.
Our tests that were written “after the fact” often lacked the proper amount of triangulation, making it easier to “fake it” in the reconstruction phase. In a real project setting, these tests would allow for too much implementation deviation to act as a specification. They act more as usage examples and happy path “smoke” tests.

Our benefits

While this experiment doesn’t fulfill rigid academic requirements on gathering data, it already paid off greatly for us. We’ve examined our ability to express our implementations through tests and gathered insight on our real capabilities to use test-driven methodologies. Being able to judge relatively objectively on the quality of your own tests (by watching the reconstruction phase’s screencast) was very helpful. We now know better what skills to improve and what to focus on during training.

Where to go from here?

We plan to repeat this experiment with interested participants as a spare-time event later this year. For now and ourselves, we have gathered enough impressions to act on them. If you are interested in more details, drop us a note. We could publish only the tests (for reconstruction), the complete code or even the screencasts (albeit they are somewhat long-running). Our participants could elaborate their impressions in the comment section, if you ask them.
We are very interested in your results when performing similar events, like Tomasz Borek did this month in Krakow, Poland. We found his blog entry about the event to be very interesting. We definitely lacked the surprise element for the teams during the event.

TDD myths

Some TDD myths and what is true about them.

TDD, Test first or test immediately after are all the same

No. All methods result in having tests in the end. But especially in the TDD case your mind set is completely different. First the tests drive the design of your code. You construct your system piece by piece. Unit test for unit test. All code you write must have a test first and you use the tests to describe and reason about the external interface of your units. In TDD the tests represent the future clients using your code. In practice this leads to small(er) units.

In TDD the tests’ (main) objective is to prevent regression

No. Tests help immensely when you break the same code twice. But even more so tests help to structure your code and make it maintainable. When using TDD you tend to reduce your code and its flexibility because you need to write a test for every piece of functionality first. So over designing or implementing things you don’t need (breaking YAGNI or KISS) bites you doubly: in the code and in the tests. Also wrong design decisions like choosing an inappropriate data structure or representation hits you twice as hard. TDD emphasizes bad design decisions.

Test code is the same as production code

No. Test code should adhere too a similar quality level like production code. But you won’t write tests for your tests. Also conditionals and loops are a very bad idea in tests and should be avoided. Take the following example:

public void testSomething() {
  for (MyEnum value : values()) {
    assertEquals(expected, do(value))
  }
}

If you forgot an enum value the tests just passes. Even if you have no values in your enum it passes still. Conditions have the same problem: you introduce another path through your test which can be avoided or never taken. You could secure the other path through an assert but in some cases this is a hint that you broke another principle: single responsibility of tests.

DRY is harmful in tests

No. DRY (don’t repeat yourself) aims to reduce or eliminate duplication in logic. But often DRY is understood as removing code duplication. This is not the same! Code duplication can be essential in tests. You need all of the essential information in the test. This code should not be extracted or abstracted elsewhere. These code lines which may seem similar are not coupled logically. When you change one test, the other test is not affected.

TDD is hard

No and yes. For me learning TDD is like learning a new language. It certainly needs time. But if you do it often and repeatedly you learn more every time you use it. It’s a way of reasoning about a system, a way of thinking, a paradigm. When I started with TDD I thought it was impossible or unreasonable to use in cases other than where strong specs exist like parsing a format. But over time I value the driving part of TDD more and more. You can get into a TDD flow. TDD gives you a very good feeling of security when you refactor. It forces you beforehand to think about your intended use for your code. Which is good. It changes my way of seeing my code, one step at time. Some things are still hard: acceptance tests are unreasonably expensive. Just testing one thing needs discipline. Not jumping ahead of the tests and implementing too much code also. Finding the next unit of testing can be difficult, getting stuck can be frustrating. Just like learning a new language I think it is worth it.

	Anonymous on Cache configuration with WildF…
	Miq on Nested queries like N+1 in pra…
	mariuselvert on Creating functors with lambda…
	Nested queries like… on Common SQL Performance Gotchas…
	Nested queries like… on Make your users happy by not c…