Unit-Testing Deep-Equality in C#

In the suite of redux-style applications we are building in C#, we are making extensive use of value-types, which implies that a value compares as equal exactly if all of its contents are equal also known as “deep equality”, as opposed to “reference equality” or “shallow equality”. Both of those imply deep equality, but the other way around is not true. The same object is of course equal to itself, not matter how deep you look. And an object that references the same data as another object also has equal content. But a simple object that contains different lists with equal content will be unequal under shallow comparison, but equal under deep comparison.

Though init-only records already provide a per-member comparison as Equals be default, this fails for collection types such as ImmutableList<> that, against all intuition but in accordance to , only provide reference-equality. For us, this means that we have to override Equals for any value type that contains a collection. And this is were the trouble starts. Once Equals is overridden, it’s extremely easy to forget to also adapt Equals when adding a new property. Since our redux-style machinery relies on a proper “unequal”, this would manifest in the application as a sporadically missing UI update.

So we devised a testing strategy for those types, using a little bit of reflection:

  1. Create a sample instance of the value type with no member retaining its default value
  2. Test, by going over all properties and comparing to the same property in a default instance, if indeed all members in the sample are non-default
  3. For each property, run Equals the sample instance to a modified sample instance with that property set to the value from a default instance.

If step 2 fails, it means there’s a member that’s still at its default value in the sample instance, e.g. the test wasn’t updated after a new property was added. If step 3 fails, the sample was updated, but the new property is not considered in Equals – and it can even tell which property is missing.

The same problems of course arise with GetHashCode, but are usually less severe. Forgetting to add a property just makes collisions more likely. It can be tested much in the same way, but can potentially lead to false positives: collisions can occur even if all properties are correctly considered in the function. In that case, however, the sample can usually be altered to remove the collision – and it is really unlikely. In fact, we never had a false positive.

Thinking in immutability

The way I learned programming is dictated by objects and states. Besides advantages and on going efforts in the industry I couldn’t help but thinking: immutability is nice. I can use it in some cases and keep it quietly stored in the corner. But it didn’t remain silent.

The way I learned programming is dictated by objects and states. According to my thinking data is packed into objects which are later modified to reflect the changes over time. State and modification are a central modelling technique. For me programming and OOP in particular resolved around this common theme. Mutating objects pervade my thinking even beyond the code into the database and even the architecture of the whole system.
Besides advantages and on going efforts in the industry I couldn’t help but thinking: immutability is nice. I can use it in some cases and keep it quietly stored in the corner.
But it didn’t remain silent.
So I asked myself: How do you construct programs that build upon immutability? How do you (mostly) avoid mutable objects? How do you think in immutability?
The first step was to unlearn. No updates. No modifications. Read, create, copy. That’s about it. No more CRUD only CR. No more SQL updates, only inserts.

Events and logs

To illustrate I use a simple example. Creating, moving, translating and deleting a point. In the traditional OO way it looks like this:

Point p = new Point(40, 30);
p.translateXBy(5);
p.moveTo(10, 20);
p.delete();

Or using SQL might be something like this (omitting primary keys and where clauses here):

insert into points (x, y) values (40, 30)
update points p set p.x = p.x + 5
update points p set p.x = 10, p.y = 20
delete from points

In our memory (or database if we use one) every line updates our point:

Point p = new Point(40, 30); // p = {x: 40, y: 30}
p.translateXBy(5); // p = {x: 45, y: 30}
p.moveTo(10, 20); // p = {x: 10, y: 20}
p.delete(); // p = ?

But what if we do not store the results of the operations but the operations themselves? The events.
Imagine your state changes as a series of events. Just imagine.

new PointCreated(40, 30); // pointEvents = [{created[x: 40, y:30]}]
new PointTranslatedXBy(5); // pointEvents = [{created[x: 40, y:30]}, {translated[x: 5]}]
new PointMovedTo(10, 20); // pointEvents = [{created[x: 40, y:30]}, {translated[x: 5]}, {moved:[x: 10, y:20]}]
new PointDeleted(); // pointEvents = [{created[x: 40, y:30]}, {translated[x: 5]}, {moved:[x: 10, y:20]}, {deleted}]

Even in the database we would just use inserts, no more updates and no more deletes. The events are stored in a log (ironically the database does this the same way). A log is a fully ordered, append only queue. Once we use and store events we have some extras besides immutability: an audit trail, an undo stack, recovering, …
We could externalize the event stream in a message queue and could monitor it, replay it to reproduce bugs, distribute it. The possibilities are endless.

But. That’s all nice and fine. I have one more question: what’s the current state? A user should see the current state and other parts of the system also (not mentioning that I – coming from a mutable state kind of thinking – would also feel better seeing it).

So what’s the current state?

All events applied in order.

OK. But isn’t this expensive doing this all the time?

Yes!

Here another concept from databases helps us: materialized views. We can easily translate in our mind between the new immutable event driven way and the old in place update way. It is just the same data in different representations (if we only are interested in the current state). If we store the current state as a materialized view (or cache) besides the event log we can have both.
Every part of the program which needs the current state gets an immutable copy of it. If this part needs to know when something changes, it can observe the events and act accordingly. This way mutability is pushed to the borders, to the parts where the current state is shown (like the UI layer).