Equality is a subtle and thorny business, in programming as well as in pure mathematics, physics and philosophy. Probably every software developer has at some point been annoyed by the unexpected behaviour of some ‘equals’ method or of the corresponding operators and assertions. There are lots of questions whose answers depend on context, and answering them for a particular context can cost some pain and time. Here are a few examples:
- What about objects that come with database-ids? Should they be equal for the objects to be equal?
- Are dates with time zones equal if they represent the same instant but have a different time zone?
- What about numbers represented by functions that compute digits up to a given precision?
This post is about applying an idea of Leibniz that I like to the problem of finding good answers to the questions above. It is called “Leibniz’ law” and can be phrased as a definition or characterization of equality:
Two objects are equal if and only if they agree in all properties.
If you are not familiar with the phrase “if and only if”: it comes from mathematics and is shorthand for saying that two things are true:
- If two objects are equal, then they agree in all properties.
- If two objects agree in all properties, then they are equal.
Leibniz’ law is sometimes stated using mathematical symbols, but this would be beside the point of this post: what those properties are will not be defined in a formal mathematical way. If I am in doubt about equality while programming, I am concerned with the properties relevant to the problem I want to solve. For example, in almost all circumstances I can imagine, a relevant property of a list would be its length, but not the place in the computer’s memory where it is stored.
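To make the list example concrete, here is a small sketch in Python. The language is my choice for illustration only; the argument does not depend on it. The `==` operator compares contents, while `is` compares object identity, which roughly corresponds to the place in memory.

```python
# Two separately built lists with the same elements.
a = [1, 2, 3]
b = [1, 2, 3]

# They agree in the properties that usually matter: length and contents.
same_length = len(a) == len(b)
same_contents = a == b

# They are nevertheless two distinct objects, stored in different places.
same_object = a is b

print(same_length, same_contents, same_object)  # True True False
```

If identity counted as a relevant property, no two separately created lists could ever be equal, which is rarely what a problem calls for.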
But what are relevant properties in general? For me, such a property is the result of running some piece of meaningful code. And what counts as meaningful code depends on your judgement of how the object in question should be used. So in total, this boils down to the following:
Two instances of a type are equal if and only if they yield the same results in any meaningful piece of code.
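One way to play with this definition is to write the “meaningful code” down as a list of functions and compare their results. The following is only a sketch under that assumption: the helper `agree_on` and the two lists of observations are hypothetical names invented for this illustration, and a finite list of functions can of course only approximate “any meaningful piece of code”. Note also that the helper itself relies on `==` to compare the results.

```python
from typing import Any, Callable, Iterable


def agree_on(observations: Iterable[Callable[[Any], Any]], x: Any, y: Any) -> bool:
    """Return True if x and y yield the same result for every observation."""
    return all(observe(x) == observe(y) for observe in observations)


# Hypothetical use case 1: lists used as ordered sequences.
sequence_observations = [len, tuple]

# Hypothetical use case 2: lists used as unordered collections,
# where only the sorted contents matter.
bag_observations = [len, sorted]

a = [1, 2, 3]
b = [1, 2, 3]
c = [3, 2, 1]

print(agree_on(sequence_observations, a, b))  # True
print(agree_on(sequence_observations, a, c))  # False
print(agree_on(bag_observations, a, c))       # True
```

The same two lists `a` and `c` come out as different or as equal depending on which use case is declared meaningful.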
Has this gotten us anywhere? My answer is yes, since the question about equality has been reduced to a question about the use cases of a type, which might be a starting point for defining a new type anyway.
Turtles all the way down
Please take a moment to note what a sneaky beast equality can be: Above I explained equality by using equality – right where I said “same results”. It is really hard to make statements about anything at all without using some notion of equality in some way. Even in programming, where you can freely define when two objects are equal, you can very well forget that you are using a system, namely your programming language, which usually already comes with an intricate notion of equality defined on the syntax you are using to define your notion of equality…
On a more practical note, this means that messed-up notions of equality usually propagate when you define new kinds of objects from known ones.
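A well-known example of a surprising notion of equality is the floating-point value NaN, which is not equal to itself. The sketch below, again in Python, shows how this carries over to a type built on top of floats. The class `Measurement` is a hypothetical example; its `__eq__` is generated by `dataclass` and compares the fields.

```python
from dataclasses import dataclass


@dataclass
class Measurement:
    # Hypothetical type wrapping a single float.
    value: float


nan_a = float("nan")
nan_b = float("nan")

# NaN compares unequal to NaN.
print(nan_a == nan_b)  # False

# The generated equality on Measurement inherits this behaviour.
m1 = Measurement(nan_a)
m2 = Measurement(nan_b)
print(m1 == m2)  # False

# For ordinary values everything looks fine.
print(Measurement(1.5) == Measurement(1.5))  # True
```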
Relation to Liskov’s Principle
With our above definition, we are very close to an informal interpretation of Liskov’s Substitution Principle, which we can rephrase as:
In all meaningful code for a type, an instance of a subtype has to behave the same way.
For comparison, the message of this post stated in the same tongue:
In all meaningful code for a type, two equal instances of the type should behave the same way.
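This can be tried out on the time zone question from the beginning of the post. The sketch below uses Python's `datetime` module; the concrete date and the fixed offset of one hour are arbitrary choices for illustration. Python treats two time-zone-aware datetimes as equal when they denote the same instant, so whether that equality is a good one depends on whether the local hour counts as meaningful code.

```python
from datetime import datetime, timedelta, timezone

utc = timezone.utc
plus_one = timezone(timedelta(hours=1))  # fixed offset +01:00, chosen for illustration

noon_utc = datetime(2024, 1, 1, 12, 0, tzinfo=utc)
one_pm_plus_one = datetime(2024, 1, 1, 13, 0, tzinfo=plus_one)

# Both values denote the same instant, and Python considers them equal.
print(noon_utc == one_pm_plus_one)  # True

# Code that only cares about the instant cannot tell them apart.
print(noon_utc.timestamp() == one_pm_plus_one.timestamp())  # True

# Code that looks at the local hour can.
print(noon_utc.hour == one_pm_plus_one.hour)  # False
```

If reading the local hour is part of the intended use, these two equal instances do not behave the same way, and a stricter notion of equality would fit better.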