In object-oriented programming languages like Java, the compiler will improve its helpfulness if the application provides a rich type system or strong domain model. There is a whole field of study for type systems, called type theory, that is fascinating and helpful, but does not provide easy rules to follow for beginning software developers. This blog entry proposes a simple set of rules for a specific part of type systems (associations among types) that can be applied to a domain model as a rule of thumb. The resulting model will empower the compiler and the code completion of the IDE to help the developer with writing correct code.
Data knows data
Even the most basic domain models separate the data in multiple entities (often classes). For example, an employee class has an internal id, but knows about a person class and a salary class that are associated with this employee. This “knowing about” is modeled as a reference to a person object and a salary object. In this case, the reference is probably of the type “one”: The employee object knows about one person object and one salary object. This is the usual way to structure data.
If you learn about the UML notation of data models, you’ll see that associations (aka references between objects) are given great emphasis. There are several different kinds of associations that can be customized by multiplicities and such. It seems that knowing other data is a complex issue for types. It doesn’t have to be this way. Here are four ways of knowing other data that are sufficient for nearly every use case: Zero, Maybe, One and Many.
Four basic types of association
- Zero: Knowing zero elements of something different is the usual default case: Your employee object probably doesn’t need to know about the payroll object of the company and therefore has no association to it. This means that there is no member variable of the type Payroll in the class Employee. No developer ever modeled a “zero” association by declaring a member variable and setting it to null. This would be ridiculous. We just omit the member variable and are done. Knowing zero elements of something is easy.
- One: Yes, I’ve omitted Maybe at the moment. I’ll come back to it. Knowing one element of something is also not hard: You declare a member variable of the type, give it a good name (that’s the hard part!) and ensure that every instance of your class (every object) has a valid reference to an object of something’s type. If you call methods on this reference, you call methods on the object you know. As long as you live, the other object cannot disappear. Knowing one element of something is a long-lasting relationship.
- Maybe: Sometimes, you want to know an element of something that isn’t there yet or you knew an element of something once, but it is gone. You know “maybe one” element of something. These associations are typically programmed in a cumbersome way by many developers. Instead of embedding the “maybe” aspect in the type system and giving the compiler a chance to help, it is burdened solely onto the developer’s shoulders by implementing the “maybe” like a “one” with the added possibility of a null reference if the element isn’t there. A direct result of this approach are null-checks in the code or NullPointerExceptions at places without such checks. One possibility to elevate the “maybeness” into the type system is to implement the association with a Maybe or Optional type. Instead of referencing a Salary directly that might be null if an employee isn’t salaried anymore, the Employee class references an Optional<Salary> object. This object might “contain” a salary or it might not. With a few adjustments to the conditional flow of the code, this distinction between “something is there” and “something is not there” doesn’t matter anymore. If the code is free from implicit Optional types (references that can be null), a whole category of bugs disappears and the code is freed from manually programmed type system checks. Probably knowing one element is the type of assocation that requires some thought and is often done on the wrong level.
- Many: As soon as you want to know more than one element of something, you fall into the “many” category. Many-associations are not so easy to handle, because there are so many possibilities to express them. The basic types are arrays or lists. My recommendation is to use lists whenever feasible and only resort to arrays if it is necessary, because arrays are fixed-length and have the same problem of maybe-null-references: An array index might have been written yet or not. If you refrain from storing null references into lists, they express their filling level a lot clearer than arrays. And given advanced features like iterators, there isn’t even a need to ask for the filling level. An interesting observation is that the list-based many-association can also serve as a zero-, maybe- or one-association. It is possible to replace all other types of association with lists. You probably won’t want to do this, because with the maximization of multiplicity flexibility comes more complexity and reduced readability of the code. You should strive to minimize complexity. Only add many-associations if you really need them. Even just replacing a “maybe” (Optional) with a “many” (List) is a source of much unwanted code and uncertainty.
Advanced types of association
Of course, there are many more types of association that you’ll eventually need. A good example is the qualified association, often implemented by a Map/Dictionary that translates from the qualifier type to the qualified type. But they are rare in comparison to the four basic types.
If you get your basic associations right, your domain model will help your compiler and IDE to support and guide you. This is an upfront investment that pays off manyfold over the course of the project and eliminates the burden of attention to detail when it comes to accidental complexity like null pointers. Your project’s domain probably doesn’t contain null pointers, but the concepts of knowing zero, maybe, one and many.