Renaming is hard work

Imagine you’ve developed a software solution for one of your customers, for example a custom CRM. Everything in this system revolves around the concept of a customer as the key entity. Of course, this is reflected in your code as well. The code is crowded with variable and class names like customerThis and CustomerThat. But now your customer wants you to change the term customer to client.

The minimal and lazy solution would be to change only those occurrences of the word, which are visible to the users. This includes:

  • Texts displayed in the user interface. If your application is internationalized or at least prepared for internationalization, they are usually neatly contained in translation files and can be easily updated.
  • User documentation like manuals. Don’t forget to update the screenshots.

However, if you do not want to let the code diverge from the language of the domain, you have to rename and update a lot more:

  • identifiers in the code (types, namespaces/packages, variables, constants)
  • source files and directories
  • code comments
  • log output strings
  • internationalization keys
  • text protocol strings (e.g. JSON keys) and HTTP API routes/parameters. Of course you can only do that if you are allowed to break the protocol! Don’t accidentaly break your API or formats, e.g. if JSON keys or XML tags are generated via reflection from code identifiers.
  • IDs of HTML tags in views, CSS class names and selector strings in referencing Javascript code
  • internal documentation (e.g. Wiki)

IDE support for renaming can help a lot. Especially the renaming of identifiers is reliable in statically typed languages. However, local variables, function parameters, and fields usually have to be renamed separately. In dynamically typed languages automated renaming of identifiers is often guesswork. Here you must rely on a good test coverage. Even in statically typed languages identifiers can be referenced via strings if reflection is used.

If you use a grep-like tool or the search feature of your IDE or editor to assist you, keep in mind that composite terms can occur in many different forms like FooBar, fooBar, foo-bar, foo_bar, foo.bar, foo bar or foobar. Don’t forget about irregular plural forms, e.g. technology – pl. technologies. Also think of abbreviations, otherwise a

for (Option opt : options)

might end up as

for (Alternative opt : alternatives)

But this is not the end of the story. A lot of renamings require migration scripts or update scripts to be executed at the time of the next update or deployment:

  • database tables and columns
  • some ORMs like Hibernate store class names in the database for the “table-per-hierarchy” mapping
  • string-based enum values in database entries
  • configuration keys and values
  • keys and values in data formats (JSON, XML, …) of stored data. You probably have to provide backward-compatibility for the old formats if the data files are not stored all in one place and can’t be converted in one go
  • names of data folders

If you don’t rename everything all at once, but decide to use the new term only in newly written code and to leave the old term in the existing code until it eventually gets replaced, you will probably end up with a long transition phase and lots of confusion for new project members who don’t know about the history of the project.

Software development is code organization

The biggest problem in developing and maintaining software is understanding code. Software developers should get good training in crafting code which can be understood. To make sense of the mess we need to organize it.

In 2000 Edsger Dijkstra wrote about our problems organizing and designing software:

I would therefore like to posit that computing’s central challenge, “How not to make a mess of it”, has not been met. On the contrary, most of our systems are much more complicated than can be considered healthy, and are too messy and chaotic to be used in comfort and confidence.

Our code bases get so big and complicated today that we cannot comprehend them all at once. Back in the days of UNIX technical constraints led to smaller code. But the computer is not the limiting factor anymore. We are. Our mind cannot comprehend what we create. Brian Kernighan wrote:

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?

Writing code that we (or other developers) can understand is crucial. But why do we fail?

Divide and lose

Usually the first argument when tackling code is to decouple it. Make it clean. Use clean code principles like DRY, SOLID, KISS, YAGNI and what other acronyms you know. These really help to decouple. But they are missing the point. They are the how not the why.
Take a look at your codebase and tell me where are the classes which constitute a subdomain or a specific feature? In which project or part do they live?
Normally you cannot. We only know how to divide code by technical aspects. But features and changes often come from the domain, not from the technology.
How can we understand our creations when we cannot understand its structure? Its architecture? How can we understand something we do not see.

But it does work

The next argument is not much better. Our code might work now. But what if a bug is found or a new feature is about to be implemented? Do you understand the code and its structure? Even weeks, months or years later? Working code is good but you can only change code reliably that you understand.

KISS

Write simple code. Write simple and small methods. Write cohesive classes. The dream of components. But the whole is more than the sum of its parts. You can write simple classes but the communication and threading issues between them can be very complex. Even if the interfaces are sound and simple. Understanding a simple class can be easy in isolation. But understanding a system of simple classes can be difficult and complex. Things are complex. Domains are complex. We cannot ignore that.

Code as an interface

When writing code we have to take the reader and the domain into account. Treat code as an interface. An interface to the system and the domain. It is an opinionated view of the world. The computer does not care about the code we use. Just like the printer who prints our favorite book. But the reader does.
This isn’t just nice thinking, understanding code is key to successfully crafting and changing software.