Before you start getting mad at me, first a disclaimer: I really think you should adhere to the DRY (don’t repeat yourself) principle. But in my opinion the term “code duplication” is too weak and blurry and should be rephrased.

Let me start with a real-life story from a few weeks ago that led to a fruitful discussion with some colleagues and to my claims.

The story

We are developing a system using C#/.NET Core for managing network devices like computers, printers, IP cameras and so on in a complex network infrastructure. My colleague was working on a feature to sync these network devices with another system. His idea was to populate our carefully modelled domain entities using the JSON data from the other system and compare them with the entities in our system. As this was far from trivial, we decided to do a pair-programming session.

We wrote unit tests and fixed one problem after another, refactored the code that was getting messy and happily chugged along. In the process it became more and more apparent that the type system was not helping us and that we required quite a bit of special handling, like custom IEqualityComparers and the like.

The problem was that certain concepts like AddressPools that we had in our domain model were missing in the other system. Our domain handles subnets whereas the other system talks about ranges. In our system the entities are persistent and have a database id while the other system does not expose ids. And so on…

By using the same domain model for the other system we introduced friction, gave up benefits of C#’s type system and made the code harder to understand: there were several occasions where methods would take two IEnumerables of NetworkedDevices or Subnets, and you needed to pay attention to which one came from our system and which from the other.
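What separate types would have bought us can be sketched in a few lines. This is only an illustration, written in Java like the other listings in this post rather than in the C# of the actual project. The types LocalDevice and RemoteDevice and the method missingLocally are hypothetical names and not our real code; records require Java 16 or later.

```java
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical type for devices persisted in our own system: carries a database id.
record LocalDevice(long id, String hostname) {}

// Hypothetical type for devices reported by the other system: no id is exposed.
record RemoteDevice(String hostname) {}

final class DeviceSync {
    private DeviceSync() {}

    // Returns the remote devices whose hostname is not known locally.
    // Because the element types differ, accidentally swapping the two
    // arguments is a compile error instead of a silent bug.
    static List<RemoteDevice> missingLocally(List<LocalDevice> local,
                                             List<RemoteDevice> remote) {
        Set<String> knownHostnames = local.stream()
                .map(LocalDevice::hostname)
                .collect(Collectors.toSet());
        return remote.stream()
                .filter(device -> !knownHostnames.contains(device.hostname()))
                .collect(Collectors.toList());
    }
}
```

With a single shared device type, both parameters would have the same type and nothing but naming discipline would keep them apart.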

The whole situation reminded me of a blog post I read quite a while ago:

https://www.sandimetz.com/blog/2016/1/20/the-wrong-abstraction

Obviously, we were using the wrong abstraction for the entities we obtained from the other system. We found ourselves somewhere around point 6 in Sandi’s sequence of events. In our effort to reuse existing code and avoid code duplication we went down a costly and unpleasant path.

Illustration by example

If code duplication is on the method level we may often simply extract and delegate like Uncle Bob demonstrates in this article. In our story that would not have been possible. Consider the following model of Price and Discount in an e-commerce system:

public class Price {
    public final BigDecimal amount;
    public final Currency currency;

    public Price(BigDecimal amount, Currency currency) {
        this.amount = amount;
        this.currency = currency;
    }

    // more methods like add(Price)
}

public class Discount {
    public final BigDecimal amount;
    public final Currency currency;

    public Discount(BigDecimal amount, Currency currency) {
        this.amount = amount;
        this.currency = currency;
    }

    // more methods like add(Discount)
}

The initial domain entities for price and discount may be implemented in exactly the same way, but they are completely different abstractions. Depending on your domain it may or may not be ok to add two discounts. Discounts could be modelled in a relative fashion like “30 % off” using a base price, and so on. Coupling them early on by using one entity for different purposes in order to avoid code duplication would be a costly error, as you will likely need to disentangle them at some later point.
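To make the divergence concrete, here is one way the two classes might look a few iterations later. It is a sketch under assumptions, not a recommended pricing model: the class PercentageDiscount, its applyTo method, the currency check in add and the rounding to the currency's default fraction digits are all hypothetical choices made for illustration.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.Currency;

final class Price {
    final BigDecimal amount;
    final Currency currency;

    Price(BigDecimal amount, Currency currency) {
        this.amount = amount;
        this.currency = currency;
    }

    // Adding two prices makes sense as long as the currencies match.
    Price add(Price other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("currency mismatch");
        }
        return new Price(amount.add(other.amount), currency);
    }
}

// Hypothetical relative discount: it no longer shares a single field with Price.
final class PercentageDiscount {
    final BigDecimal percent; // 30 means "30 % off"

    PercentageDiscount(BigDecimal percent) {
        if (percent.signum() < 0 || percent.compareTo(BigDecimal.valueOf(100)) > 0) {
            throw new IllegalArgumentException("percent must be between 0 and 100");
        }
        this.percent = percent;
    }

    // There is deliberately no add(PercentageDiscount): whether discounts
    // stack is a domain decision, so the only operation is applying one to a price.
    Price applyTo(Price base) {
        // Assumption: round half up to the currency's usual number of decimals.
        int scale = Math.max(0, base.currency.getDefaultFractionDigits());
        BigDecimal remainingPercent = BigDecimal.valueOf(100).subtract(percent);
        BigDecimal reduced = base.amount.multiply(remainingPercent)
                .divide(BigDecimal.valueOf(100), scale, RoundingMode.HALF_UP);
        return new Price(reduced, base.currency);
    }
}
```

Had both concepts been squeezed into one shared amount-and-currency class early on, getting to this shape would have meant untangling every caller first.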

Another example could be the initial model of a name. In your system persons, countries and a lot of other things could have a name entity attached, and these may look identical at first. As you flesh out your domain it becomes apparent that the names are really different things: person names should not be internationalized and sometimes obey certain rules. Country names, in contrast, may very well be translated.
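The same divergence for names could look roughly like this. Again this is only a sketch: PersonName, CountryName, the "not blank" rule and the locale-keyed translation map are hypothetical stand-ins for whatever your domain actually requires.

```java
import java.util.Locale;
import java.util.Map;

// Hypothetical person name: never translated, but validated on creation.
final class PersonName {
    private final String value;

    PersonName(String value) {
        // Assumed rule for illustration: a person name must not be blank.
        if (value == null || value.isBlank()) {
            throw new IllegalArgumentException("person name must not be blank");
        }
        this.value = value.trim();
    }

    String display() {
        return value;
    }
}

// Hypothetical country name: has translations and a fallback.
final class CountryName {
    private final String fallback;
    private final Map<Locale, String> translations;

    CountryName(String fallback, Map<Locale, String> translations) {
        this.fallback = fallback;
        this.translations = Map.copyOf(translations); // defensive, immutable copy
    }

    // Returns the translation for the given locale, or the fallback name.
    String display(Locale locale) {
        return translations.getOrDefault(locale, fallback);
    }
}
```

Both classes started out as "a thing wrapping a string", yet their display methods already have different signatures.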

Modified code duplication claim

Duplicated code is the root of all evil in software design.

— Robert C. Martin

I would like to reduce the temptation to eliminate code duplication across different abstractions by making Uncle Bob’s well-known claim a bit more precise:

Duplicated code for the same abstraction is the root of all evil in software design.

If you introduce coupling of independent concepts by eliminating code duplication you open up a new possibility for errors and maintenance drag. And these new problems tend to be harder to spot and to resolve than real code duplication.

Duplication allows code to evolve independently. I think it is important to add these two concepts to your thinking.

2 thoughts on “Code duplication is not always evil”

Dan says:

April 1, 2022 at 1:48 pm

Thanks for sharing your experience of a scenario where trying to reuse code caused problems. However, I don’t think your example really leads to the conclusion that “not all code duplication is evil”.

It depends on how you define “duplicate code”. If two different abstractions have code that looks the same, is it really duplicate? I would argue the code in the scenario you describe is not really duplicate because it represents two different things that happen to have the same attributes, i.e. two different abstractions, as you point out.

Therefore, while your observation is correct, I would argue that your conclusion that some duplicate code is acceptable isn’t correct given a sufficiently qualified definition of ‘duplicate code’. I’d argue “code that looks similar” isn’t sufficient. A possibly more sufficient definition might be “code that describes the same behaviour or entity”. Your scenario falls outside of this definition as it involves code that describes different entities.

1. Miq says:
  
  April 1, 2022 at 2:05 pm
  
  Many thanks for sharing your thoughts about the topic and I see your point.
  
  Nevertheless, I have practical problems with altering the definition of duplicate code:
  
  * Our code analysis tools only support the insufficient definition of duplicate code
  * Millions of developers out in the wild know and share that insufficient definition
  
  I think it remains important to have the simple notion of “duplicate code” to have something to elaborate on as I tried with my article and you with your thoughtful comment.
  
  Thanks again!
