Computing gets fuzzy again (AI impressions, part 1 of 5)

This is the first part of the series “Impressions of Our Current AI Usage”, as outlined by the introduction article.

In the early days of computing, the mechanism that actually worked on the data was often an analog technical device that had a certain kind of fuzziness to it. Think about paper tape with punched holes as long-term storage: If the paper feeder was not aligned with the distance between the holes, there might be spurious variations in the code. Or, a real possibility from my own childhood: You could store digital data on music tape, an inherently analog storage medium. Whether the loading process succeeded depended on a mixture of patience, delicate handling, room temperature and luck. Most computing devices had specific analog/digital conversion gateways for the periphery, for the display (a very mechanical cathode ray monitor) and even for their own calculation units. The foundation we built our digital world on was influenced by sunshine, moisture, electrical isolation and lots of other factors that could influence the results. I remember a story about the early mainframe computers where a specific bug only appeared if somebody stepped on the physical floor tile where the cables ran beneath. The pressure change altered the physical properties of the cables which resulted in transmission errors.
Over the years, the physical aspects of computing slowly went away or at least faded into the background. We no longer joked about “cosmic ray errors” because the computing substrate was reliable enough to produce the same result regardless of environmental influences. The world got repeatable and therefore, predictable. We got comfortable with machines that were dumb, but reliable. If they had learnt a functionality, they could repeat it virtually forever, without the slightest variation. We had the precision of a nanosecond clockwork and the determinism of a written story that plays out the same every time it is read.

In the early 1990s, there was the first attempt to soften this black-or-white logic fabric up again. The term “fuzzy logic” was all the hype for a few years. Products like cameras, coffee machines, toasters and even water boilers were marketed as “enhanced by fuzzy logic”. How exactly the coffee got better by minuscule variations in the production process was up to your imagination. The core belief of fuzzy logic was that if we express a formula or algorithm by categorized terms instead of numbers, we could bridge the gap between “gut feeling” and digital mathematics.
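To make the idea of “categorized terms instead of numbers” a bit more concrete, here is a minimal sketch in Python. The temperature categories, their boundaries and the function names are my own assumptions for illustration and are not taken from any real product.

```python
def triangle(x: float, left: float, peak: float, right: float) -> float:
    """Degree (0.0 to 1.0) to which x belongs to a triangular category."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


# Hypothetical categories for water temperature in degrees Celsius,
# each given as (left edge, peak, right edge).
CATEGORIES = {
    "lukewarm": (20.0, 40.0, 60.0),
    "hot": (40.0, 70.0, 100.0),
}


def describe(temperature: float) -> dict[str, float]:
    """Map a number to its degree of membership in each named category."""
    return {
        name: triangle(temperature, *shape)
        for name, shape in CATEGORIES.items()
    }


# 50 degrees is partly "lukewarm" and partly "hot" at the same time.
memberships = describe(50.0)
print(memberships)
```

A rule like “if the water is hot, reduce the heating a lot” can then be weighted by the degree of membership instead of being switched on or off at a hard limit.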

In my opinion, the same thing happens again with artificial intelligence as the fuzzy component. I doubt that it disappears as thoroughly as fuzzy logic vanished, but the core belief seems to be the same. If you describe a problem in layman’s terms to an “inference”, it finds a solution that appears to be acceptable. If you describe the same problem again tomorrow, the solution might vary in detail or even in grand concept. What works today might not work anymore tomorrow or work even better. The quality of results relies on “the environment” again, not only on the input. The operating units of computing cease to be deterministic again. Computing gets fuzzy once again.

There are some immediate problems that I see with this approach:

  1. Every quality promise comes with severe limitations: The machine will work as expected today, but it is unclear if that extends very far into the future. The current results vary a bit, but might vary tremendously going forward. If the inference unit isn’t included in the product, it might not work anymore soon. Or it works noticeably differently from now.
  2. The machine might change its personality on a whim. This is a problem already with encompassing updates every now and then. My smartphone itself stays the same, but the graphical presentation, usage paths and functionality change over the course of months, if not weeks. In a world where we are used to a stone acting like a stone, a kitchen timer staying a kitchen timer and a text editor not turning into an e-mail client, we begin to lose that certainty. Our digital assistants begin to have “phases” with decreased alignment to our use cases. Or, expressed as a positive, we can hope that our digital assistants get to know us better and tune themselves in to us.
  3. We enter a world with limited transferability. One benefit of strict specifications is interchangeability. If you change one capacitor in music electronics, the sound changes (or so they claim). If you change one transistor in a digital circuit, the result stays exactly the same, because the change doesn’t cause enough variance to toggle from “black” to “white”. If the building blocks of your system are less specified digital entities like inference providers, you can’t exchange one for another without possibly altering the system’s behaviour in a noticeable way. This makes the reproducibility of an equal system with slightly different components more of an adventure. You just don’t know beforehand that it will work.
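
The capacitor-versus-transistor argument from the third point can be sketched in a few lines of Python. All numbers here (the 5 % component tolerance, the 0.5 threshold, the signal level) are made-up assumptions for illustration only.

```python
def analog_stage(signal: float, gain: float) -> float:
    """An analog stage passes every variation of its component on to the output."""
    return signal * gain


def digital_stage(signal: float, gain: float, threshold: float = 0.5) -> int:
    """A digital stage only reports on which side of the threshold the signal lands."""
    return 1 if signal * gain >= threshold else 0


original_part = 1.00     # nominal gain of the original component (assumed)
replacement_part = 1.05  # a replacement that is 5 % off (assumed tolerance)
signal = 0.8

# Did swapping the component change the observable result?
analog_changed = (
    analog_stage(signal, original_part) != analog_stage(signal, replacement_part)
)
digital_changed = (
    digital_stage(signal, original_part) != digital_stage(signal, replacement_part)
)
print(analog_changed, digital_changed)
```

The analog output moves with the component, while the digital output stays where it was as long as the variance doesn’t cross the threshold. An inference provider behaves more like the first function than the second.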

There are probably more problems and maybe a lot more advantages to this approach than I can fit into one blog post. My main point is that we layered a strict, digital computing substrate on a messy, analog electronics layer and now put another layer of blurry looseness on top of it. Building future systems on this level might feel like engineering the analog systems of the past. I find it interesting (and ironic) that we try this approach right at the moment when the last analog technology heroes step back and take their expertise with them.

Impressions of Our Current AI Usage (part 0 of 5)

There is a lot of hype, noise, love and opinion about the use of artificial intelligence (in all its different forms) in software development. Of course, similar disturbances happen in other markets and academic fields at the same time, but I’m not qualified enough to participate in discussions there.

I feel confident enough to share my impressions on our current usage patterns of AI here. You probably recognize the number of limitations I put into my statement. The usage patterns evolve quickly and still quite radically. I’m no “AI native”, so all I say are just impressions from a certain distance. But I have felt confident enough in software engineering for at least 25 years to teach it to the next generations of developers. So I know where we were when it all started.

My impressions will be described in detail in five blog posts, each discussing one specific topic. This is the starting post that introduces the headlines of the following articles, but won’t detail them. If you want to react and comment on a topic, please attach it to the matching blog post so we can keep the discussion on point. I invite you to think along, starting with the headline statements. My thoughts are worthless without your thoughts enriching them with your knowledge and experience.

Let’s have a look at the five impressions:

  1. Computing gets fuzzy again. The components of software systems were never sharply defined, but with AI they tend to act like analog components, having bad days and noisy episodes and all.
  2. Source code gets obscure again. As soon as the AI surpasses the imitation stage of human-written code, we won’t be able to read the generated source code anymore – if the AI bothers to generate source code at all and doesn’t leap to machine code directly.
  3. Software developers don’t create software anymore, they manage and lead software creators. This was the fate of the “senior developer promoted to middle management” all the time, but at least it was humans to lead and manage and not a people-pleasing machine.
  4. We delegate the scalable and fun part of our work to AI. The infamous “10X developer” is now a “1000X AI developer”, but the tedious rest of the work (that exists and makes all the long-term difference) is still up to humans.
  5. The means of production are centralized again. Software development was a profession with an incredibly low entry bar (a notebook and a coffee). The actual difference was the skill of the human that did the (mostly intellectual) work. If we all use the same AI (created and provided by infrastructure no single person could just copy), the skill difference will be much smaller and we tend to be interchangeable “workers”.

I don’t expect you to understand my thoughts by just two sentences alone, so stay tuned for the elaborate explanation in the topic-based blog posts of this series.

Topics 1 and 2 focus on technical aspects of software development. Topics 3 and 4 have the remaining human developer in mind, gauging her or his well-being and the skills needed to master a normal workday. Topic 5 broadens the view to economic and even political implications of the changes.

Topic 5 is where I might be wrong the most, because I lack the experience of living through severe changes on the scale that my anticipated changes operate on. I’m not a historian, so I’ll talk about things I only have Wikipedia-level knowledge about. I hope that the thoughts are still useful and somebody can provide more content on the topic.

The topic-based blog entries will be published in the coming weeks or months. I will link them in the list above as soon as they are online. I really appreciate your thoughts, in the form of your own blog entry or a comment.

When Optional sounds too optional: opt for more expressive types

So, I have one PyQt application which not only is quite data-heavy, but also has significant real-time requirements, as well as multiple windows. This construct brings some absolutely horrifying (read: highly intellectually inspiring) quests with it, and Python turned out to be kind of a good decision for that project, because it is one of the languages where, when you think about your structure a bit, you might get to write very natural-sounding code.

Of course, the following idea is actually language-agnostic, I will just use fictive Python examples close to problems-based-on-a-true-story.

This in itself is not only a matter of aesthetics, but because real-time demands are quite tricky to reliably be covered by unit tests alone, the actual code has to read itself so clearly that one does not need to second-guess what any of this does. Think of a bedtime story, which usually would not, coming to think of it, contain clauses – or paragraphs, for that matter – requiring, under circumstances not even trivial to the human eye, one kind of meticulous gymnastics, easily negating twice, or thrice, and relying on Python’s borderline criminal degrees of freedom in duck typing, or canards even – you see – your toddler will now not go to sleep anytime soon. Or trust you with another story, for that matter.

Now I found out: While Qt is somewhat mature, one cannot even trust their way of doing things – i.e. turns out, the signals/slots system is not particularly designed for performance. Neither did I feel inclined to put my faith into yet another state management solution like e.g. the python-statemachine package, because – as capable as that sounds – it might be overkill, and distracting with its own idiosyncrasies (as an aside: I would not recommend Redux for a web project anymore, especially in TypeScript, unless you really know from the start that it is a good fit).

But, so, I have some tricky interplays between

  • Data consistency / single-source-ness demands that e.g. between two windows, there should only be primitive data exchanged, say str/int identifiers, and both have access to their repositories; not throwing loaded data sets around my memory in order to go stale at times
  • Comprehension, most significantly Single Level of Abstraction, or other indicators of mental load like how many levels of indentation / return paths are mixed within sight (and also, Type Annotations do help a lot in Python, even though not mandatory, i.e. the complete opposite of fighting Redux-TypeScript-chimaeras – but I digress…)
  • Robustness, where I would believe that my user (me) has virtually no chance of even seeing this and that window when their data is maybe still loading somewhere – but I still check these cases, because this bedtime story has no business in leaving you a hopeful-to-anxious pile of nerves
  • Traceability of your state, for troubleshooting and useful UI feedback (as you’d guess, real-time event based stuff is not easily debugged by break points or logging alone).

So over months in that project, I grew annoyed with code like this (an illustrative example):


class Editor:
    # ...
  
    def load_editor(self, params: Optional[EditorParams]):
        if params and (self._entity is None or
                       self._entity.id != params.id):
            if entity := repository.load_entity(params.id):
                self._entity = entity
            else:
                raise ValueError("repository needs some alone time :(")
            self._entity.other_stuff = other_repo.check_stuff()
        elif params is None:
            raise TypeError(
                "sounds Optional in our signature, but actually is not"
            )
        elif self._entity.id == params.id:
            self.adjust_more_stuff(self._entity, params.stuff)
            # ...

Because encountering any single block of these drags you down, I have currently accustomed myself to write these as (one can argue whether the names like “Supplier” are the best here, but they’re not the worst, I believe)

from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class LoadedEntity:
    id: str
    entity: Optional[Entity]
    stuff: Optional[OtherStuff]

    @property
    def is_unusable(self):
        return self.entity is None

    @property
    def missing_stuff(self):
        # there is no sense in having stuff without an entity
        if self.is_unusable:
            return True
        else:
            return self.stuff is None


class EntitySupplier:
    _current: LoadedEntity
    _entity_repo: EntityRepository
    _stuff_repo: OtherStuffRepo

    # __init__ etc. hereby left out as boilerplate

    def load_params(self, params: Any):
        # do all your checks in here
        if (... very bad ...):
            self._current = LoadedEntity(params.id, None, None)
            return
        entity = self._entity_repo.get(params.id)
        # "for" is a reserved word in Python, hence the trailing underscore
        stuff = self._stuff_repo.get(params.stuff).for_(entity)
        # store the result, because Editor.load() reads it via the supplier
        self._current = LoadedEntity(
            params.id,
            entity,
            stuff
        )

    @property
    def entity(self):
        return self._current.entity

    def expecting(self, stuff: bool = False) -> Optional[LoadedEntity]:
        if stuff and self._current.missing_stuff:
            return None
        return self._current


class Editor:
    _supply: EntitySupplier
    _logger: SomeLogger

    def __init__(self, **kwargs):
        self._supply = EntitySupplier(**kwargs)
        self._logger = BlaBlaLogger()

    def load(self, params):
        self._supply.load_params(params)
        if entity := self._supply.entity:
            self.update_ui(entity)
        else:
            self._logger.error("Outsmarted, eh? %s | %s", str(params), stack_trace())
            return
        if supply := self._supply.expecting(stuff=True):
            self.initiate_stuff_from(supply)
        else:
            self._logger.info("Entity %s is ready, Stuff is not | %s", str(entity), stack_trace())

So, the LoadedEntity serves like a concatenation of several Optional types, but it wraps the logic (i.e. there’s no sense in having stuff when you don’t have entity first) instead of just shruggingly claiming “well, this entity here is optional – and that other stuff is, too”. Now, LoadedEntity is not a pretty name at all (have a better one?), but it sure beats having two straightaway lies.

I like that pattern because it allows me to stash the EntitySupplier and LoadedEntity somewhere on their own (I do strictly not believe that every class needs its own file, but some of the “Single …” ideas (Responsibility, Level of Abstraction, you name it) do also apply here), and the Editor.load(…) itself does read somewhat like a short story. It has a quite linear structure and can early-return, and/or log, on demand. And while naming is still hard (consistently voted one half of the famous Hard Things), I could even have some fun in designing that language while preserving the idea that future-me can arrive in a few weeks (read: hours) and still trust in some of the entities and stuff.

The quintessence here is: Checking for None (which is Python’s NULL, and the typing Optional[T] is identically equal to T | None) is still a thing in 2026 due to its sheer practicality, but if you design some structure around that and keep these checks in something like LoadedEntity, you can keep the abyss from staring back into you.
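
For those who want to play with the idea, here is a minimal, self-contained sketch of the pattern that actually runs. It is a simplification under my own assumptions: plain strings stand in for Entity and OtherStuff, two dictionaries stand in for the repositories, and expecting() additionally returns None when there is no entity at all.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LoadedEntity:
    id: str
    entity: Optional[str] = None  # stand-in for a real Entity
    stuff: Optional[str] = None   # stand-in for OtherStuff

    @property
    def is_unusable(self) -> bool:
        return self.entity is None

    @property
    def missing_stuff(self) -> bool:
        # Stuff without an entity makes no sense, so that counts as missing.
        return self.is_unusable or self.stuff is None


class EntitySupplier:
    def __init__(self, entities: dict[str, str], stuff: dict[str, str]):
        self._entities = entities  # hypothetical in-memory "repositories"
        self._stuff = stuff
        self._current = LoadedEntity(id="")

    def load(self, entity_id: str) -> None:
        entity = self._entities.get(entity_id)
        stuff = self._stuff.get(entity_id) if entity is not None else None
        self._current = LoadedEntity(entity_id, entity, stuff)

    def expecting(self, stuff: bool = False) -> Optional[LoadedEntity]:
        # All the None checks live here, callers just ask for what they need.
        if self._current.is_unusable:
            return None
        if stuff and self._current.missing_stuff:
            return None
        return self._current


supplier = EntitySupplier(
    entities={"a": "Entity A", "b": "Entity B"},
    stuff={"a": "Stuff A"},
)

supplier.load("a")
complete = supplier.expecting(stuff=True)       # entity and stuff are both there

supplier.load("b")
entity_only = supplier.expecting()              # the entity is there ...
without_stuff = supplier.expecting(stuff=True)  # ... but its stuff is not

supplier.load("unknown")
nothing = supplier.expecting()                  # no entity at all
```

The calling code only ever sees “I got what I asked for” or None, and the question of which combination of missing parts is acceptable is answered in exactly one place.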

Rails Strict Locals: Giving Partials an Explicit Interface

Rails partials are a great way to reuse view code, but they have traditionally suffered from one weakness: their interface is implicit.

When opening a partial written by another developer, it is often unclear which locals are required, which are optional, and whether all of them are still used. IDEs typically cannot help much either, often showing warnings about unresolved variables because they cannot determine where the values come from.

The problem becomes even more apparent as an application grows and partials are rendered from multiple places.

If a local is forgotten by a caller, the error only appears when the template is rendered:

undefined local variable or method `missing_local'

If an extra local is passed, Rails traditionally ignores it.

Over time this creates a situation where the real API of the partial exists only in the heads of the developers maintaining it.

Rails Strict Locals

Rails provides a feature called strict locals that allows a partial to declare its expected interface:

<%# locals: (title:, highlight: false) %>

The declaration resembles Ruby keyword arguments and is placed at the top of the template.

A local like title without a default value is required. Locals like highlight with default values become optional.

The partial now documents and enforces its own API. If a required local is missing, Rails raises an exception instead of failing later when the variable is accessed. Likewise, if a caller provides a local that is not declared, Rails rejects it.
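
Since the declaration resembles keyword arguments, the contract can be illustrated by analogy with a function that only accepts keyword arguments. The following is a sketch in Python, not Rails code: render_card and try_render are hypothetical names, and Rails raises its own exception type rather than Python’s TypeError.

```python
def render_card(*, title: str, highlight: bool = False) -> str:
    """Stand-in for a partial: keyword-only parameters act as declared locals."""
    css_class = "card highlight" if highlight else "card"
    return f'<div class="{css_class}">{title}</div>'


def try_render(**given_locals) -> str:
    """Render the stand-in partial and report contract violations as text."""
    try:
        return render_card(**given_locals)
    except TypeError as error:
        return f"rejected: {error}"


ok = try_render(title="Hello")                      # the optional local uses its default
missing = try_render(highlight=True)                # the required "title" is missing
unknown = try_render(title="Hello", colour="red")   # "colour" was never declared

print(ok)
print(missing)
print(unknown)
```

In both worlds the point is the same: the list of accepted names and defaults lives in one visible declaration, and a call that doesn’t match it fails right at the call.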

Conclusion

Strict locals do not fundamentally change how partials work, but they make them easier to understand and maintain.

By declaring the expected locals directly in the template, partials become self-documenting and gain an explicit contract with their callers. Missing locals are detected early, obsolete locals are rejected, and developers no longer have to search through controllers, parent templates, and render calls to understand where variables come from.

An additional benefit is improved tooling support. Once the interface of a partial is explicit, IDEs can understand the available variables much better. Your IDE becomes a helpful companion again rather than a source of noise.

A doubly linked list for entt components

I recently implemented a small CRTP template to group entities in entt. It turned out quite nicely, so let me share it here:

#include <entt/entt.hpp>
#include <generator>

template <class T> class doubly_linked_component
{
public:
  using node_type = doubly_linked_component<T>;

  static void on_construct(entt::registry& entities,
                           const entt::entity e)
  {
    // A fresh node forms a list of size one that points to itself
    auto& that = entities.get<T>(e);
    that.next_ = that.prev_ = e;
  }

  static void on_destroy(entt::registry& entities,
                         const entt::entity e)
  {
    auto& that = entities.get<T>(e);
    // List has only this element?
    if (that.next_ == e)
    {
      return;
    }
    // Unlink this node by connecting its neighbours
    auto next = that.next_;
    auto prev = that.prev_;
    auto& next_node = static_cast<node_type&>(entities.get<T>(next));
    auto& prev_node = static_cast<node_type&>(entities.get<T>(prev));
    prev_node.next_ = next;
    next_node.prev_ = prev;
  }

  static void merge(entt::registry& entities,
                    entt::entity lhs, entt::entity rhs)
  {
    auto& lhs_node = static_cast<node_type&>(entities.get_or_emplace<T>(lhs));
    auto& rhs_node = static_cast<node_type&>(entities.get_or_emplace<T>(rhs));
    // The end of left (which is left.prev_) needs to point to right
    auto lhs_end = lhs_node.prev_;
    auto rhs_end = rhs_node.prev_;
    auto& lhs_end_node = static_cast<node_type&>(entities.get<T>(lhs_end));
    auto& rhs_end_node = static_cast<node_type&>(entities.get<T>(rhs_end));
    lhs_end_node.next_ = rhs;
    rhs_node.prev_ = lhs_end;
    rhs_end_node.next_ = lhs;
    lhs_node.prev_ = rhs_end;
  }

  static std::generator<entt::entity> enumerate(entt::registry& entities,
                                                entt::entity e)
  {
    // By default, entities are their own lists
    if (!entities.any_of<T>(e))
    {
      co_yield e;
    }
    else
    {
      auto current = e;
      while (true)
      {
        co_yield current;
        current = entities.get<T>(current).next_;
        if (current == e)
          break;
      }
    }
  }

private:
  entt::entity prev_ = entt::null;
  entt::entity next_ = entt::null;
};

You can use it like this:

struct cool_group : doubly_linked_component<cool_group> {};

By default, each entity represents its own 1-sized group. To merge two groups:

cool_group::merge(entities, left, right);

To iterate over all entities, I am using the new std::generator and coroutines. You can use it like this:

for (auto entity : cool_group::enumerate(entities, head))
  do_something(entity);

entt will automatically call into on_construct and on_destroy.

People nowadays usually avoid linked lists because of all the pointer chasing required to actually use them. The pointer chasing is not the problem though, the non-locality is. If you make sure your nodes are all allocated in memory close to each other, there is hardly a penalty. entt will usually do this if nodes are also allocated close in time, which is often the case if you want to group things, so this works nicely in that regard, too. Feel free to use this code under CC0.
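
If you want to convince yourself that the splice in merge is correct without setting up entt, the same logic can be sketched in Python. This is only a model under my own assumptions: integer ids stand in for entities, two dictionaries stand in for the component storage, and the class name Rings is made up.

```python
class Rings:
    """Circular doubly linked lists over plain integer ids."""

    def __init__(self) -> None:
        self.next: dict[int, int] = {}
        self.prev: dict[int, int] = {}

    def _ensure(self, e: int) -> None:
        # Like on_construct: a fresh node is a ring of size one.
        if e not in self.next:
            self.next[e] = self.prev[e] = e

    def merge(self, lhs: int, rhs: int) -> None:
        self._ensure(lhs)
        self._ensure(rhs)
        lhs_end, rhs_end = self.prev[lhs], self.prev[rhs]
        # Splice: the end of left points to right,
        # the end of right points back to left.
        self.next[lhs_end] = rhs
        self.prev[rhs] = lhs_end
        self.next[rhs_end] = lhs
        self.prev[lhs] = rhs_end

    def remove(self, e: int) -> None:
        # Like on_destroy: unlink the node by connecting its neighbours.
        nxt, prv = self.next.pop(e), self.prev.pop(e)
        if nxt != e:
            self.next[prv] = nxt
            self.prev[nxt] = prv

    def enumerate(self, e: int):
        if e not in self.next:
            yield e  # an unlinked id is its own group
            return
        current = e
        while True:
            yield current
            current = self.next[current]
            if current == e:
                break


rings = Rings()
rings.merge(1, 2)
rings.merge(3, 4)
rings.merge(1, 3)
merged = list(rings.enumerate(1))

rings.remove(2)
after_removal = list(rings.enumerate(1))
print(merged, after_removal)
```

The sketch assumes that the two ids passed to merge belong to different groups. Splicing two members of the same ring would split that ring instead of leaving it unchanged.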

Partial Indexes in PostgreSQL: Index Only What Matters

Indexes are one of the most effective tools for improving database performance. However, they come at a cost: they consume disk space, slow down write operations, and require maintenance. In many cases, a full index contains a lot of entries that are never used by the queries we want to optimize.

This is where PostgreSQL’s partial indexes become useful. A partial index contains only the rows that satisfy a specified condition. Instead of indexing an entire table, we can index only the subset of data that is relevant for our queries.

Consider a simple user table:

CREATE TABLE users (
    id BIGSERIAL PRIMARY KEY,
    username TEXT NOT NULL,
    active BOOLEAN NOT NULL
);

Suppose that most users are inactive, but our application frequently searches for active users:

SELECT *
FROM users
WHERE active = true
  AND username = 'alice';

A conventional index would cover all rows:

CREATE INDEX idx_users_username
ON users (username);

If only a small fraction of users are active, this index contains many entries that will never help this query.

A partial index can be defined as:

CREATE INDEX idx_active_users_username
ON users (username)
WHERE active = true;

Now the index contains only active users. The PostgreSQL query planner can use this index whenever it detects that the query condition implies the index predicate:

SELECT *
FROM users
WHERE active = true
  AND username = 'alice';

Because the query explicitly restricts the result set to active users, the planner knows that every matching row must be present in the partial index.

Why use partial indexes?

The obvious benefit is size. Imagine a table with ten million users, but only five percent are active. A conventional index stores ten million entries, while the partial index stores only five hundred thousand.

Smaller indexes provide several advantages: less disk usage, reduced memory consumption, faster index scans, and lower maintenance overhead during INSERT and UPDATE operations.

In workloads where the indexed subset is significantly smaller than the table, these benefits can be substantial.

A common use case

Soft deletion is a frequent pattern in business applications:

CREATE TABLE orders (
    id BIGSERIAL PRIMARY KEY,
    customer_id BIGINT NOT NULL,
    deleted BOOLEAN NOT NULL DEFAULT false
);

Most queries ignore deleted records:

SELECT *
FROM orders
WHERE deleted = false
  AND customer_id = 42;

Instead of indexing all rows, we can focus on the rows that are actually queried:

CREATE INDEX idx_active_orders_customer
ON orders (customer_id)
WHERE deleted = false;

As the number of logically deleted rows grows over time, the index remains compact.

Limitations

Partial indexes are not a universal solution. The query must contain a condition that allows PostgreSQL to infer the index predicate. For example, the index

WHERE active = true

cannot be used for a query that only filters by username:

SELECT *
FROM users
WHERE username = 'alice';

The planner cannot assume that the result should contain only active users.
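
The difference between the two queries can be observed without a PostgreSQL server. The following sketch uses SQLite through Python’s built-in sqlite3 module, which is an assumption on my part: SQLite also supports partial indexes and follows a similar rule, but its planner and the format of its plan output are not the same as PostgreSQL’s. The helper name plan is made up, and active is stored as an integer here.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript(
    """
    CREATE TABLE users (
        id INTEGER PRIMARY KEY,
        username TEXT NOT NULL,
        active INTEGER NOT NULL
    );
    CREATE INDEX idx_active_users_username
    ON users (username)
    WHERE active = 1;
    """
)


def plan(query: str) -> str:
    """Return SQLite's query plan for the given query as one string."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + query).fetchall()
    # The last column of each row holds the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


# The query repeats the index predicate, so the partial index is a candidate.
with_predicate = plan(
    "SELECT * FROM users WHERE active = 1 AND username = 'alice'"
)

# Without the predicate, the index may miss rows and cannot be used.
without_predicate = plan("SELECT * FROM users WHERE username = 'alice'")

print(with_predicate)
print(without_predicate)
```

On PostgreSQL, the equivalent check would be to run EXPLAIN on both queries and compare which index, if any, shows up in the plan.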

Another consideration is changing data distributions. A partial index is most effective when the indexed subset remains relatively small. If almost all rows eventually satisfy the predicate, the advantage largely disappears.

Conclusion

By indexing only the rows that are relevant to specific queries, partial indexes can reduce index size and improve query performance. Whenever you notice that your queries consistently target a small subset of a large table, a partial index may be worth considering.

ADRs – software is more than code

The buzz around AI often only revolves around how fast and cheap it is to generate code. And it is true: Developing small tools has gotten waaaayyyyy cheaper.

Software systems on the other hand are a lot more than code:

  • They are business requirements
  • They are tradeoffs
  • They are design decisions
  • They are built on incomplete knowledge
  • They are limited by technology
  • They are documentation

While AI may help in many items of the list, too, there is a lot of human interaction and need for human involvement. Humans need documents and other things to reason about.

IMHO, one important aspect is the decisions we made in their context, at the time. This is often overlooked, and development teams and/or customers ask themselves later why things were done the way they are now. Sometimes discussions go round in circles and discoveries of the past are rediscovered again and again.

ADRs to the rescue!

This is where Architectural Decision Records (ADRs) can help tremendously! They do not only describe the status quo of a system or feature but provide additional insight. Typical ADRs document the decisions taken in a structured and referenceable way. Besides the decision itself they contain metadata like

  • An identifier
  • The people involved
  • The context
  • Several alternatives that were discussed
  • Pros, cons and consequences of each option
  • The reason for the decision

The catalog of ADRs usually grows gradually during the life of a software system. Like the system itself and its environment the ADR catalog is not static but dynamically changing:

ADRs can become invalid, they can be superseded by newer ones and may have other statuses like proposed, accepted and rejected. The identifier allows them to be referenced in other documentation, issues and code comments.
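
To illustrate how identifiers and statuses work together, here is a small sketch of an ADR catalog as a data structure in Python. It is not a real tool: the class names, the identifier format and the two example decisions are made up, and real ADRs usually live in Markdown files rather than in objects.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class Status(Enum):
    PROPOSED = "proposed"
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    SUPERSEDED = "superseded"


@dataclass
class Adr:
    identifier: str
    title: str
    status: Status = Status.PROPOSED
    superseded_by: Optional[str] = None


@dataclass
class AdrCatalog:
    records: dict[str, Adr] = field(default_factory=dict)

    def add(self, adr: Adr) -> None:
        self.records[adr.identifier] = adr

    def supersede(self, old_id: str, new_adr: Adr) -> None:
        """Accept a new decision and mark the old one as replaced by it."""
        new_adr.status = Status.ACCEPTED
        self.add(new_adr)
        old = self.records[old_id]
        old.status = Status.SUPERSEDED
        old.superseded_by = new_adr.identifier

    def current_decision(self, identifier: str) -> Adr:
        """Follow the superseded-by chain to the decision that is valid today."""
        adr = self.records[identifier]
        while adr.superseded_by is not None:
            adr = self.records[adr.superseded_by]
        return adr


catalog = AdrCatalog()
# Hypothetical decisions, only here to show the mechanics.
catalog.add(Adr("ADR-0001", "Store reports as PDF", Status.ACCEPTED))
catalog.supersede("ADR-0001", Adr("ADR-0002", "Store reports as HTML"))

latest = catalog.current_decision("ADR-0001")
print(latest.identifier, latest.status.value)
```

A code comment that still references the old identifier keeps its value this way: the old record stays in the catalog and points to the decision that replaced it.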

Over the years they provide a description of the journey the people involved and the software system took together. This journey may not have been a short and straight one but often had many twists and turns.

All that can be seen and tracked in the ADR catalog. The current team and customer can take a look at the journey and identify places to revisit or paths to avoid because it is documented in an appropriate way.

Do we need even more documentation than we have already?

Projects usually have a decent amount of documentation, so you may ask: Do we need even more? And how much effort should go into something like ADRs?

I think ADRs offer different and very valuable information while being designed to be very lightweight. Most people use Markdown templates for their ADRs, so you have a standard format that makes them fast to create and easy to skim. MADR seems very popular and I also like aspects of tekiegirl/Archangels templates.

Defining your own template that fits your needs best based on those suggestions should not be that big of a deal. Neither is starting to use ADRs in your running or future projects:

Just document every decision that meets one or more of the following criteria

  • impacts the implementation
  • hard or expensive to reverse/change
  • topic comes up repeatedly in meetings
  • new members ask repeatedly about it
  • affects multiple teams, services or systems
  • difficult to explain without the context
  • overrides or adopts a decision by another team

Conclusion

ADRs are a lightweight documentation tool offering a unique and very useful perspective on a project/software system. You can start using them right away to get benefit and can reference them in several other places like issues, documentation, commits and code. Keeping them as close to the code as possible (e.g. in the same code repository) makes them easier to find.

There are and surely will be more and more tools that can use the information in your ADRs to help you reason about your system, its history and its future.

What Happens When We Don’t Listen to the Whole Album Anymore?

I have lectured university students on software engineering for 25 years now. There are some things that changed over time, some for the better, some for worse. But one aspect worries me: The rise of buffet-style knowledge.

Let me explain what I mean by that term: In one of his books, the legendary physicist Richard Feynman describes a group of highly educated students that could recite every law of physics and all the details of materials, but were unable to act on this knowledge by combining some facts to come up with a solution to a common real-world problem. They ingested all the data, but didn’t digest it. It never amalgamated into a box of mental tools that could be applied to a problem just by thought experiment.

I recognize this pattern in my students, too. One example was working with a protocol that sends characters over a (physical!) wire. Each command was prefixed with an exclamation mark, followed by the mnemonic (an odd word, meaning a garbled mess of characters without innate meaning) and then the line ending. A typical specification for a command looked like this:

! QUIT <CR> <LF>

We approached the implementation by writing tests first, and sure enough, half the students asserted for the existence of a literal “<CR><LF>” at the end of the line. Not the two characters “Carriage Return” and “Line Feed”, but the eight characters as seen. When I asked them if they knew about character encodings and the ASCII code, they felt well versed in both topics.

After we combined their tests with the real client implementation, they saw the failed assertions, but couldn’t see their mistake. The real client was lacking the latter half of the command line in their mind. They were amazed when they discovered that there are characters that you just cannot see right away.

They studied all the characters that they saw and just assumed that was all there is. The simple question “how does a text editor know when a line of text is over?” perplexed them. They just never stopped to think about how this thing actually works.

My theory about the origin of this symptom is double tracked: Richard Feynman argued that the type of knowledge tests that the students have to endure is the root cause. My sample size is rather small, but I can see that being a big influence. If the tests ask for connections between different pools of knowledge, the students are forced to link their knowledge. Those students that are unable to digest the knowledge until it becomes a mental tool instead of just a reproducible fact tend to perish. If a test just asks for the reproduction of one topic, the digestion part of learning is an optional bonus on top of the study requirements.

Returning to our example above: If I ask for the reproduction of unit tests and another question about character encodings, both questions can be answered without knowledge about control characters (not visible, but still present).

If I combine both questions and ask for a correct assertion about the length of the quit command (7 characters), I can test who is able to write unit tests and who doesn’t know about control characters and asserts for 13 characters. This type of question (which requires knowledge transfer or fusion from several topics at once) is actively discouraged in today’s exams.
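
The two numbers can be checked in a few lines of Python. The sketch assumes that the spaces in the specification are only there for readability, so the command on the wire is the exclamation mark, the mnemonic and the two control characters.

```python
# The command as it travels over the wire:
# "!QUIT" plus two invisible control characters.
wire_command = "!QUIT\r\n"

# The command as it is printed in the specification,
# with visible placeholders instead of control characters.
as_printed = "!QUIT<CR><LF>"

assert len(wire_command) == 7
assert len(as_printed) == 13

# In ASCII, carriage return is code 13 and line feed is code 10.
control_codes = [ord(character) for character in wire_command[-2:]]
print(control_codes)  # [13, 10]
```

A test that compares against the second string passes only as long as it never meets a real client.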

But the second track of my theory is about the means of modern knowledge consumption. We don’t eat full knowledge meals anymore, we pick the flashy bits and skip the rest. If we could learn by just listening to music, we would skip three songs, fast forward the fourth to the exciting part and then ignore the rest of the album. Compare that to the days of linear music storage, when you were heavily nudged to listen to the whole album front to back. And while listening to the “other” songs, two things could happen that are missing from the picky approach: We had time to appreciate the exciting part even more and we could be surprised by a song that might be even better than the one we anticipated. Our music portfolio was not only curated by us, but by the artist, too.

Transfer this to software engineering and my complaint can be rephrased as: Nobody reads whole books about a software topic anymore. In fact, I had several students acting aghast when I suggested they should read a book in order, front to back. To them, that was like wasting time with filler material. The thought that this “filler” might be a source of surprise, inspiration and additional curiosity had never crossed their minds before.

I get the comfort of quick answers from Stack Overflow, YouTube videos or a chatbot AI. I see the instant gratification nature of going on a highlight-driven journey through nearly all topics of modern programming. But we aren’t creatures that thrive and prosper on instant gratification. We don’t learn from quick success. We learn by trial and repetition. And we can’t cheat our biological heritage (at least not yet).

So, what is my point? I think that “broad knowledge”, the ability to combine different aspects in thought experiments and slow, creative learning will be more important in the future, especially with the availability of a talking encyclopedia right in front of us that can fill the minor gaps faster than we can articulate the question. But we need to know what to ask, and even more important – why we ask.

Avoiding Code Style Discussions

Every developer has personal formatting preferences.
Brace placement, line wrapping, imports, tabs vs. spaces — everybody has an opinion, and most of them are reasonable.

The problem starts when all these styles meet in one repository.

The cost of “personal style”

A codebase written by ten developers can easily look like ten different applications stitched together. Suddenly, pull requests are full of formatting changes. Git diffs become noisy. Merge conflicts appear because one developer reformatted a file differently than another. Code reviews drift into discussions about whitespace instead of actual functionality.
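To make the difference between formatting noise and real changes tangible, here is a small sketch in Python. The `area` function and both helper names are made up for illustration. It compares two versions of the same function that differ only in whitespace: a line-based diff reports every line as changed, while the parsed syntax trees are identical.

```python
import ast
import difflib

# Two versions of the same (hypothetical) function, differing only in spacing.
original = "def area(w, h):\n    return w * h\n"
reformatted = "def area( w,h ):\n    return w*h\n"

def changed_lines(before, after):
    """Return the added/removed lines of a unified diff, without the file headers."""
    diff = difflib.unified_diff(before.splitlines(), after.splitlines(), lineterm="")
    return [
        line for line in diff
        if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
    ]

def same_program(before, after):
    """Compare the syntax trees; positions and whitespace are not part of the dump."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))

print(len(changed_lines(original, reformatted)))  # 4
print(same_program(original, reformatted))        # True
```

Four changed lines for a reviewer to read, and not a single change in behavior: that is the noise a shared style removes from pull requests.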

Even worse: inconsistent code slows down reading.

Humans recognize patterns quickly. When code follows the same visual structure everywhere, the brain spends less effort parsing syntax and more effort understanding intent.

Consistent formatting reduces cognitive load.

A shared style is less about aesthetics and more about reducing friction. But how can this problem be solved?

Shared Project Style

In IDEs like IntelliJ, you can define a code style and automatically reformat code according to those rules. This helps you keep your own code consistent. However, if every developer uses a different style, it does not help the project as a whole.

You can configure the style under:

Settings -> Editor -> Code Style

and save it as a project-level configuration. IntelliJ will then create a codeStyles folder with XML files inside the .idea directory.

The solution for sharing one configuration across the whole project is to commit these files to Git. This way, every developer working on the project uses the same code style configuration.
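Before relying on the shared configuration, it can be worth checking that the files really exist in the working copy. The following Python sketch lists the XML files in the `.idea/codeStyles` folder described above. The function name and the file name `Example.xml` are my own placeholders, not names that IntelliJ is claimed to use.

```python
from pathlib import Path
import tempfile

def shared_style_files(repo_root):
    """Return the names of the XML files in <repo_root>/.idea/codeStyles, sorted."""
    style_dir = Path(repo_root) / ".idea" / "codeStyles"
    if not style_dir.is_dir():
        return []
    return sorted(p.name for p in style_dir.glob("*.xml"))

# Demo with a throwaway directory standing in for a repository.
with tempfile.TemporaryDirectory() as repo:
    print(shared_style_files(repo))  # []

    style_dir = Path(repo) / ".idea" / "codeStyles"
    style_dir.mkdir(parents=True)
    # Placeholder file; a real project would contain the files written by the IDE.
    (style_dir / "Example.xml").write_text("<component/>", encoding="utf-8")
    print(shared_style_files(repo))  # ['Example.xml']
```

An empty list in a fresh clone is a hint that the folder was never committed, for example because the whole `.idea` directory is excluded from version control.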

The IDE can then help enforce the agreed style by reformatting code before commit or even automatically on save.


Consistency beats preference

The important thing is not finding the perfect style. The important thing is agreeing on one.

A consistent codebase is easier to read, easier to review, and easier to maintain. Pull requests become smaller and cleaner because they contain actual changes instead of formatting noise.

Good formatting should be boring and automatic. That leaves more time for discussions that actually matter.

Consistent Structure Considered Harmful

Debating the pros and cons of different code styles quickly tends to enter “which color is best” territory, so this is not what I’ll do here, but consider the following an internal debate of mine that occurs from time to time.

The “Structure” of a piece of code spans various topics: from syntactical preferences like conventions for curly braces, to case conventions, naming variables, indentation, line breaks and whitespace in general, how to distribute methods between classes, between files, up to its high-level architecture. I am strictly not talking about the higher levels here.

However, in the finer levels of your code, you will apply a mixture of conventional choices coming from the programming language, the culture surrounding it, and your own background. Some will be shaped by your IDE. Keep in mind that all choices should be made with one goal in mind:

To produce readable code which is straightforward to argue about.

“Readable” is a heavy word, as it inseparably depends on both (a) the prerequisites the reader has to fulfill (who are they and why are they even reading my code?) and (b) the fact that getting used to one convention can affect reading speed more than any ingrained perks of that convention. Nevertheless, there does exist a dimension beyond that.

There is some remaining variability in choice, for example, in how to continue the indentation in a multi-line argument list, how to place newlines in chained calls like fluent interfaces / LINQ in .NET / Promises in JS / … or in chained conditionals – the list goes on.

But what is the optimum, for example in this Python snippet?

if (self.evaluate_user_input()
    and user_role in (UserRole.Admin, UserRole.ProjectOwner)):
    do_stuff()
# vs
if (self.evaluate_user_input() and
    user_role in (UserRole.Admin, UserRole.ProjectOwner)
):
    do_stuff()
# vs
if (self.evaluate_user_input()
        and user_role in (UserRole.Admin, UserRole.ProjectOwner)):
    do_stuff()
# vs
if (self.evaluate_user_input()
    and user_role in (
        UserRole.Admin,
        UserRole.ProjectOwner
    )):
    do_stuff()
# vs ... outsourcing any of that logic into its own place,
# but even that comes with risks of cluttering structure elsewhere.
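For completeness, the “outsourcing” variant could look roughly like the following sketch. The `UserRole` enum with its `Guest` member, the constant `PRIVILEGED_ROLES` and the function `may_do_stuff` are assumptions of mine, invented to make the snippet runnable.

```python
from enum import Enum

class UserRole(Enum):
    # Hypothetical roles; only Admin and ProjectOwner appear in the snippet above.
    Admin = "admin"
    ProjectOwner = "project_owner"
    Guest = "guest"

PRIVILEGED_ROLES = (UserRole.Admin, UserRole.ProjectOwner)

def may_do_stuff(input_is_valid, user_role):
    """Named predicate that replaces the multi-line condition."""
    return input_is_valid and user_role in PRIVILEGED_ROLES

print(may_do_stuff(True, UserRole.Admin))   # True
print(may_do_stuff(True, UserRole.Guest))   # False
print(may_do_stuff(False, UserRole.Admin))  # False
```

The call site shrinks to a single line, but the price is one more name and one more place to look, which is the clutter risk mentioned in the last comment above.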

Another example is brace placement in any language that uses braces, because I’ve encountered a couple of scenarios where the convention would suggest…

if (theThing) {
    quiteALot();
    ofDifferent();
    lines();
} else {
    nowSomeCompletely();
    differentStuffToDo();
}
// breaking the brace convention, just for that "else",
// conveys a lot of purpose to distinguish these branches. for me.
if (theThing) {
    quiteALot();
    ofDifferent();
    lines();
}
else {
    nowSomeCompletely();
    differentStuffToDo();
}

Line breaks and continuation can be crucial because they influence how far the eyes have to travel to the right (maybe requiring horizontal scrolling, which is a dealbreaker by any measure of quick understanding), but whitespace can be beneficial in distinguishing your product from a pile of Unicode vomit (which is why gofmt is wrong in believing that inline formulae are always better with all spaces stripped out).

The more I think about it, the less I would agree with anyone convinced that one should just strive towards one certain style and then stick to it. Moreover, the actual content of a line of code can dominate the decision at hand more than a well-meaning thought of “all of these decisions are equal, do not waste any time on them”. The point is that the time saved in reading this can outweigh your time saved in not caring about said reader.

Of course, the title of this post was chosen somewhat provocatively, because the actual goal here is still consistent code. The point is that uniform guidelines do not automatically lead to a consistent expression of intention.

Aim for what your specific piece of code needs to convey, then allow the idea of choosing a style that is not the same choice as for different pieces of code. Do not overthink it either, but do not think that deviations from uniformity are a code smell.

A more variable, more purposeful coding style is likely to cause conflict when more than one developer is involved (because of the habituation effect), but treat it like any performance optimization – discuss it when your Merge Review is actually troublesome, with intention, not miles ahead for some hypothetical horror scenario.