The Dimensions of Navigation in Object-Oriented Code

One powerful aspects of modern software development is how we move through our code. In object-oriented programming (OOP), understanding relationships between classes, interfaces, methods, and tests is important. But it is not just about reading code; it is about navigating it effectively.

This article explores the key movement dimensions that help developers work efficiently within OOP codebases. These dimensions are not specific to any tool but reflect the conceptual paths developers regularly take to understand and evolve code.

1. Hierarchy Navigation: From Parent to Subtype and Back

In object-oriented systems, inheritance and interfaces create hierarchies. One essential navigation dimension allows us to move upward to a superclass or interface, and downward to a subclass or implementing class.

This dimension is valuable because:

  • Moving up let us understand general contracts or abstract logic that governs behavior across many classes.
  • Moving down help us see specific implementations and how abstract behavior is concretely realized.

This help us maintain a clear overview of where we are within the hierarchy.

2. Behavioral Navigation: From Calls to Definitions and Back

Another important movement is between where methods are defined and where they are used. This is less about structure and more about behavior—how the system flows during execution.

Understanding this movement helps developers:

  • Trace logic through the system from the point of use to its implementation.
  • Identify which parts of the system rely on a particular method or class.
  • Assess how a change to a method might ripple through the codebase.

This navigation is useful when debugging, refactoring, or working in unfamiliar code.

3. Validation Navigation: Between Code and its Tests

Writing automated tests is a fundamental part of software development. Tests are more than just safety nets—they also serve as valuable guides for understanding and verifying how code is intended to behave. Navigating between a class and its corresponding test forms another important dimension.

This movement enables developers to:

  • Quickly validate behavior after making changes.
  • Understand how a class is intended to be used by seeing how it is tested.
  • Improve or add new tests based on recent changes.

Tight integration between code and test supports confident and iterative development, especially in test-driven workflows.

4. Utility Navigation: Supporting Movements that Boost Productivity

Beyond the main three dimensions, there are several supporting movements that contribute to developer efficiency:

  • Searching across the codebase to find any occurrence of a class, method, or term.
  • Generating boilerplate code, like constructors or property accessors, to reduce repetitive work.
  • Code formatting and cleanup, which helps maintain consistency and readability.
  • Autocompletion, which reduces cognitive load and accelerates writing.

These actions do not directly reflect code relationships but enhance how smoothly we can move within and around the code, keeping us focused on solving problems rather than managing structure.

Conclusion: Movement is Understanding

In object-oriented systems, navigating through your codebase along different dimensions provides essential insight for understanding, debugging, and improving your software.

Mastering these dimensions transforms your workflow from reactive to intuitive, allowing you to see code not just as static text, but as a living system you can navigate, shape, and grow.

In an upcoming post, I will take the movement dimensions discussed here and show how they are practically supported in IDEs like Eclipse and IntelliJ IDEA.

Customizing Vite plugins is as quick as Vite itself

Once in a blue moon or so, it might be worthwhile to look at the less frequently used features of the Vite bundler, just to know how your life could be made easier when writing web applications.

And there are real use cases to think about custom vite plugins, e.g.

  1. wanting to treat SVGs as <svg> code or React Component, not just as a <img> resource – so maye your customer can easily swap them out as separate files. That is a solved problem with existing plugins like vite-plugin-svgr or vite-svg-loader, but… once in a ultra-violet moon or so… even the existing plugins might not suffice.
  2. For teaching a few lectures about WebGL / GLSL shader programming, I wanted to readily show the results of changes in a fragment shader on the fly. That is the case in point:

I figured I could just use Vite’s Hot Module Replacement to reload the UI after changing the shader. There could have been alternatives like

  • Copying pieces of GLSL into my JS code as strings
    – which is cumbersome,
  • Using the async import() syntax of JS
    – which is then async, obviously,
  • Employing a working editor component on the UI like Ace / React-Ace
    – which is nice when it works, but is so far off my actual quest, and I guess commonplace IDEs are still more convenient for editing GLSL

I wanted the changes to be quick (like, pair-programming-quick), and Vite’s name exactly means that, and their HMR aptly so. It also gives you the option of raw string assets (import Stuff from "./stuff.txt?raw";) which is ok, but I wanted a bit of prettification to be done automatically. I found vite-plugin-glsl, but I needed it customized because I wanted to always combine multiple blank lines to a single one and this is how easy it was:

  • ./plugin/glslImport.js
    Note: this is executed by the vite dev server, not our JS app itself.
import glsl from "vite-plugin-glsl";

const originalPlugin = glsl({compress: false});

const glslImport = () => ({
    ...originalPlugin,
    enforce: "pre",
    name: "vite-plugin-glsl-custom",
    transform: async (src, id) => {
        const original = await originalPlugin.transform(src, id);
        if (!original) { // not a shader source
            return;
        }
        // custom transformation as described above:
        const code = original.code
            .replace(/\\r\\n/g, "\\n")
            .replace(/\\n(\\n|\\r|\s)*?(?=\s*\w)/g, "\\n$1");
        return {...original, code};
    },
});

export default glslImport;
  • and then the vite.config.js is simply
import { defineConfig } from 'vite';
import glslImport from './plugin/glslImport.js';

export default defineConfig({
    plugins: [
        glslImport()
    ],
    ...
})

I liked that. It kept me from transforming either the raw import or that of the original plugin in each of the 20 different files, and I could easily fiddle around in my filesystem while my students only saw the somewhat-cleaned-up shader code.

So if you would ever have some non-typical-JS files that need some transformation but are e.g. too many or too volatile to be cultivated in their respective source format, that is a nice tool to know. That is as easily-pluggable-into-other-projects as a plugin should be.

Calculating the Number of Segments for Accurate Circle Rendering

A common way to draw circles with any kind of vector graphics API is by approximating it with a regular polygon, e.g. as a regular polygon with 32 sides. The problem with this approach is that it might look good in one resolution, but crude in another, as the approximation becomes more visible. So how do you pick the right number of sides N for the job? For that, let’s look at the error that this approximation has.

A whole bunch of math

I define the ‘error’ of the approximation as the maximum difference between the ideal circle shape and the approximation. In other words, it’s the difference of the inner radius and the outer radius of the regular polygon. Conveniently, with a step angle \alpha=\frac{2\pi}{N} the inner radius is just the outer radius r multiplied by the cosine of half of that: r\times\cos\frac{\alpha}{2}. So the error is r-r\times\cos\frac{\pi}{N}. I find it convenient to use relative error \epsilon for the following, and set r=1:

\epsilon=1-cos\frac{\pi}{N}

The following plot shows that value for N going from 4 to 256:

Plot showing the relative error numbers of subdivision

As you can see, this looks hyperbolic and the error falls off rather fast with an increasing number of subdivisions. This function lets use figure out the error for a given number of subdivisions, but what we really want is he inverse of that: Which number of subdivisions do we need for the error to be less than a given value. For example, assuming a 1080p screen, and a half-pixel error on a full-size (r=540) circle, that means we should aim for a relative error of 0.1\%. So we can solve the error equation above for N. Since the number of subdivisions should be an integer, we round it up:

N=\lceil\frac{\pi}{\arccos{(1-\epsilon)}}\rceil

So for 0.1\% we need only 71 divisions. The following plot shows the number of subdivisions for error values from 0.01\% to 1\%:

Here are some specific values:

\epsilonN
0.01%223
0.1%71
0.2%50
0.4%36
0.6%29
0.8%25
1.0%23

Assuming a fixed half-pixel error, we can plug in \epsilon=\frac{0.5}{radius} to get:

N=\lceil\frac{\pi}{\arccos{(1-\frac{0.5}{radius})}}\rceil

The following graph shows that function for radii up to full-size QHD circles:

Give me code

Here’s the corresponding code in C++, if you just want to figure out the number of segments for a given radius:

std::size_t segments_for(float radius, float pixel_error = 0.5f)
{
  auto d = std::acos(1.f - pixel_error / radius);
  return static_cast<std::size_t>(std::ceil(std::numbers::pi / d));
}

What are GIN Indexes in PostgreSQL?

If you’ve worked with PostgreSQL and dealt with things like full-text search, arrays, or JSON data, you might have heard about GIN indexes. But what exactly are they, and why are they useful?

GIN stands for Generalized Inverted Index. Most indexes (like the default B-tree index) work best when there’s one clear value per row – like a number or a name. But sometimes, a single column can hold many values. Think of a column that stores a list of tags, words in a document, or key-value data in JSON. That’s where GIN comes in.

Let’s walk through a few examples to see how GIN indexes work and why they’re helpful.

Full-text search example

Suppose you have a table of articles:

CREATE TABLE articles (
  id    serial PRIMARY KEY,
  title text,
  body  text
);

You want to let users search the content of these articles. PostgreSQL has built-in support for full-text search, which works with a special data type called tsvector. To get started, you’d add a column to store this processed version of your article text:

ALTER TABLE articles ADD COLUMN tsv tsvector;
UPDATE articles SET tsv = to_tsvector('english', body);

Now, to speed up searches, you create a GIN index:

CREATE INDEX idx_articles_tsv ON articles USING GIN(tsv);

With that in place, you can search for articles quickly:

SELECT * FROM articles WHERE tsv @@ to_tsquery('tonic & water');

This finds all articles that contain both “tonic” and “water”, and thanks to the GIN index, it’s fast – even if you have thousands of articles.

Array example

GIN is also great for columns that store arrays. Let’s say you have a table of photos, and each photo can have several tags:

CREATE TABLE photos (
  id   serial PRIMARY KEY,
  tags text[]
);

You want to find all photos tagged with “capybara”. You can create a GIN index on the tags column:

CREATE INDEX idx_photos_tags ON photos USING GIN(tags);
SELECT * FROM photos WHERE tags @> ARRAY['capybara'];

(The @> operator means “contains” or “is a superset of”.)

The index lets PostgreSQL find matching rows quickly, without scanning the entire table.

JSONB example

PostgreSQL’s jsonb type lets you store flexible key-value data. Imagine a table of users with extra info stored in a jsonb column:

CREATE TABLE users (
  id   serial PRIMARY KEY,
  data jsonb
);

One row might store {"age": 42, "city": "Karlsruhe"}. To find all users from New York, you can use:

SELECT * FROM users WHERE data @> '{"city": "Karlsruhe"}';

And again, with a GIN index on the data column, this query becomes much faster:

CREATE INDEX idx_users_data ON users USING GIN(data);

Things to keep in mind

GIN indexes are very powerful, but they come with some tradeoffs. They’re slower to build and can make insert or update operations a bit heavier. So they’re best when you read (search) data often, but don’t write to the table constantly.

In short, GIN indexes are your friend when you’re dealing with columns that contain multiple values – like arrays, full-text data, or JSON. They let PostgreSQL break apart those values and build a fast lookup system. If your queries feel slow and you’re working with these kinds of columns, adding a GIN index might be exactly what you need.

Expose your app/API with zrok during development

Nowadays many of us are developing libraries, tools and applications somehow connected to the web. Often we provide APIs over HTTP(S) for frontends or other services or develop web apps using such services or backends.

As browsers become more and more picky HTTP is pretty much dead but for developers it is extremely convenient to avoid the hassle of certificates, keystores etc.

Luckily, there is a simple and free tool, that can help in several development scenarios: zrok.io

My most common ones are:

  • Allowing customers easy (temporary) access to your app in development
  • Developing SSO and other integrations that need publicly visible HTTPS endpoints
  • Collaborating with your distributed colleagues and allowing them to develop against your latest build on your machine

What is zrok?

For our use cases think of it as an simple, ad-hoc HTTPS-proxy transport-securing your services and exposing them publicly. For the other features and technical zero trust networking platform explanation head over to their site.

How to use zrok?

You only need a few steps to get zrok up and running. Even though their quick start explains the most important steps I will mention them here too:

After these steps your are ready to go and may share your local service running on http://localhost:8080 using zrok share public 8080.

Some practical advice and examples

If you want a stable URL for your service, use a reserved share instead of the default temporary one:

.\zrok.exe reserve public http://localhost:5000 --unique-name "mydevinstance"
.\zrok.exe share reserved mydevinstance

That way you get a stable endpoint over restarts which greatly reduces configuration burden in external services or communication with customers or colleagues. You can manage your shares on multiple machines online on https://api-v1.zrok.io:

Your service is then accessible under https://mydevinstance.share.zrok.io/ and you may monitor accesses in the terminal or on the webpage above.

That enables you to use your local service for development against other services, like OAuth or OpenID single-sign-on (SSO), here with ORCID:

Conclusion

Using zrok developers may continue to ignore HTTPS for their local development instances while still being able to expose them privately or publicly including transparent SSL support.

That way you can integrate easily with other services expecting secured public endpoint or collaborate with others transparently without VPNs, tunnels or other means.

How to improve this() by using super()

I have a particular programming style regarding constructors in Java that often sparks curiosity and discussion. In this blog post, I want to note my part in these discussions down.

Let’s start with the simplest example possible: A class without anything. Let’s call it a thing:

public class Thing {
}

There is not much you can do with this Thing. You can instantiate it and then call methods that are present for every Object in Java:

Thing mine = new Thing();
System.out.println(
    mine.hashCode()
);

This code tells us at least two things about the Thing class that aren’t immediately apparent:

  • It inherits methods from the Object class; therefore, it extends Object.
  • It has a constructor without any parameters, the “default constructor”.

If we were forced to write those two things in code, our class would look like this:

public class Thing extends Object {
    
    public Thing() {
        super();
    }
}

That’s a lot of noise for essentially no signal/information. But I adopted one rule from it:

Rule 1: Every production class has at least one constructor explicitly written in code.

For me, this is the textual anchor to navigate my code. Because it is the only constructor (so far), every instantiation of the class needs to call it. If I use “Callers” in my IDE on it, I see all clients that use the class by name.

Every IDE has a workaround to see the callers of the constructor(s) without pointing at some piece of code. If you are familiar with such a feature, you might use it in favor of writing explicit constructors. But every IDE works out of the box with the explicit constructor, and that’s what I chose.

There are some exceptions to Rule 1:

  • Test classes aren’t instantiated directly, so they don’t benefit from a constructor. See also https://schneide.blog/2024/09/30/every-unit-test-is-a-stage-play-part-iii/ for a reasoning why my test classes don’t have explicit constructors.
  • Record classes are syntactic sugar that don’t benefit from an explicit constructor that replaces the generated one. In fact, record classes use much of their appeal once you write constructors for them.
  • Anonymous inner types are oftentimes used in one place exclusively. If I need to see all their clients by using the IDE, my code is in a very problematic state, and an explicit constructor won’t help.

One thing that Rule 1 doesn’t cover is the first line of each constructor:

Rule 2: The first line of each constructor contains either a super() or a this() call.

The no-parameters call to the constructor of the superclass is done regardless of my code, but I prefer to see it in code. This is a visual cue to check Rule 3 without much effort:

Rule 3: Each class has only one constructor calling super().

If you incorporate Rule 3 into your code, the instantiation process of your objects gets much cleaner and free from duplication. It means that if you only exhibit one constructor, it calls super() – with or without parameters. If you provide more than one constructor, they form a hierarchy: One constructor is the “main” or “core” constructor. It is the one that calls super(). All the other constructors are “secondary” or “intermediate” constructors. They use this() to call the main constructor or another secondary constructor that is an intermediate step towards the main constructor.

If you visualize this construct, it forms a funnel that directs all constructor calls into the main constructor. By listing its callers, you can see all clients of your class, even those that use secondary constructors. As soon as you have two super() calls in your class, you have two separate ways to construct objects from it. I came to find this possibility way more harmful than useful. There are usually better ways to solve the client’s problem with object instantiation than to introduce a major source of current or future duplication (and the divergent change code smell). If you are interested in some of them, leave a comment, and I will write a blog entry explaining some of them.

Back to the funnel:

if you don’t see the funnel yet, let me abstract the situation a bit more:

This is how it looks in source code:

public class Thing {
    
    private final String name;
    
    public Thing(int serialNumber) {
        this(
            "S/N " + serialNumber
        );
    }
    
    public Thing(String name) {
        super();
        this.name = name;
    }
}

I find this structure very helpful to navigate complex object construction code. But I also have a heuristic that the number of secondary constructors (by visually counting the this() calls) is proportional to the amount of head scratching and resistance to change that the class will induce.

As always, there are exceptions to the rule:

  • Some classes are just “more specific names” for the same concept. Custom exception types come to mind (see the code example below). It is ok to have several super() calls in these classes, as long as they are clearly free from additional complexity.
  • Enum types cannot have the super() call in the main constructor. I don’t write a comment as a placeholder; I trust that enum types are low-complexity classes with only a few private constructors and no shenanigans.

This is an example of a multi-super-call class:

public class BadRequest extends IOException {

    public BadRequest(String message, Throwable cause) {
        super(message, cause);
    }

    public BadRequest(String message) {
        super(message);
    }
}

It clearly does nothing more than represent a more specific IOException. There won’t be many reasons to change or even just look at this code.

I might implement a variation to my Rule 2 in the future, starting with Java 22: https://openjdk.org/jeps/447. I’m looking forward to incorporating the new possibilities into my habits!

As you’ve seen, my constructor code style tries to facilitate two things:

  • Navigation in the project code, with anchor points for IDE functionality.
  • Orientation in the class code with a standard structure for easier mental mapping.

It introduces boilerplate or cruft code, but only a low amount at specific places. This is the trade-off I’m willing to make.

What are your ideas about this? Leave us a comment!

Tell different stories within the same universe

You might know this from fantasy book series: the author creates a unique world, a whole universe of their own and sets a story or series of books within it. Then, a few years later, a new series is released. It is set in the same universe, but at a different time, with different characters, and tells a completely new story. Still, it builds on the foundation of that original world. The author does not reinvent everything from scratch. They use the same map, the same creatures, the same customs and rules established in the earlier books.

Examples of this are the Harry Potter series and Fantastic Beasts, or The Lord of the Rings and The Hobbit.

But what does this have to do with software development?
In one of my projects, I faced a very similar use case. I had to implement several services, each covering a different use case, but all sharing the same set of peripherals, adapters, and domain types.

So I needed an architecture that did not just allow for interchangeable periphery, as is usually the focus, but also supported interchangeable use cases. In other words, I needed a setup that allowed for multiple “books” to be written within the same “universe.”

Architecture

Let’s start with a simple example: user management.
I originally implemented it following Clean Architecture principles, where the structure resembles an onion, dependencies flow inward, from the outer layers to the core domain logic. This makes the outer layers (the “peel”) easily replaceable or extendable.

Our initial use case is a service that creates a user. The use case defines an interface that the user controller implements, meaning the dependency flows from the outer layer (the controller) toward the core. So far, so good.

However, I wanted to evolve the architecture to support multiple use cases. For that, the direct dependency from the UserController to the CreateUser use case had to be removed.

My solution was to introduce a new domain module, a shared foundation that contains all interfaces, data types, and common logic used by both use cases and adapters. I called this module the UseCaseService.

The result is a new architecture diagram:

There is no longer a direct connection between a specific use case and an adapter. Instead, both depend on the shared UseCaseService module. With this setup, I can easily create new use cases that reuse the existing ecosystem without duplicating code or logic.

For example, I could implement another service that retrieves all users whose birthday is today and sends them birthday greetings. (Whether this is GDPR-compliant is another discussion!) But thanks to this architecture, I now have the freedom to implement that use case cleanly and efficiently.

Conclusion

Architecture is a highly individual matter. There is no one-size-fits-all solution that solves every problem or suits every project. Models like Clean Architecture can be helpful guides, but ultimately, you need to define your own architectural requirements and find a solution that meets them. This was a short story of how one such solution came to life based on my own needs.

It is also a small reminder to keep the freedom to think outside the box. Do not be afraid to design an architecture that truly fits you and your project, even if it deviates from the standard models.

Forking an Open Source Repository in Good Faith

One might love Open Source for different reasons: Maybe as a philosophical concept of transcendental sharing and human progress, maybe for reasons of transparency and security, maybe for the sole reason of getting stuff for free…

But, as a developer, Open Source is additionally appealing for the sake of actively participating, learning and sharing on a directly personal level.

Now I would guess that most repository forks are probably done for rather practical reasons (“I wanna have that!”), the forks get some minor patches one happens to need right now – or for some super-specific use case – and then hang around for some time until that use case vanishes or the changes are so vast that there will never be a merge (a situation commonly known as “der Zug ist abgefahren”), one might sometimes try to supply one’s work for the good of more than oneself. That is what I hereby declare a “Fork in Good Faith.”

A fork can happen in good faith if some conditions are true, like:

  • I am sure that someone else can benefit from my work
  • My technical skills match the technical level of the repository in question
  • Said upstream repository is even open for contributions (i.e. not understaffed)
  • My broader vision does not diverge from the original maintainers’ vision

Maybe there are more of these, but the most essential point is a mindset:

  • I declare to myself that I want to stay compatible with the upstream as long as is possible from both sides.

To fork with Good Faith is, then, a great idea because it helps to advance much more causes at once than just the stuff for free, i.e. on a developmental level:

  1. You learn from the existing code, i.e. the language, coding style, design patterns, specific solutions, algorithms, hidden gems, …
  2. You learn from the existing repository, i.e. how commits and branches are organized, how commit messages are used productively, how to manage patches or changes in general, …
  3. In reverse, the original maintainers might learn from you, or at least future contributors might
  4. You might get more people to actually see / try / use your awesome feature, thus getting more feedback or bug reports than brewing your own soup
  5. You might consider it as a workout of professional confidence, to advocate your use cases or implementation decisions against other developers, training to focus on rational principles and unlearning the reflexes of your ego.
  6. This can also serve as a workout in mental fluidity, by changing between different coding styles or conventions – if you are e.g. used of your super-perfect-one-and-only way of doing things, it might just positively blow your mind to see that other conventions can work too, if done properly.
  7. Having someone to actually review your changes in a public pull request (merge request) gives you feedback also on an organisational level, as in “was all of this part actually important for your feature?”, “can you put that into a future pull request?” or “why did you rewrite all comments for some Paleo-Siberian language??”

Not to forget, you might grow your personal or professional network to some degree, or at least get the occasional thank you from anyone (well…).

But the basic point of this post is this:

Maintaining a Fork in Good Faith is active, continuous work.

And there is no shame in abandoning that claim, but if you do once, there might be no easy return.

Just think about the pure sadness of features that are sometimes replicated over-and-over again, or get lost over the time;

And just think about how confusing or annoying that already could have been for yourself, e.g. with some multiply-forked npm package or maybe full-fledged end-user projects (… how many forks of e.g. WLED do even exist?).

This is just some reflection of how careful such a decision should be done. Of course, I am writing this because I recently became aware of that point of bifurcation, i.e. not the point where a repository is forked, but the one where all of the advantages mentioned above are weighed against real downsides.

And these might be legitimate, and numerous, too. Just to name a few,

  1. Maybe the existing conventions are just not “done properly”, and following them for the sake of uniformity makes you unproductive over time?
  2. Maybe the original maintainers are just understaffed, non-responsive or do not adhere to a style of communication that works with you?
  3. Maybe most discussions are really just debates of varying opinion (publicly, over the internet – that usually works!) and not vehicles of transcending the personal boundaries of human knowledge after all?
  4. Maybe you are stuck with sub-par legacy code, unable to boy-scout away some technical debt because “that is not the point right now”, or maybe every other day some upstream commit flushes in more freshly baked legacy code?
  5. Maybe no one understands your use case and contrary to the idea mentioned above – in order to get appropriate feedback about your features, and to prove its worth, you need to distribute this independently?
  6. Maybe at one point the maintainers of an upstream repository change, and from now on you have to name your variables in some Paleo-Siberian language?

I guess you get the point by now. There is much energy to be saved by never considering upstream compatibility in the first place, but there is also much potential to be wasted. I have no clear answer – yet – how to draw the line, but maybe you have some insight on that topic, too.

Are there any examples of forks that live on their own, still with the occasional cherry-pick, rebase, merge? Not one comes to my mind.

Experimenting with CMake’s unity builds

CMake has an option, CMAKE_UNITY_BUILD, to automatically turn your builds into unity-builds, which is essentially combining multiple source files into one. This is supposed to make your builds more efficient. You can just enable enable it while executing the configuration step of your CMake builds, so it is really easy to test. It might just work without any problems. Here are some examples with actual numbers of what that does with build times.

Project A

Let us first start with a relatively small project. It is a real project we have been developing, that reads sensor data, transports it over the network and displays it using SDL and Dear ImGui. I’m compiling it with Visual Studio (v17.13.6) in CMake folder mode, using build insights to track the actual time used. For each configuration, I’m doing a clean rebuild 3 times. The steps are the number of build statements that ninja runs.

Unity Build#StepsTime 1Time 2Time 3
OFF4013.3s13.4s13.6s
ON2810.9s10.7s9.7s

That’s a nice, but not massive, speedup of 124,3% for the median times.

Project A*

Project A has a relatively high number of non-compile steps: 1 step is code generation, 6 steps are static library linking, and 7 steps are executable linking. That’s a total of 14 non-compile steps, which are not directly affected by switching to unity builds. 5 of the executables in Project A are non-essential, basically little test programs. So in an effort to decrease the relative number of non-compile steps, I disabled those for the next test. Each of those also came with an additional source file, so the total number of steps decreased by 10. This really only decreased the relative amount of non-compile steps from 35% to 30%, but the numbers changes quite a bit:

Unity Build#StepsTime 1Time 2Time 3
OFF309.9s10.0s9.7s
ON189.0s8.8s9.1s

Now the speedup for the median times was only 110%.

Project B

Project B is another real project, but much bigger than Project A, and much slower to compile. It’s a hardware orchestration system with a web interface. As the project size increases, the chance for something breaking when enabling unity builds also increases. In no particular order:

  • Include guards really have to be there, even if that particular header was not previously included multiple times
  • Object files will get a lot bigger, requiring /bigobj to be enabled
  • Globally scoped symbols will name-clash across files. This is especially true for static globals or things in unnamed namespaces, which basically don’t do their job anymore. More subtly, things moved into the global namespace will also clash, such as the classes with the same name moved into the global namespace via using namespace.

In general, that last point will require the most work to resolve. If all fails, you can disable unity build on a target via set_target_properties(the_target PROPERTIES UNITY_BUILD OFF) or even just skip specific files for unity build inclusion via SKIP_UNITY_BUILD_INCLUSION. In Project B, I only had to do this for files generated by CMakeRC. Here are the results:

Unity Build#StepsTime 1Time 2Time 3
OFF416279.4s279.3s284,0s
ON11873.2s76.6s74.5s

That’s a massive speedup of 375%, just for enabling a build-time switch.

When to use this

Once your project has a certain size, I’d say definitely use this on your CI pipeline, especially if you’re not doing incremental builds. It’s not just time, but also energy saved. And faster feedback cycles are always great. Enabling it on developer machines is another matter: it can be quite confusing when the files you’re editing do not correspond to what the build system is building. Also, developers usually do more incremental builds where the advantages are not as high. I’ve also used hybrid approaches where I enable unity builds only for code that doesn’t change that often, and I’m quite satisfied with that. Definitely add an option to turn that off for debugging though. Have you had similar experiences with unity builds? Do tell!

Deferred Constraints in Oracle DB

Foreign key constraints are like rules in your Oracle database that make sure data is linked properly between tables. For example, you can’t add an order for a customer who doesn’t exist – that’s the kind of thing a foreign key will stop. They help enforce data integrity by ensuring that relationships between tables remain consistent. But hidden in the toolbox of Oracle Database is a lesser-known trick: deferred foreign key constraints.

What Are Deferred Constraints?

By default, when you insert or update data that violates a foreign key constraint, Oracle will throw an error immediately. That’s immediate constraint checking.

But with deferred constraints, Oracle lets you temporarily violate a constraint during a transaction – as long as the constraint is satisfied by the time the transaction is committed.

Here’s how you make a foreign key deferrable:

ALTER TABLE orders
  ADD CONSTRAINT fk_orders_customer
  FOREIGN KEY (customer_id)
  REFERENCES customers(customer_id)
  DEFERRABLE INITIALLY DEFERRED;

That last part – DEFERRABLE INITIALLY DEFERRED – is the secret sauce. Now, the constraint check for fk_orders_customer is deferred until the COMMIT.

Use Cases

Let’s look at a few situations where this is really helpful.

One use case are circular references between tables. Say you have two tables: one for employees, one for departments. Each employee belongs to a department. But each department also has a manager – who is an employee. You end up in a “chicken and egg” situation. Which do you insert first? With deferred constraints, it doesn’t matter – you can insert them in any order, and Oracle will only check everything after you’re done.

Another use case is the bulk import of data. If you’re importing a bunch of data (like copying from another system), it can be really hard to insert things in the perfect order to keep all the foreign key rules happy. Deferred constraints let you just insert everything, then validate it all at the end with one COMMIT.

Deferred constraints also help when dealing with temporary incomplete data: Let’s say your application creates a draft invoice before all the customer info is ready. Normally, this would break a foreign key rule. But if the constraint is deferred, Oracle gives you time to finish adding all the pieces before checking.

Caution

Using deferred constraints recklessly can lead to runtime surprises. Imagine writing a huge batch job that appears to work fine… until it crashes at COMMIT with a constraint violation error – rolling back the entire transaction. So only defer constraints when you really need to.

One last tip

If you want to check if a constraint is deferrable in your database you can use the following SQL query:

SELECT constraint_name, deferrable, deferred
  FROM user_constraints
 WHERE table_name='ORDERS';