Year – Page 4 – Schneide Blog

Calculating the Number of Segments for Accurate Circle Rendering

A common way to draw circles with any kind of vector graphics API is by approximating it with a regular polygon, e.g. as a regular polygon with 32 sides. The problem with this approach is that it might look good in one resolution, but crude in another, as the approximation becomes more visible. So how do you pick the right number of sides $N$ for the job? For that, let’s look at the error that this approximation has.

A whole bunch of math

I define the ‘error’ of the approximation as the maximum difference between the ideal circle shape and the approximation. In other words, it’s the difference of the inner radius and the outer radius of the regular polygon. Conveniently, with a step angle $\alpha=\frac{2\pi}{N}$ the inner radius is just the outer radius $r$ multiplied by the cosine of half of that: $r\times\cos\frac{\alpha}{2}$ . So the error is $r-r\times\cos\frac{\pi}{N}$ . I find it convenient to use relative error $\epsilon$ for the following, and set $r=1$ :

$\epsilon=1-cos\frac{\pi}{N}$

The following plot shows that value for $N$ going from 4 to 256:

Plot showing the relative error numbers of subdivision

As you can see, this looks hyperbolic and the error falls off rather fast with an increasing number of subdivisions. This function lets use figure out the error for a given number of subdivisions, but what we really want is he inverse of that: Which number of subdivisions do we need for the error to be less than a given value. For example, assuming a 1080p screen, and a half-pixel error on a full-size ( $r=540$ ) circle, that means we should aim for a relative error of $0.1\%$ . So we can solve the error equation above for N. Since the number of subdivisions should be an integer, we round it up:

$N=\lceil\frac{\pi}{\arccos{(1-\epsilon)}}\rceil$

So for $0.1\%$ we need only 71 divisions. The following plot shows the number of subdivisions for error values from $0.01\%$ to $1\%$ :

Here are some specific values:

Assuming a fixed half-pixel error, we can plug in $\epsilon=\frac{0.5}{radius}$ to get:

$N=\lceil\frac{\pi}{\arccos{(1-\frac{0.5}{radius})}}\rceil$

The following graph shows that function for radii up to full-size QHD circles:

Give me code

Here’s the corresponding code in C++, if you just want to figure out the number of segments for a given radius:

std::size_t segments_for(float radius, float pixel_error = 0.5f)
{
  auto d = std::acos(1.f - pixel_error / radius);
  return static_cast<std::size_t>(std::ceil(std::numbers::pi / d));
}

What are GIN Indexes in PostgreSQL?

If you’ve worked with PostgreSQL and dealt with things like full-text search, arrays, or JSON data, you might have heard about GIN indexes. But what exactly are they, and why are they useful?

GIN stands for Generalized Inverted Index. Most indexes (like the default B-tree index) work best when there’s one clear value per row – like a number or a name. But sometimes, a single column can hold many values. Think of a column that stores a list of tags, words in a document, or key-value data in JSON. That’s where GIN comes in.

Let’s walk through a few examples to see how GIN indexes work and why they’re helpful.

Full-text search example

Suppose you have a table of articles:

CREATE TABLE articles (
  id    serial PRIMARY KEY,
  title text,
  body  text
);

You want to let users search the content of these articles. PostgreSQL has built-in support for full-text search, which works with a special data type called tsvector. To get started, you’d add a column to store this processed version of your article text:

ALTER TABLE articles ADD COLUMN tsv tsvector;
UPDATE articles SET tsv = to_tsvector('english', body);

Now, to speed up searches, you create a GIN index:

CREATE INDEX idx_articles_tsv ON articles USING GIN(tsv);

With that in place, you can search for articles quickly:

SELECT * FROM articles WHERE tsv @@ to_tsquery('tonic & water');

This finds all articles that contain both “tonic” and “water”, and thanks to the GIN index, it’s fast – even if you have thousands of articles.

Array example

GIN is also great for columns that store arrays. Let’s say you have a table of photos, and each photo can have several tags:

CREATE TABLE photos (
  id   serial PRIMARY KEY,
  tags text[]
);

You want to find all photos tagged with “capybara”. You can create a GIN index on the tags column:

CREATE INDEX idx_photos_tags ON photos USING GIN(tags);
SELECT * FROM photos WHERE tags @> ARRAY['capybara'];

(The @> operator means “contains” or “is a superset of”.)

The index lets PostgreSQL find matching rows quickly, without scanning the entire table.

JSONB example

PostgreSQL’s jsonb type lets you store flexible key-value data. Imagine a table of users with extra info stored in a jsonb column:

CREATE TABLE users (
  id   serial PRIMARY KEY,
  data jsonb
);

One row might store {"age": 42, "city": "Karlsruhe"}. To find all users from New York, you can use:

SELECT * FROM users WHERE data @> '{"city": "Karlsruhe"}';

And again, with a GIN index on the data column, this query becomes much faster:

CREATE INDEX idx_users_data ON users USING GIN(data);

Things to keep in mind

GIN indexes are very powerful, but they come with some tradeoffs. They’re slower to build and can make insert or update operations a bit heavier. So they’re best when you read (search) data often, but don’t write to the table constantly.

In short, GIN indexes are your friend when you’re dealing with columns that contain multiple values – like arrays, full-text data, or JSON. They let PostgreSQL break apart those values and build a fast lookup system. If your queries feel slow and you’re working with these kinds of columns, adding a GIN index might be exactly what you need.

Expose your app/API with zrok during development

Nowadays many of us are developing libraries, tools and applications somehow connected to the web. Often we provide APIs over HTTP(S) for frontends or other services or develop web apps using such services or backends.

As browsers become more and more picky HTTP is pretty much dead but for developers it is extremely convenient to avoid the hassle of certificates, keystores etc.

Luckily, there is a simple and free tool, that can help in several development scenarios: zrok.io

My most common ones are:

Allowing customers easy (temporary) access to your app in development
Developing SSO and other integrations that need publicly visible HTTPS endpoints
Collaborating with your distributed colleagues and allowing them to develop against your latest build on your machine

What is zrok?

For our use cases think of it as an simple, ad-hoc HTTPS-proxy transport-securing your services and exposing them publicly. For the other features and technical zero trust networking platform explanation head over to their site.

How to use zrok?

You only need a few steps to get zrok up and running. Even though their quick start explains the most important steps I will mention them here too:

Create an account to use the NetFoundry’s public zrok instance and obtain a token from there
Download and install the binary for your platform
Enable your local environment using your token with zrok enable <your_token>

After these steps your are ready to go and may share your local service running on http://localhost:8080 using zrok share public 8080.

Some practical advice and examples

If you want a stable URL for your service, use a reserved share instead of the default temporary one:

.\zrok.exe reserve public http://localhost:5000 --unique-name "mydevinstance"
.\zrok.exe share reserved mydevinstance

That way you get a stable endpoint over restarts which greatly reduces configuration burden in external services or communication with customers or colleagues. You can manage your shares on multiple machines online on https://api-v1.zrok.io:

Your service is then accessible under https://mydevinstance.share.zrok.io/ and you may monitor accesses in the terminal or on the webpage above.

That enables you to use your local service for development against other services, like OAuth or OpenID single-sign-on (SSO), here with ORCID:

Conclusion

Using zrok developers may continue to ignore HTTPS for their local development instances while still being able to expose them privately or publicly including transparent SSL support.

That way you can integrate easily with other services expecting secured public endpoint or collaborate with others transparently without VPNs, tunnels or other means.

How to improve this() by using super()

I have a particular programming style regarding constructors in Java that often sparks curiosity and discussion. In this blog post, I want to note my part in these discussions down.

Let’s start with the simplest example possible: A class without anything. Let’s call it a thing:

public class Thing {
}

There is not much you can do with this Thing. You can instantiate it and then call methods that are present for every Object in Java:

Thing mine = new Thing();
System.out.println(
    mine.hashCode()
);

This code tells us at least two things about the Thing class that aren’t immediately apparent:

It inherits methods from the Object class; therefore, it extends Object.
It has a constructor without any parameters, the “default constructor”.

If we were forced to write those two things in code, our class would look like this:

public class Thing extends Object {
    
    public Thing() {
        super();
    }
}

That’s a lot of noise for essentially no signal/information. But I adopted one rule from it:

Rule 1: Every production class has at least one constructor explicitly written in code.

For me, this is the textual anchor to navigate my code. Because it is the only constructor (so far), every instantiation of the class needs to call it. If I use “Callers” in my IDE on it, I see all clients that use the class by name.

Every IDE has a workaround to see the callers of the constructor(s) without pointing at some piece of code. If you are familiar with such a feature, you might use it in favor of writing explicit constructors. But every IDE works out of the box with the explicit constructor, and that’s what I chose.

There are some exceptions to Rule 1:

Test classes aren’t instantiated directly, so they don’t benefit from a constructor. See also https://schneide.blog/2024/09/30/every-unit-test-is-a-stage-play-part-iii/ for a reasoning why my test classes don’t have explicit constructors.
Record classes are syntactic sugar that don’t benefit from an explicit constructor that replaces the generated one. In fact, record classes use much of their appeal once you write constructors for them.
Anonymous inner types are oftentimes used in one place exclusively. If I need to see all their clients by using the IDE, my code is in a very problematic state, and an explicit constructor won’t help.

One thing that Rule 1 doesn’t cover is the first line of each constructor:

Rule 2: The first line of each constructor contains either a super() or a this() call.

The no-parameters call to the constructor of the superclass is done regardless of my code, but I prefer to see it in code. This is a visual cue to check Rule 3 without much effort:

Rule 3: Each class has only one constructor calling super().

If you incorporate Rule 3 into your code, the instantiation process of your objects gets much cleaner and free from duplication. It means that if you only exhibit one constructor, it calls super() – with or without parameters. If you provide more than one constructor, they form a hierarchy: One constructor is the “main” or “core” constructor. It is the one that calls super(). All the other constructors are “secondary” or “intermediate” constructors. They use this() to call the main constructor or another secondary constructor that is an intermediate step towards the main constructor.

If you visualize this construct, it forms a funnel that directs all constructor calls into the main constructor. By listing its callers, you can see all clients of your class, even those that use secondary constructors. As soon as you have two super() calls in your class, you have two separate ways to construct objects from it. I came to find this possibility way more harmful than useful. There are usually better ways to solve the client’s problem with object instantiation than to introduce a major source of current or future duplication (and the divergent change code smell). If you are interested in some of them, leave a comment, and I will write a blog entry explaining some of them.

Back to the funnel:

if you don’t see the funnel yet, let me abstract the situation a bit more:

This is how it looks in source code:

public class Thing {
    
    private final String name;
    
    public Thing(int serialNumber) {
        this(
            "S/N " + serialNumber
        );
    }
    
    public Thing(String name) {
        super();
        this.name = name;
    }
}

I find this structure very helpful to navigate complex object construction code. But I also have a heuristic that the number of secondary constructors (by visually counting the this() calls) is proportional to the amount of head scratching and resistance to change that the class will induce.

As always, there are exceptions to the rule:

Some classes are just “more specific names” for the same concept. Custom exception types come to mind (see the code example below). It is ok to have several super() calls in these classes, as long as they are clearly free from additional complexity.
Enum types cannot have the super() call in the main constructor. I don’t write a comment as a placeholder; I trust that enum types are low-complexity classes with only a few private constructors and no shenanigans.

This is an example of a multi-super-call class:

public class BadRequest extends IOException {

    public BadRequest(String message, Throwable cause) {
        super(message, cause);
    }

    public BadRequest(String message) {
        super(message);
    }
}

It clearly does nothing more than represent a more specific IOException. There won’t be many reasons to change or even just look at this code.

I might implement a variation to my Rule 2 in the future, starting with Java 22: https://openjdk.org/jeps/447. I’m looking forward to incorporating the new possibilities into my habits!

As you’ve seen, my constructor code style tries to facilitate two things:

Navigation in the project code, with anchor points for IDE functionality.
Orientation in the class code with a standard structure for easier mental mapping.

It introduces boilerplate or cruft code, but only a low amount at specific places. This is the trade-off I’m willing to make.

What are your ideas about this? Leave us a comment!

Tell different stories within the same universe

You might know this from fantasy book series: the author creates a unique world, a whole universe of their own and sets a story or series of books within it. Then, a few years later, a new series is released. It is set in the same universe, but at a different time, with different characters, and tells a completely new story. Still, it builds on the foundation of that original world. The author does not reinvent everything from scratch. They use the same map, the same creatures, the same customs and rules established in the earlier books.

Examples of this are the Harry Potter series and Fantastic Beasts, or The Lord of the Rings and The Hobbit.

But what does this have to do with software development?
In one of my projects, I faced a very similar use case. I had to implement several services, each covering a different use case, but all sharing the same set of peripherals, adapters, and domain types.

So I needed an architecture that did not just allow for interchangeable periphery, as is usually the focus, but also supported interchangeable use cases. In other words, I needed a setup that allowed for multiple “books” to be written within the same “universe.”

Architecture

Let’s start with a simple example: user management.
I originally implemented it following Clean Architecture principles, where the structure resembles an onion, dependencies flow inward, from the outer layers to the core domain logic. This makes the outer layers (the “peel”) easily replaceable or extendable.

Our initial use case is a service that creates a user. The use case defines an interface that the user controller implements, meaning the dependency flows from the outer layer (the controller) toward the core. So far, so good.

However, I wanted to evolve the architecture to support multiple use cases. For that, the direct dependency from the UserController to the CreateUser use case had to be removed.

My solution was to introduce a new domain module, a shared foundation that contains all interfaces, data types, and common logic used by both use cases and adapters. I called this module the UseCaseService.

The result is a new architecture diagram:

There is no longer a direct connection between a specific use case and an adapter. Instead, both depend on the shared UseCaseService module. With this setup, I can easily create new use cases that reuse the existing ecosystem without duplicating code or logic.

For example, I could implement another service that retrieves all users whose birthday is today and sends them birthday greetings. (Whether this is GDPR-compliant is another discussion!) But thanks to this architecture, I now have the freedom to implement that use case cleanly and efficiently.

Conclusion

Architecture is a highly individual matter. There is no one-size-fits-all solution that solves every problem or suits every project. Models like Clean Architecture can be helpful guides, but ultimately, you need to define your own architectural requirements and find a solution that meets them. This was a short story of how one such solution came to life based on my own needs.

It is also a small reminder to keep the freedom to think outside the box. Do not be afraid to design an architecture that truly fits you and your project, even if it deviates from the standard models.

Forking an Open Source Repository in Good Faith

One might love Open Source for different reasons: Maybe as a philosophical concept of transcendental sharing and human progress, maybe for reasons of transparency and security, maybe for the sole reason of getting stuff for free…

But, as a developer, Open Source is additionally appealing for the sake of actively participating, learning and sharing on a directly personal level.

Now I would guess that most repository forks are probably done for rather practical reasons (“I wanna have that!”), the forks get some minor patches one happens to need right now – or for some super-specific use case – and then hang around for some time until that use case vanishes or the changes are so vast that there will never be a merge (a situation commonly known as “der Zug ist abgefahren”), one might sometimes try to supply one’s work for the good of more than oneself. That is what I hereby declare a “Fork in Good Faith.”

A fork can happen in good faith if some conditions are true, like:

I am sure that someone else can benefit from my work
My technical skills match the technical level of the repository in question
Said upstream repository is even open for contributions (i.e. not understaffed)
My broader vision does not diverge from the original maintainers’ vision

Maybe there are more of these, but the most essential point is a mindset:

I declare to myself that I want to stay compatible with the upstream as long as is possible from both sides.

To fork with Good Faith is, then, a great idea because it helps to advance much more causes at once than just the stuff for free, i.e. on a developmental level:

You learn from the existing code, i.e. the language, coding style, design patterns, specific solutions, algorithms, hidden gems, …
You learn from the existing repository, i.e. how commits and branches are organized, how commit messages are used productively, how to manage patches or changes in general, …
In reverse, the original maintainers might learn from you, or at least future contributors might
You might get more people to actually see / try / use your awesome feature, thus getting more feedback or bug reports than brewing your own soup
You might consider it as a workout of professional confidence, to advocate your use cases or implementation decisions against other developers, training to focus on rational principles and unlearning the reflexes of your ego.
This can also serve as a workout in mental fluidity, by changing between different coding styles or conventions – if you are e.g. used of your super-perfect-one-and-only way of doing things, it might just positively blow your mind to see that other conventions can work too, if done properly.
Having someone to actually review your changes in a public pull request (merge request) gives you feedback also on an organisational level, as in “was all of this part actually important for your feature?”, “can you put that into a future pull request?” or “why did you rewrite all comments for some Paleo-Siberian language??”

Not to forget, you might grow your personal or professional network to some degree, or at least get the occasional thank you from anyone (well…).

But the basic point of this post is this:

Maintaining a Fork in Good Faith is active, continuous work.

And there is no shame in abandoning that claim, but if you do once, there might be no easy return.

Just think about the pure sadness of features that are sometimes replicated over-and-over again, or get lost over the time;

And just think about how confusing or annoying that already could have been for yourself, e.g. with some multiply-forked npm package or maybe full-fledged end-user projects (… how many forks of e.g. WLED do even exist?).

This is just some reflection of how careful such a decision should be done. Of course, I am writing this because I recently became aware of that point of bifurcation, i.e. not the point where a repository is forked, but the one where all of the advantages mentioned above are weighed against real downsides.

And these might be legitimate, and numerous, too. Just to name a few,

Maybe the existing conventions are just not “done properly”, and following them for the sake of uniformity makes you unproductive over time?
Maybe the original maintainers are just understaffed, non-responsive or do not adhere to a style of communication that works with you?
Maybe most discussions are really just debates of varying opinion (publicly, over the internet – that usually works!) and not vehicles of transcending the personal boundaries of human knowledge after all?
Maybe you are stuck with sub-par legacy code, unable to boy-scout away some technical debt because “that is not the point right now”, or maybe every other day some upstream commit flushes in more freshly baked legacy code?
Maybe no one understands your use case and contrary to the idea mentioned above – in order to get appropriate feedback about your features, and to prove its worth, you need to distribute this independently?
Maybe at one point the maintainers of an upstream repository change, and from now on you have to name your variables in some Paleo-Siberian language?

I guess you get the point by now. There is much energy to be saved by never considering upstream compatibility in the first place, but there is also much potential to be wasted. I have no clear answer – yet – how to draw the line, but maybe you have some insight on that topic, too.

Are there any examples of forks that live on their own, still with the occasional cherry-pick, rebase, merge? Not one comes to my mind.

Experimenting with CMake’s unity builds

CMake has an option, CMAKE_UNITY_BUILD, to automatically turn your builds into unity-builds, which is essentially combining multiple source files into one. This is supposed to make your builds more efficient. You can just enable enable it while executing the configuration step of your CMake builds, so it is really easy to test. It might just work without any problems. Here are some examples with actual numbers of what that does with build times.

Project A

Let us first start with a relatively small project. It is a real project we have been developing, that reads sensor data, transports it over the network and displays it using SDL and Dear ImGui. I’m compiling it with Visual Studio (v17.13.6) in CMake folder mode, using build insights to track the actual time used. For each configuration, I’m doing a clean rebuild 3 times. The steps are the number of build statements that ninja runs.

Unity Build	#Steps	Time 1	Time 2	Time 3
OFF	40	13.3s	13.4s	13.6s
ON	28	10.9s	10.7s	9.7s

That’s a nice, but not massive, speedup of 124,3% for the median times.

Project A*

Project A has a relatively high number of non-compile steps: 1 step is code generation, 6 steps are static library linking, and 7 steps are executable linking. That’s a total of 14 non-compile steps, which are not directly affected by switching to unity builds. 5 of the executables in Project A are non-essential, basically little test programs. So in an effort to decrease the relative number of non-compile steps, I disabled those for the next test. Each of those also came with an additional source file, so the total number of steps decreased by 10. This really only decreased the relative amount of non-compile steps from 35% to 30%, but the numbers changes quite a bit:

Unity Build	#Steps	Time 1	Time 2	Time 3
OFF	30	9.9s	10.0s	9.7s
ON	18	9.0s	8.8s	9.1s

Now the speedup for the median times was only 110%.

Project B

Project B is another real project, but much bigger than Project A, and much slower to compile. It’s a hardware orchestration system with a web interface. As the project size increases, the chance for something breaking when enabling unity builds also increases. In no particular order:

Include guards really have to be there, even if that particular header was not previously included multiple times
Object files will get a lot bigger, requiring /bigobj to be enabled
Globally scoped symbols will name-clash across files. This is especially true for static globals or things in unnamed namespaces, which basically don’t do their job anymore. More subtly, things moved into the global namespace will also clash, such as the classes with the same name moved into the global namespace via using namespace.

In general, that last point will require the most work to resolve. If all fails, you can disable unity build on a target via set_target_properties(the_target PROPERTIES UNITY_BUILD OFF) or even just skip specific files for unity build inclusion via SKIP_UNITY_BUILD_INCLUSION. In Project B, I only had to do this for files generated by CMakeRC. Here are the results:

Unity Build	#Steps	Time 1	Time 2	Time 3
OFF	416	279.4s	279.3s	284,0s
ON	118	73.2s	76.6s	74.5s

That’s a massive speedup of 375%, just for enabling a build-time switch.

When to use this

Once your project has a certain size, I’d say definitely use this on your CI pipeline, especially if you’re not doing incremental builds. It’s not just time, but also energy saved. And faster feedback cycles are always great. Enabling it on developer machines is another matter: it can be quite confusing when the files you’re editing do not correspond to what the build system is building. Also, developers usually do more incremental builds where the advantages are not as high. I’ve also used hybrid approaches where I enable unity builds only for code that doesn’t change that often, and I’m quite satisfied with that. Definitely add an option to turn that off for debugging though. Have you had similar experiences with unity builds? Do tell!

Deferred Constraints in Oracle DB

Foreign key constraints are like rules in your Oracle database that make sure data is linked properly between tables. For example, you can’t add an order for a customer who doesn’t exist – that’s the kind of thing a foreign key will stop. They help enforce data integrity by ensuring that relationships between tables remain consistent. But hidden in the toolbox of Oracle Database is a lesser-known trick: deferred foreign key constraints.

What Are Deferred Constraints?

By default, when you insert or update data that violates a foreign key constraint, Oracle will throw an error immediately. That’s immediate constraint checking.

But with deferred constraints, Oracle lets you temporarily violate a constraint during a transaction – as long as the constraint is satisfied by the time the transaction is committed.

Here’s how you make a foreign key deferrable:

ALTER TABLE orders
  ADD CONSTRAINT fk_orders_customer
  FOREIGN KEY (customer_id)
  REFERENCES customers(customer_id)
  DEFERRABLE INITIALLY DEFERRED;

That last part – DEFERRABLE INITIALLY DEFERRED – is the secret sauce. Now, the constraint check for fk_orders_customer is deferred until the COMMIT.

Use Cases

Let’s look at a few situations where this is really helpful.

One use case are circular references between tables. Say you have two tables: one for employees, one for departments. Each employee belongs to a department. But each department also has a manager – who is an employee. You end up in a “chicken and egg” situation. Which do you insert first? With deferred constraints, it doesn’t matter – you can insert them in any order, and Oracle will only check everything after you’re done.

Another use case is the bulk import of data. If you’re importing a bunch of data (like copying from another system), it can be really hard to insert things in the perfect order to keep all the foreign key rules happy. Deferred constraints let you just insert everything, then validate it all at the end with one COMMIT.

Deferred constraints also help when dealing with temporary incomplete data: Let’s say your application creates a draft invoice before all the customer info is ready. Normally, this would break a foreign key rule. But if the constraint is deferred, Oracle gives you time to finish adding all the pieces before checking.

Caution

Using deferred constraints recklessly can lead to runtime surprises. Imagine writing a huge batch job that appears to work fine… until it crashes at COMMIT with a constraint violation error – rolling back the entire transaction. So only defer constraints when you really need to.

One last tip

If you want to check if a constraint is deferrable in your database you can use the following SQL query:

SELECT constraint_name, deferrable, deferred
  FROM user_constraints
 WHERE table_name='ORDERS';

AI is super-good at guessing, but also totally clueless

AI is still somewhere in its hype phase, maybe towards the end of it. Most of us have used generative AI or use it more or less regularily.

I am experimenting with it every now and then and sometimes even use the output professionally. On the other hand I am not hyped at all. I have mostly mixed feelings (and the bad feelings are not because I fear losing my job…). Let me share my thoughts:

The situation a few years/months ago

Generative AI (or more specifically) chatgpt impressed many people but failed at really simple tasks like

Simple trick questions like “Tom’s father has three children. The first one is called Mark and the second Andrea. What is the name of the third child?”
Counting certain letters in a word

Another problem were massive hallucinations like shown in Kevlin Henney Kotlin Conf talk and completely made up summaries of a childrens book (german) I got myself.

Our current state of generative AI

Granted, nowadays many of these problems are mitigated and the AI became more useful. On the other hand new problems are found quite frequently and then worked around by the engineers. Here are some examples:

I asked chatgpt to list all german cities above 200k citizens. It put out nice tables scraped from wikipedia and other sources, but with a catch: they were all incomplete and the count was clearly wrong. Even after multiple iterations I did not get a correct result. A quick look at the wikipedia page chatgpt used as a source showed a complete picture.
If you ask about socially explosive topics like islamic terrorism, crime and controversial people like Elon Musk and Donald Trump you may get varying, questionable responses.

More often then not I feel disappointed when using generative AI. I have zero-trust in the results and end up checking and judging the output myself. In questions about code and APIs I usually have enough knowledge to judge the output or at least take it as a base for further development.

In the really good “IntelliJ Wizardry with AI Assistant Live” by Heinz Kabutz online course we also explored the possibilities, limits and integration of Jetbrains AI assistant. While it may be useful in some situations and has a great integration into the IDE we were not impressed by its power.

Generating test cases or refactoring existing code varies from good to harmful. Sometimes it can find bugs for you, sometimes it cannot…

Another personal critical view on AI

After all the ups and downs and the development and progress with AI and many thoughts and reflections about it I have discovered something that really bothers me about AI:

AI takes away most of the transparency, determinism and control that we as software developers are used to and sometimes worked hard for.

As developers we strive for understanding what is really happening. Non-determinism is one of our enemies. Obfuscated code, unclear definitions, undefined behaviour – these and many other things make me/us feel uncomfortable.

And somehow, for me AI feels the same:

I change a prompt slightly, and sometimes the result does not change at all while at other times it changes almost completely. Sometimes the results are very helpful, on another occasion they are total crap. In the past maybe they were useless, now the same prompts put out useful information or code.

This is where my bad feelings about AI come from. Someone, at some company trains the AI, engineers rules, defines the training data etc. Everything has an enourmous impact on the behaviour of the AI and the results. Everything stays outside of your influence and control. One day it works as intended, on another not anymore.

Conclusion

I do not know enough about generative AI and the mathematics, science and engineering behind it to accurately judge or predict the possibilities and boundaries for the year to come.

Maybe we will find ways to regain the transparency, to “debug” our models and prompts, to be able to reason about the output and to make generative AI reliable and predictable.

Maybe generative AI will collapse under the piles of crap it uses for training because we do not have powerful enough means of telling it how to separate trustful/truthful information from the rest.

Maybe we will use it as assistants in many areas like coding or evaluating X-ray images to sort out the unobtrusive ones.

What I really doubt at this point is that AI will replace professionals, regardless of the field. It may make them more productive or enable us to build bigger/better/smarter systems.

Right now, generative AI sometimes proves useful but often is absolutely clueless.

Java enum inheritance preferences are weird

Java enums were weird from their introduction in Java 5 in the year 2004. They are implemented by forcing the compiler to generate several methods based on the declaration of fields/constants in the enum class. For example, the static Enum::valueOf(String) method is only present after compilation.

But with the introduction of default methods in Java 8 (published 2014), things got a little bit weirder if you combine interfaces, default methods and enums.

Let’s look at an example:

public interface Person {

  String name();
}

Nothing exciting to see here, just a Person type that can be asked about its name. Let’s add a default implementation that makes clearly no sense at all:

public interface Person {

  default String name() {
    return UUID.randomUUID().toString();
  }
}

If you implement this interface in a class and don’t overwrite the name() method, you are the weird one:

public class ExternalEmployee implements Person {

  public ExternalEmployee() {
    super();
  }
}

We can make your weirdness visible by creating an ExternalEmployee and calling its name() method:

public class Main {

  public static void main(String[] args) {
    ExternalEmployee external = new ExternalEmployee();
    System.out.println(external.name());
  }
}

This main method prints the “name” of your external employee on the console:

1460edf7-04c7-4f59-84dc-7f9b29371419

Are you sure that you hired a human and not some robot?

But what if we are a small startup company with just a few regular employees that can be expressed by a java enum?

public enum Staff implements Person {

  michael,
  bob,
  chris,
  ;
}

You can probably predict what this little main method prints on the console:

public class Main {

  public static void main(String[] args) {
    System.out.println(
      Staff.michael.name()
    );		
  }
}

But, to our surprise, the name() method got overwritten, without us doing or declaring to do so:

michael

We ended up with the “default” generated name() method from the Java enum type. In this case, the code generated by the compiler takes precedence over the default implementation in the interface, which isn’t what we would expect at first glance.

To our grief, we can’t change this behaviour back to a state that we want by overwriting the name() method once more in our Staff class (maybe we want our employees to be named by long numbers!), because the generated name() method is declared final. From the source code of the enum class:

/**
 * @return the name of this enum constant
 */
public final String name() {
  return name;
}

The only way out of this situation is to avoid the names of methods that are generated in an enum type. For the more obscure ordinal(), this might be feasible, but name() is prone for name conflicts (heh!).

While I can change my example to getName() or something, other situations are more delicate, like this Kotlin issue documents: https://youtrack.jetbrains.com/issue/KT-14115/Enum-cant-implement-an-interface-with-method-name

And I’m really a fan of Java’s enum functionality, it has the power to be really useful in a lot of circumstances. But with great weirdness comes great confusion sometimes.

$\epsilon$	$N$
0.01%	223
0.1%	71
0.2%	50
0.4%	36
0.6%	29
0.8%	25
1.0%	23

	Calculation with inf… on The Great Rational Explos…
	Miq on Common SQL Performance Gotchas…
	Common SQL Performan… on Full-text Search with Pos…
	Common SQL Performan… on Understanding, identifying and…
	How to Eat Last – Sc… on The work experience improvemen…