Author: Marius Elvert – Page 2

Experimenting with CMake’s unity builds

CMake has an option, CMAKE_UNITY_BUILD, to automatically turn your builds into unity-builds, which is essentially combining multiple source files into one. This is supposed to make your builds more efficient. You can just enable enable it while executing the configuration step of your CMake builds, so it is really easy to test. It might just work without any problems. Here are some examples with actual numbers of what that does with build times.

Project A

Let us first start with a relatively small project. It is a real project we have been developing, that reads sensor data, transports it over the network and displays it using SDL and Dear ImGui. I’m compiling it with Visual Studio (v17.13.6) in CMake folder mode, using build insights to track the actual time used. For each configuration, I’m doing a clean rebuild 3 times. The steps are the number of build statements that ninja runs.

Unity Build	#Steps	Time 1	Time 2	Time 3
OFF	40	13.3s	13.4s	13.6s
ON	28	10.9s	10.7s	9.7s

That’s a nice, but not massive, speedup of 124,3% for the median times.

Project A*

Project A has a relatively high number of non-compile steps: 1 step is code generation, 6 steps are static library linking, and 7 steps are executable linking. That’s a total of 14 non-compile steps, which are not directly affected by switching to unity builds. 5 of the executables in Project A are non-essential, basically little test programs. So in an effort to decrease the relative number of non-compile steps, I disabled those for the next test. Each of those also came with an additional source file, so the total number of steps decreased by 10. This really only decreased the relative amount of non-compile steps from 35% to 30%, but the numbers changes quite a bit:

Unity Build	#Steps	Time 1	Time 2	Time 3
OFF	30	9.9s	10.0s	9.7s
ON	18	9.0s	8.8s	9.1s

Now the speedup for the median times was only 110%.

Project B

Project B is another real project, but much bigger than Project A, and much slower to compile. It’s a hardware orchestration system with a web interface. As the project size increases, the chance for something breaking when enabling unity builds also increases. In no particular order:

Include guards really have to be there, even if that particular header was not previously included multiple times
Object files will get a lot bigger, requiring /bigobj to be enabled
Globally scoped symbols will name-clash across files. This is especially true for static globals or things in unnamed namespaces, which basically don’t do their job anymore. More subtly, things moved into the global namespace will also clash, such as the classes with the same name moved into the global namespace via using namespace.

In general, that last point will require the most work to resolve. If all fails, you can disable unity build on a target via set_target_properties(the_target PROPERTIES UNITY_BUILD OFF) or even just skip specific files for unity build inclusion via SKIP_UNITY_BUILD_INCLUSION. In Project B, I only had to do this for files generated by CMakeRC. Here are the results:

Unity Build	#Steps	Time 1	Time 2	Time 3
OFF	416	279.4s	279.3s	284,0s
ON	118	73.2s	76.6s	74.5s

That’s a massive speedup of 375%, just for enabling a build-time switch.

When to use this

Once your project has a certain size, I’d say definitely use this on your CI pipeline, especially if you’re not doing incremental builds. It’s not just time, but also energy saved. And faster feedback cycles are always great. Enabling it on developer machines is another matter: it can be quite confusing when the files you’re editing do not correspond to what the build system is building. Also, developers usually do more incremental builds where the advantages are not as high. I’ve also used hybrid approaches where I enable unity builds only for code that doesn’t change that often, and I’m quite satisfied with that. Definitely add an option to turn that off for debugging though. Have you had similar experiences with unity builds? Do tell!

You are mislead about the Big-O notation

One statement I have people say and people repeat a lot, especially in the data-oriented design bubble, is that Big-O notation cannot accurately real-life performance of contemporary computer programs, especially in the presence of multi-tier memory hierarchies like L1/L2/L3-caches for RAM. This is, at best, misleading and gives this fantastic tool a bad reputation.

At it’s core, Big-O is just a way to categorize functions in how they scale. There’s nothing in the formal definition about performance at all. Of course, it is often used to categorize performance of algorithms and implementations of them. But to use it for that, you need two other things: A machine model and a metric for it.

Traditionally, when performance categorization using Big-O is taught, the machine model is either the Turing-machine or the slightly closer-to-reality RAM-machine. The metric is a number of operations. The operation that is counted has a huge impact. For example, insertion sort can easily be implemented in O(n*log(n)) when counting the number of comparisons (by using binary search to find the insertion point), but is in O(n²) when counting the number of memory moves/swaps.

Neither the model nor the metric is intrinsic to Big-O. To use in in the context of memory hierarchies, you just need to start counting what matters to you, e.g. memory accesses, cache misses or branch mispredictions. This is not new either, I learned about cache-aware and cache-oblivious machine models for this in university over 15 years ago.

TL;DR: Big-O is not obsolete, you just have to use it to count the appropriate performance-critical element in your algorithm.

Beware of using Git LFS on Github

In my private game programming projects, I am often using data files alongside my code for all kinds of game assets like images and sounds. So I thought it might be a good idea to use the Git Large File Storage (=LFS) extension for that.

What is Git LFS?

Essentially, if you’re not using it, the file will be in your local .git folder if it was part of your repository at any time in your history. E.g. if you accidentally added&committed a 800mb video files and then deleted it again, they will still be in your local .git folder. This problem multiplies when using a CI with many branches: each branch will typically have a copy of all files ever used in your repository. This is not a problem with source code files, because they are not that big and they can be compressed really well with different versions of themselves, which is what git typically does.

With Git LFS, the big files are only stored as references in the .git folder. This means that you might need an additional request to your remote when checking them out again, but it will save you lots space and traffic when cloning repositories.

In my previous projects on github, I just did not enable LFS for my assets. And that worked fine, as my assets are usually pretty small and I don’t change them often. But this time I wanted to try it.

Sorry, Github, what?

Imagine my suprise when I got an e-mail from github last month warning me that my LFS traffic quota is almost reached and I have to pay to extend it. What? I never had and traffic quota problems without LFS. Github doesn’t even seem to have one, if I just keep my big files in ‘pure’ git. So that’s what I get for trying to safe Github traffic.

Now the LFS quota is a meager 1 gb per month with Github Pro. That’s nothing. Luckily, my current project is not asset heavy: the full repo is very small at ~60mb. But still the quota was reached with me as a single developer. How did that happen? I just enabled CI for my project on my home server and I was creating lots of branches my CI wanted to build. That’s only 12 branches cloned for the 80% warning to be reached.

Workarounds

Jenkins, which I’m using as a CI tool, has the ability to use a ‘reference repository’ when cloning. This can be used to get the bulk of the data from a local remote, while getting the rest from Github. This is what I’m now using to avoid excess LFS traffic. It is a bit of a pain to set up: you have to manually maintain this reference repository, Jenkins will not do it for you, and you have to do that on each agent. I only have one at this point, so that’s an okay trade-off. But next time, Isure won’t use Git LFS on Github, if I can avoid it.

Heterogeneous lookup in unordered C++ containers

I often use std::string_view via the sv suffix for string constants in my code. If I need to associate something with those constants at runtime, I put it in an std::unordered_map with the constants as the keys.

Just a few days ago, I was using and std::unordered_map<std::string, ...> and wanted to .find(...) something in it with such a string constant. But that didn’t compile. From long ago, I remember that the type must be identical, and since there is no implicit conversion from std::string_view to std::string, I made that explicit to get it to compile. But wait. Didn’t C++ add support for using a different type than the key_type for the lookup? Indeed it did, in P0919R3 and P1690R1 from last decade. All major compilers seem to support it too. Then why wasn’t this working? It turns out that it’s not enabled by default, you need to explicitly enable it by supplying a special hasher. Here’s how I do it:

struct stringly_hash
{
  using is_transparent = void;
  [[nodiscard]] size_t operator()(char const* rhs) const
  {
    return std::hash<std::string_view>{}(rhs);
  }
  [[nodiscard]] size_t operator()(std::string_view rhs) const
  {
    return std::hash<std::string_view>{}(rhs);
  }
  [[nodiscard]] size_t operator()(std::string const& rhs) const
  {
    return std::hash<std::string>{}(rhs);
  }
};

template <typename ValueType>
using unordered_string_map = std::unordered_map<
  std::string,
  ValueType,
  stringly_hash,
  std::equal_to<>
>;

This is almost the same code as the sample given in the first of the two proposals. The using is_transparent = void; is how the feature is enabled and was changed in the second proposal.

I have changed my stance on “using” in C++ headers

I used to be pretty strictly against using either C++ using-directives or -declarations from within header files. It kind of stuck with me as a no-go. But that has changed in recent years.

There are now good cases where using can go into a header. For example, I do not really like putting things like…

using namespace std::string_literals;
using namespace std::string_view_literals;
using namespace std::chrono_literals;

…at the beginning of each source file. Did you know that you can pull all those (and some more) in with a single using namespace std::literals? Either way, in my newer projects, these usually go into one of the more prominent headers. Same goes for other literal operators such as those from the SI library. And so do using declarations for common vocabulary types. E.g. 2D or 3D vector types , in math heavy projects. Of course, they always go after the specific #include(s) the using is referencing. The benefits of doing that usually outweigh the danger of name-clashes and weird order dependencies.

There are cases where I still avoid using in headers however, and that is when the given header is ‘public’, i.e. being consumed by something that is not under my organization’s control. In that case, you better leave that decision to the library consumer.

Efficient integer powers of floating-point numbers in C++

Given a floating-point number x, it is quite easy to square it: x = x * x;, or x *= x;. Similarly, to find its cube, you can use x = x * x * x;.

However, when raising it to the 4’th power, things get more interesting: There’s the naive way: x = x * x * x * x;. And the slightly obscure way x *= x; x *= x; which saves a multiplication.

When raining to the 8’th power, the naive way really loses its appeal: x = x * x * x * x * x * x * x * x; versus x *= x; x *= x; x *= x;, that’s 7 multiplications version just 3. This process can easily be extended for raising a number to any power-of-two N, and will only use O(log(n)) multiplications.

The algorithm can also easily be extended to work with any integer power. This works by decomposing the number into product of power-of-twos. Luckily, that’s exactly what the binary representation so readily available on any computer is. For example, let us try x to the power of 20. That’s 16+4, i.e. 10100 in binary.

x *= x; // x is the original x^2 after this
x *= x; // x is the original x^4 after this
result = x;
x *= x; // x is the original x^8 after this
x *= x; // x is the original x^16 after this
result *= x;

Now let us throw this into some C++ code, with the power being a constant. That way, the optimizer can take out all the loops and generate just the optimal sequence of multiplications when the power is known at compile time.

template <unsigned int y> float nth_power(float x)
{
  auto p = y;
  auto result = ((p & 1) != 0) ? x : 1.f;
  while(p > 0)
  {
    x *= x;
    p = p >> 1;
    if ((p & 1) != 0)
      result *= x;
  }

  return result;
}

Interestingly, the big compilers do a very different job optimizing this. GCC optimizes out the loops with -O2 exactly up to nth_power<15>, but continues to do so with -O3 on higher powers. clang reliably takes out the loops even with just -O2. MSVC doesn’t seem to eliminate the loops at all, nor does it remove the multiplication with 1.f if the lowest bit is not set. Let me know if you find an implementation that MSVC can optimize! All tested on the compiler explorer godbolt.org.

My conan 2 Consumer Workflow

A great many things changed in the transition from conan 1.x to conan 2.x. For me, as an application-developer first, the main thing was how I consume packages. The two IDEs I use the most in C++ are Visual Studio and CLion, so I needed a good workflow with those. For Visual Studio, I am using its CMake integration, otherwise known as “folder mode”, which lets you directly open a project with a CMakeLists.txt file in it, instead of generating a solution and opening that. The deciding factor for me is that that uses Ninja as a build tool instead of MSBuild, which often is a lot faster. I have had projects with 3.5x build-time speed ups. As an added bonus, CLion supports very much the same workflow, which reduces friction when switching between platforms.

Visual Studio

First, we’re going to need some local profiles. I typically treat them as ‘build configurations’, with one profile for debug and release on each platform. I put them under version control with the project. A good starting point to create them is conan profile detect, which guesses your environment. To create a profile to a file, go to your project folder and use something like:

conan profile detect --name ./windows_release

Note the ./ in the name, which will instruct conan to create a profile in the current working directory instead of in your user settings. For me, this generates the following profile:

[settings]
arch=x86_64
build_type=Release
compiler=msvc
compiler.cppstd=14
compiler.runtime=dynamic
compiler.version=194
os=Windows

Conan will warn you, that this is only a guess and you should make sure that the values work for you. I usually bump up the compiler.cppstd to at least 20, but the really important change is to change the CMake generator to Ninja, after which the profile should look something like this:

[settings]
arch=x86_64
build_type=Release
compiler=msvc
compiler.cppstd=20
compiler.runtime=dynamic
compiler.version=194
os=Windows

[conf]
tools.cmake.cmaketoolchain:generator=Ninja

Copy and edit the build_type to create a corresponding profile for debug builds.

While conanfile.txt still works for specifying your dependencies, I now recommend directly using conanfile.py from the get go, as some options like overriding dependencies are now exclusive to it. Here’s an example installing the popular logging library spdlog:

from conan import ConanFile
from conan.tools.cmake import cmake_layout


class ProjectRecipe(ConanFile):
    settings = "os", "compiler", "build_type", "arch"
    generators = "CMakeToolchain", "CMakeDeps"

    def requirements(self):
        self.requires("spdlog/1.14.1")

    def layout(self):
        cmake_layout(self)

Note that I am using cmake_layout to setup the folder structure, which will make conan put the files it generates in build/Release for the windows_release profile we created.

Now it is time to install the dependencies using conan install. Make sure you have a clean project before this, e.g. there are no other build/config folders like build/, out/ and .vs/. Specifically, do not open the project in Visual Studio before doing that, as it will create another build setup. You already need the CMakeLists.txt at this point, but it can be empty. For completeness, here’s one that works with the conanfile.py from above:

cmake_minimum_required(VERSION 3.28)
project(ConanExample)

find_package(spdlog CONFIG REQUIRED)

add_executable(conan_example
  main.cpp
)

target_link_libraries(conan_example
  spdlog::spdlog
)

Run this in your project folder:

conan install . -pr:a ./windows_release

This will install the dependencies and even tell you what to put in your CMakeLists.txt to use them. More importantly for the Visual Studio integration, it will create a CMakeUserPresets.json file that will allow Visual Studio to find the prepared build folder once you open the project. If there is no CMakeLists.txt when you call conan install, this file will not be created! Note that you generally do not want this file under version control.

Now that this is setup, you can finally open the project in Visual Studio. You should see a configuration named “conan-release” already available and CMake should run without errors. After this point, you can let conan add new configurations and Visual Studio should automatically pick them up through the CMake user presets.

CLion

The process is essentially the same for CLion, except that the profile will probably look different, depending on the platform. Switching the generator to Ninja is not as essential, but I still like to do it for the speed advantages.

Again, make sure you let conan setup the initial build folders and CMakeUserPresets.json and not the IDE. CLion will then pick them up and work with them like Visual Studio does.

Additional thoughts

I like to create additional script files that I use to setup/update the dependencies. For example, in windows, I create a conan_install.bat file like this:

@echo Installing debug dependencies
conan install . -pr:a conan/windows_debug --build=missing %*
@if %errorlevel% neq 0 exit /b %errorlevel%

@echo Installing release dependencies
conan install . -pr:a conan/windows_release --build=missing %*
@if %errorlevel% neq 0 exit /b %errorlevel%

Have you used other workflows successfully in these or different environments? Let me know about them!

Simple marching squares in C++

Marching squares is an algorithm to find the contour of a scalar field. For example, that can be a height-map and the resulting contour would be lines of a specific height known as ‘isolines’.

At the core of the algorithm is a lookup table that says which line segments to generate for a specific ’tile’ configuration. To make sense of that, you start with a convention on how your tile configuration and the resulting lines are encoded. I typically add a small piece of ASCII-art to explain that:

// c3-e3-c2
// |      |
// e0    e2
// |      |
// c0-e1-c1
//
// c are corner bits, e the edge indices

The input of our lookup table is a bitmask of which of the corners c are ‘in’ or above our isolevel. The output is which tile edges e to connect with line segments. That is either 0, 1 or 2 line segments, so we need to encode that many pairs. You could easily pack that into a 32-bit, but I am using a std::vector<std::uint8_t> for simplicity. Here’s the whole thing:

using config = std::vector<std::uint8_t>;
using config_lookup = std::array<config, 16>;
const config_lookup LOOKUP{
  config{},
  { 0, 1 },
  { 1, 2 },
  { 0, 2 },
  { 2, 3 },
  { 0, 1, 2, 3 },
  { 1, 3 },
  { 0, 3 },
  { 3, 0 },
  { 3, 1 },
  { 1, 2, 3, 0 },
  { 3, 2 },
  { 2, 0 },
  { 2, 1 },
  { 1, 0 },
  config{},
};

I usually want to generate index meshes, so I can easily connect edges later without comparing the floating-point coordinates. So one design goal here was to generate each point only once. Here is the top-level algorithm:

using point_id = std::tuple<int, int, bool>;

std::vector<v2<float>> points;
// Maps construction parameters to existing entries in points
std::unordered_map<point_id, std::uint16_t, key_hash> point_cache;
// Index pairs for the constructed edges
std::vector<std::uint16_t> edges;

auto [ex, ey] = map.size();
auto hx = ex-1;
auto hy = ey-1;

// Construct inner edges
for (int cy = 0; cy < hy; ++cy)
for (int cx = 0; cx < hx; ++cx)
{
  std::uint32_t key = 0;
  if (map(cx, cy) > threshold)
    key |= 1;
  if (map(cx + 1, cy) > threshold)
    key |= 2;
  if (map(cx + 1, cy + 1) > threshold)
    key |= 4;
  if (map(cx, cy + 1) > threshold)
    key |= 8;

  auto const& geometry = LOOKUP[key];

  for (auto each : geometry)
  {
    auto normalized_id = normalize_point(cx, cy, each);
    auto found = point_cache.find(normalized_id);
    if (found != point_cache.end())
    {
      edges.push_back(found->second);
    }
    else
    {
      auto index = static_cast<std::uint16_t>(points.size());
      points.push_back(build_point(map, threshold, normalized_id));
      edges.push_back(index);
      point_cache.insert({ normalized_id, index });
    }
  }
}

For each tile, we first figure out the lookup input-key by testing the 4 corners. We then get-or-create the global point for each edge point from the lookup.
Since each edge in a tile can be accessed from two sides, we first normalize it to have a unique key for our cache:

point_id normalize_point(int cx, int cy, std::uint8_t edge)
{
  switch (edge)
  {
  case 3:
    return { cx, cy + 1, false };
  case 2:
    return { cx + 1, cy, true };
  default:
    return { cx, cy, edge == 0 };
  };
}

When we need to create a point an edge, we interpolate to estimate where exactly the isoline intersects our tile-edge:

v2<float> build_point(raster_adaptor const& map, float threshold, point_id const& p)
{
  auto [x0, y0, vertical] = p;
  int x1 = x0, y1 = y0;
  if (vertical)
    y1++;
  else
    x1++;

  const auto s = map.scale();
  float h0 = map(x0, y0);
  float h1 = map(x1, y1);
  float lambda = (threshold - h0) / (h1 - h0);

  auto result = v2{ x0 * s, y0 * s };
  auto shift = lambda * s;
  if (vertical)
    result[1] += shift;
  else
    result[0] += shift;
  return result;
}

For a height-map, that’s about as good as you can get.

You can, however, sample other scalar field functions with this as well, for example sums of distances. This is not the most sophisticated implementation of marching squares, but it is reasonably simple and can easily be adapted to your needs.

Four-way Navigation in UIs

Just yesterday, I was working on the task of enabling gamepad navigation of a graphical UI. I had implemented this before in my game abstractanks but since forgotten how exactly I did it. So I opened the old code and tried to decipher it, and I figured that’d make a nice topic to write about.

Basic implementation

Let’s break down the simple version of the problem: You have a bunch of rectangular controls, and given a specific one, figure out the next one with an input of either left, up, right or down.

This sketch shows a control setup with a possible solution. It also contains an interesting situation: going ‘down’ from B box goes to C, but going up from there goes to A!

The key to creating this solution is a metric that weights the gap for a specific input direction, e.g. neighbor_metric( box<> const& from, box<> const& to, navigation_direction direction). To implement this, we need to convert this gap into numbers we can use. I’ve used a variant of Arvo’s algorithm for that: For both axes, get the difference of the rectangles’ intervals along that axis and store those in a 2d-vector. In code:

template <int axis> inline float difference_on_axis(
  box<> const& from, box<> const& to)
{
  if (to.min[axis] > from.max[axis])
    return to.min[axis] - from.max[axis];
  else if (to.max[axis] < from.min[axis])
    return to.max[axis] - from.min[axis];
  return 0.f;
}

v2<> arvo_vector(box<> const& from, box<> const& to)
{
  return { 
    difference_on_axis<0>(from, to),
    difference_on_axis<1>(from, to) };
}

That sketch shows the resulting vectors from the box in the top-left going to two other boxes. Note that these vectors are quite different from the difference of the boxes’ centers. In the case of the two top boxes, the vector connecting the centers would tilt down slightly, while this one is completely parallel to the x axis.

Now armed with this vector, let’s look at the metric I was using. It results in a 2d ‘score’ that is later compared lexicographically to determine the best candidate: the first number determines the ‘angle’ with the selected axis, the other one the distance.

template <int axis> auto metric_on_axis(box<> const& from, box<> const& to)
{
  auto delta = arvo_vector(from, to);
  delta[0] *= delta[0];
  delta[1] *= delta[1];
  auto square_distance = delta[0] + delta[1];

  float cosine_squared = delta[axis] / square_distance;
  return std::make_pair(-cosine_squared, delta[axis]);
}

std::optional<std::pair<float, float>> neighbor_metric(
  box<> const& from, box<> const& to, navigation_direction direction)
{
  switch (direction)
  {
  default:
  case navigation_direction::right:
  {
    if (from.max[0] >= to.max[0])
      return {};
    return metric_on_axis<0>(from, to);
  }
  case navigation_direction::left:
  {
    if (from.min[0] <= to.min[0])
      return {};
    return metric_on_axis<0>(from, to);
  }
  case navigation_direction::up:
  {
    if (from.max[1] >= to.max[1])
      return {};
    return metric_on_axis<1>(from, to);
  }
  case navigation_direction::down:
  {
    if (from.min[1] <= to.min[1])
      return {};
    return metric_on_axis<1>(from, to);
  }
  }
}

In practice this means that the algorithm will favor connections that best align with the input direction, while ties resolved by using the closest candidate. The metric ‘disqualifies’ candidates going backward, e.g. when going right, the next box cannot start left of the from box.

Now we just need to loop through all candidates and the select the one with the lowest metric.

This algorithm does not make any guarantees that all controls will be accessible, but that is a property that can easily be tested by traversing the graph induced by this metric, and the UI can be designed appropriately. It also does not try to be symmetric, e.g. going down then up does not always result in going back to the previous control. As we can see in the first sketch, this is not always desirable. I think it’s nice to be able to go from B to C via ‘down’, but I’d be weird to go ‘up’ back there instead of A. Instead, going ‘right’ to B does make sense.

Hard cases

But there can be ambiguities that this algorithm does not quite solve. Consider the case were C is wider, so that is is also under B:

The algorithm will connect both A and B down to C, but the metric will be tied for A and B going up from C. The metric could be extended to also include the ‘cross’ axis min-point of the box, e.g. favoring left over right for westerners like me. But going from B down to C and then up to A would feel weird. One idea to resolve this is to use the history to break ties, e.g. when coming from B to C, going back up would go back to C.

Another hard case is scroll-views. In fact, they seem to change the problem domain. Instead of treating the inputs as boxes in a flat plane, navigating in a scroll view requires to navigate to potentially only partially visible or even invisible boxes and bringing them into view. I’ve previously solved this by treating every scroll-view as its own separate plane and navigating only within that if possible. Only when no target is found within the scroll-view, did the algorithm try to navigate to items outside.

My Favorite Pattern

It has become somewhat of an internal meme that I do not like it when programmers use the word “wrapper”. When someone does say it, I usually get a cue from one of the others to start complaining about it. Do not get me wrong, though. I am very much in favor of wrapping things, but with purpose. And my favorite one is the façade.

When simple becomes complex

Many times, APIs start out simple and elegant. This usually works for a while and the API gets used a lot precisely because of its beauty and simplicity. But eventually, a new use case comes along that demands more of the API than it can currently serve. It has to be extended. This usually takes the form of an additional method or function parameter, or an additional function that needs to be called. Using the API now becomes more complex all its users.

Do not underestimate this effect. I have only anecdotal evidence, but in my experience, a lot of unnecessary software complexity can be attributed to this¹. The Pareto-Principle applies here: A single use case causes all the users of the previously simple API to deal with new complexity (e.g. 10% of the use cases cause 90% of the complexity in the user-/call-sites).

Façades make it look beautiful

Luckily, it can be dealt with beautifully: using the façade pattern. This pattern abstracts a complex API behind a simple API. The trade-off, of course, is that it is less powerful than the “full API”. In our example though, all of the previous use-cases can keep using the simple API via a façade.

When to apply this

The aforementioned example, extending an API, is a very nice opportunity to apply the façade. Just keep the interface of the old API around, and re-implement it using the new, extended API, which is usually created by modifying the old API’s implementation. Now all the old call-sites can stay the same, yet you can have a more powerful API for those rare cases that need it.

Of course, you can also identify common usage patterns and refactor them using a façade, but that’s usually much harder to do.

What exactly are façades made of?

Façades do not hide the more complex API in the sense that the APIs users are not allowed to use it. Yes, façades make APIs look beautiful, but that is where the metaphor ends. You can still access what is behind the façade. You can even write more façades for the behind. Many APIs have multiple common cases and only very few complex ones.

So… Classes? Functions? Data? Any of those, in fact. Whenever you enable writing something in a simpler way for a common case, you have a façade . Very often, a small function with a simple signature is all the façade you need.

But it makes all the difference.

Now can someone please tell me what that little hook under the c is called?

Façades can, of course, also contribute to creating complexity by growing the codebase and creating ‘variants’. But they rarely do. ↩︎

	Great software engin… on Digitalization is hard (especi…
	Impressions of Our C… on Computing gets fuzzy again (AI…
	Computing gets fuzzy… on Impressions of Our Current AI…
	mariuselvert on Calculating the Number of Segm…
	Anonymous on Calculating the Number of Segm…