C++ header-only libraries are bad

A fairly recent trend in the C++ community is the popularity of header-only, single-file libraries. Prominent examples are Catch2, JSON for Modern C++ and spdlog. These are all great, modern and popular libraries, and I personally enjoy using all of them.

But back to the provocative title. It may be a bit of an over-generalization, and it is meant to be a little ambiguous. Mathieu Ropert already pointed out that header-only libraries are but a symptom of the whole C++ modules and packaging misery. The aforementioned libraries are all great pieces of software, but it is bad that:

  • they are exclusively header-only
  • header-only is seen as a sign of quality these days

Historically, header-only libraries have been a thing in C++ because of templates. Templates are not functions or variables that can be referenced by the linker. No, as the name so fittingly suggests, they are just templates for those, with the potential to become, or better, be instantiated into, something that actually survives the trip to the executable code. Header-only libraries used to be code that could only be materialized in the context of other code.

But the focus has shifted to portability. I guess by coincidence, people discovered that header-only libraries are also relatively easy to import into your project.

It is actually about inlining

Splitting code between headers and implementation files is a trade-off, one that is often synonymous with marking functions inline or not. Inlining is just one more fine-tuning tool that C++ programmers have at their disposal to make the resulting application behave as they want. Carefully considering whether to inline helps to manage compile times, transitive dependencies and code bloat.

Even for template-heavy libraries, not all of the code has to be inlined. It is often beneficial for compilation time, code size and run time to use techniques such as thin templates to make sure some of the code is properly insulated.
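As a minimal sketch of the thin-template idea, assume a small container for trivially copyable types. The names RawBuffer and PodList are hypothetical and only serve as illustration. The type-independent work sits in an ordinary class whose member definitions could live in an implementation file, while the template is reduced to a thin, type-safe wrapper:

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Non-template core. It only knows about element sizes and raw bytes,
// so it does not depend on the element type at all.
class RawBuffer {
public:
    explicit RawBuffer(std::size_t elementSize);
    void pushBytes(const void* element);
    const void* at(std::size_t index) const;
    std::size_t size() const;

private:
    std::size_t elementSize_;
    std::vector<unsigned char> bytes_;
};

// In a real library the definitions below would go into a .cpp file
// (assumption: e.g. raw_buffer.cpp) and be compiled exactly once.
RawBuffer::RawBuffer(std::size_t elementSize) : elementSize_(elementSize) {}

void RawBuffer::pushBytes(const void* element) {
    const auto* first = static_cast<const unsigned char*>(element);
    bytes_.insert(bytes_.end(), first, first + elementSize_);
}

const void* RawBuffer::at(std::size_t index) const {
    return bytes_.data() + index * elementSize_;
}

std::size_t RawBuffer::size() const { return bytes_.size() / elementSize_; }

// Thin template: per-type code is limited to a few forwarding one-liners.
template <typename T>
class PodList {
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only supports trivially copyable types");

public:
    PodList() : raw_(sizeof(T)) {}

    void push(const T& value) { raw_.pushBytes(&value); }

    T get(std::size_t index) const {
        T value;
        std::memcpy(&value, raw_.at(index), sizeof(T));
        return value;
    }

    std::size_t size() const { return raw_.size(); }

private:
    RawBuffer raw_;
};
```

Each additional instantiation such as PodList<double> only adds the thin wrapper. The growth and indexing logic exists once, and users of the header would not need to see its implementation dependencies.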

Another way?

Promoting “header-only” as the new buzzword for portability has the side-effect of implying which code is not marked as inline: None.

That simply ignores that dimension of the code. It is equivalent to not making a choice about insulation and inlining at all.

Sure, header-only is marginally better for dropping into your code, but adding a portable implementation file should be just as easy. Why not deliver portable libraries as a single implementation file and a single header instead? Those could easily be generated by a preprocessing step. For example, Catch2’s single header is generated anyway, so it should not be much harder to split that output into two files. Of course, the implementation file has to work within your compilation environment. But the same restriction applies to the single-header file, so there is really no additional difficulty. And it is really easy to go from the two-file version to the single-file version: just mark everything in the implementation file as inline and include it in the header.
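To make that last step concrete, here is a minimal sketch. The library name mylib, the macros MYLIB_HEADER_ONLY and MYLIB_INLINE, and both functions are hypothetical. The two files are shown back to back in one listing so that the example compiles on its own; in practice the header would end with an #include of the implementation file when the single-file flavour is requested.

```cpp
#include <string>

// Assumption: the consumer opts into the single-file flavour by defining
// this macro before including the header. Without it, the same sources
// form a classic header plus implementation file pair.
#define MYLIB_HEADER_ONLY

// ---- mylib.h (hypothetical) ----
#ifdef MYLIB_HEADER_ONLY
#define MYLIB_INLINE inline  // definitions may appear in many translation units
#else
#define MYLIB_INLINE         // definitions are compiled once, in mylib.cpp
#endif

namespace mylib {
MYLIB_INLINE std::string greet(const std::string& name);
MYLIB_INLINE int answer();
}  // namespace mylib

// In header-only mode, the real header would now pull in the
// implementation with: #include "mylib.cpp"

// ---- mylib.cpp (hypothetical) ----
namespace mylib {
MYLIB_INLINE std::string greet(const std::string& name) {
    return "hello, " + name;
}

MYLIB_INLINE int answer() { return 42; }
}  // namespace mylib
```

With the macro undefined, only the declarations are visible to users and the definitions are compiled a single time. With it defined, everything is marked inline and the library behaves like today's header-only releases.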

16 thoughts on “C++ header-only libraries are bad”

  1. Unfortunately you are completely mistaken. Inline only sidesteps ODR violations and has nothing to do in C++ with actual compiler inlining, which is often much more aggressive, leading to better optimization. If your toolchain has LTO (link-time optimization), you are even less prone to code duplication. A last effect of templates is that only the code that is actually used gets instantiated.

    1. I know what the inline keyword does. That’s why I often wrote “marked as inline”. Not sure what your criticism is really. Can you elaborate why I’m mistaken?

      1. This really boils down to “header-only libraries are bad because inline”. I’m confused, what point you are trying to make, really. Maybe you do understand the effects of the inline keyword in C++, but you sure managed to misrepresent what it does. It has no impact on program behavior, as you suggest. It strictly acts as a linker directive, inducing well-defined behavior for certain types of ODR violations (as Peter Sommerlad explained in a comment above).
        It’s wildly unclear, why you even care about an implementation detail, that has no impact on client code, at all. It doesn’t change program behavior, and doesn’t adversely affect a compiler’s ability to optimize for certain metrics (like code size). The compiler simply has more information at its disposal to evaluate applicability of certain optimization strategies.
        If you are worried about increased compile (and particularly link) times, then that might be an issue, although tools like Visual Studio support incremental linking, keeping link times down to acceptable rates. Likewise, precompiled headers are supported by all mainstream compilers. While this generates considerably sized databases, it is very likely just a transient issue (if at all). The C++ Modules TS will cope with that, once it becomes generally available.
        So with all that out of the way, could you elaborate on what precisely makes header-only libraries bad? Maybe you do have a point, but even after reading this blog post twice I wasn’t able to see it.

      2. .f – it seems you are not really clear on what consequences inlining has physically (it seems you’re clear on the semantics, good). Let me give you a little example: By out-of-lining a couple of destructors, I shrunk the object file size of a code base from 6 GB down to 2.3 GB. Because even when the compiler does not actually inline the code, i.e. when it does not remove the function call, it still has to duplicate the code into ALL the translation units that use it. Only the linker then has the information to deduplicate that. Sure, it does not have a big effect on the final executable, but it sure slows down the compiler and especially the linker to go through roughly 3x the data. This, in turn, slows down everyone working on the code. Even for an organization of medium size, that will cost a lot of time and money.
        If you fear that out-of-line code will slow down your execution times, you can still use LTO and/or PGO.

  2. The conclusion that all code has to be marked inline is wrong (C89 doesn’t even have a keyword for inlining, yet header-only libs are possible). A “proper” header-only lib consists of a declaration part which is always visible when the header is included, and a separate implementation part which is activated with a define in only one source file (see here for instance: https://github.com/nothings/stb/blob/master/stb_image.h#L4).

    1. Oh those are not header-only but “single-file” libs – it’s even what their github says! While I’d rather have those libs as separate implementation and header files – this is basically that, but with a define to switch between the two. C++ header-only libs, on the other hand, just throw everything at the linker-stage by declaring everything as inline. stb does not do that.

  3. No, header only libraries aren’t bad. They aren’t good either, it’s just that they are the only really portable way to write reusable code in C++.

    1. Nah, I think that is exactly the misconception that has to go. E.g. pugixml is super portable even though it has a cpp file.

    1. Yes, it’s easier to just throw everything at the compiler/linker, but it’s not as hard as people think. Using thin-template requires some planning, yes – but applying type-erasure is usually pretty easy. Like using std::function instead of a template parameter for a callback.

      1. Sure, yea – std::function has some cost. But it could be less than what you replace by it. Specifically, it helps to eliminate the combinatorial explosion the compiler has to deal with, since the real callable type only matters at the point of instantiation of std::function, but not where it’s used.
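        A minimal sketch of that trade, with hypothetical function names sumIfTemplate and sumIf: the template version is instantiated once per distinct callable type and has to live in a header, while the std::function version is an ordinary function whose definition could move into an implementation file.

```cpp
#include <functional>
#include <vector>

// Template version: every distinct lambda or functor type produces a new
// instantiation, and the definition must be visible to all callers.
template <typename Callback>
int sumIfTemplate(const std::vector<int>& values, Callback keep) {
    int sum = 0;
    for (int value : values) {
        if (keep(value)) sum += value;
    }
    return sum;
}

// Type-erased version: this declaration is all a header needs to expose.
int sumIf(const std::vector<int>& values,
          const std::function<bool(int)>& keep);

// The definition could live in a .cpp file. The concrete callable type is
// only known where the std::function object is constructed, not here.
int sumIf(const std::vector<int>& values,
          const std::function<bool(int)>& keep) {
    int sum = 0;
    for (int value : values) {
        if (keep(value)) sum += value;
    }
    return sum;
}
```

        The price is an indirect call per invocation and possibly an allocation when the std::function is constructed, which is the cost mentioned above.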

  4. What about precompiled headers? Just throw one header there and voilà, high portability, fast compilation. AFAIK modules for C++ are based on this.

    1. Precompiled headers hardly improve anything. The code in the header is still not properly insulated, e.g. it still drags in all the implementation dependencies and not just the interface dependencies. Also, they are not composable, you can only really have one precompiled header per translation unit.
      Modules, on the other hand, seem like they could solve a lot of the issues I have with single-header libs – they prevent all kinds of leakage: dependencies do not leak, duplicated inlined code does not leak. I really hope they get ready for production sooner rather than later.
