C++ Coroutines on Windows with the Fiber API

Last week, I had the chance to try out coroutines as a way to cooperatively interleave long tasks with event-processing. Unlike threads, where interaction between threads can happen at any time, coroutines have to yield control explicitly, which arguably makes synchronisation a little simpler, especially in (legacy) systems that are not designed for concurrency. Of course, since coroutines do not run at the same time, you do not get the perks of concurrency either.

If you don’t know coroutines, think of them as functions that can be paused and resumed.

Unlike many other languages, C++ does not have built-in support for coroutines just yet. There are, however, several alternatives. On Windows, you can use the Fiber API to implement coroutines easily.

Here’s some example code of how that works:

auto coroutine = std::make_shared<FiberCoroutine>();
coroutine->setup([](Coroutine::Yield yield)
{
  for (int i=0; i<3; ++i)
  {
    std::cout << "Coroutine "
              << i << std::endl;
    yield();
  }
});

int stepCount = 0;
while (coroutine->step())
{
  std::cout << "Main "
            << stepCount++ << std::endl;
}

Somewhat surprisingly, at least if you have never seen coroutines before, this will interleave the two outputs:

Coroutine 0
Main 0
Coroutine 1
Main 1
Coroutine 2
Main 2

Interface

Since fibers are not the only way to implement coroutines and since we want to keep our client code nicely insulated from the Windows API, there’s a pure-virtual base class as an interface:

class Coroutine
{
public:
  using Yield = std::function<void()>;
  using Run = std::function<void(Yield)>;

  virtual ~Coroutine() = default;
  virtual void setup(Run f) = 0;
  virtual bool step() = 0;
};

Typically, creation of a Coroutine type allocates all the resources it needs, while setup “primes” it with an inner “Run” function that can use an implementation-specific “Yield” function to pass control back to the caller, which is whoever calls step.

Implementation

The implementation using fibers is fairly straightforward:

class FiberCoroutine
  : public Coroutine
{
public:
  FiberCoroutine()
  : mCurrent(nullptr), mRunning(false)
  {
  }

  ~FiberCoroutine()
  {
    if (mCurrent)
      DeleteFiber(mCurrent);
  }

  void setup(Run f) override
  {
    if (!mMain)
    {
      mMain = ConvertThreadToFiber(NULL);
    }
    mRunning = true;
    mFunction = std::move(f);

    if (!mCurrent)
    {
      mCurrent = CreateFiber(0,
        &FiberCoroutine::proc, this);
    }
  }

  bool step() override
  {
    SwitchToFiber(mCurrent);
    return mRunning;
  }

  void yield()
  {
    SwitchToFiber(mMain);
  }

private:
  void run()
  {
    while (true)
    {
      mFunction([this]
                { yield(); });
      mRunning = false;
      yield();
    }
  }

  static VOID WINAPI proc(LPVOID data)
  {
    reinterpret_cast<FiberCoroutine*>(data)->run();
  }

  static LPVOID mMain;
  LPVOID mCurrent;
  bool mRunning;
  Run mFunction;
};

LPVOID FiberCoroutine::mMain = nullptr;

The idea here is that the caller and the callee are both fibers: lightweight threads without concurrency. Running the coroutine switches to the callee’s fiber (the one running the run function), and yielding switches back to the caller. Note that it is currently assumed that all callers are on the same thread, since each thread that participates in the switching needs to be converted to a fiber initially, even the caller. The current version only keeps the fiber for the initial thread in a single static variable. However, it should be possible to support multiple threads by replacing that single static fiber pointer with a map from each thread to its associated fiber.
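
A minimal sketch of that idea, assuming the registry guards itself with a mutex; the function name mainFiberForCurrentThread is made up for this illustration, while GetCurrentThreadId and ConvertThreadToFiber are the real Win32 calls:

#include <windows.h>
#include <mutex>
#include <unordered_map>

// Hedged sketch: one "main" fiber per calling thread instead of the single static mMain.
static LPVOID mainFiberForCurrentThread()
{
  static std::mutex mutex;
  static std::unordered_map<DWORD, LPVOID> mainFibers;

  std::lock_guard<std::mutex> lock(mutex);
  auto& fiber = mainFibers[GetCurrentThreadId()];
  if (!fiber)
    fiber = ConvertThreadToFiber(NULL); // convert each calling thread exactly once
  return fiber;
}

step would then record the result of mainFiberForCurrentThread() before switching, and yield would switch back to that recorded fiber instead of the static mMain.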

Note that you cannot return from the fiber procedure – that would just terminate the whole thread! Instead, yield back to the caller and either re-use or destroy the fiber.

Assessment

Fiber-based coroutines are a nice and efficient way to model non-linear control-flow explicitly, but they do not come without downsides. For example, while this example worked flawlessly when compiled with Visual Studio, Cygwin just terminates without even an error. If you’re used to working with the Visual Studio debugger, it may surprise you that the caller gets hidden completely while you’re in the run function. The run function’s stack completely replaces the caller’s stack until you call yield(). This means that you cannot find out who called step(). On the other hand, if you’re actually doing a lot of processing in the run function, this is quite nice for profiling, as the “processing” call tree seemingly gets its own root in the call-tree.

I just wish the Visual Studio debugger had a way to view the states of the different fibers like it has for threads.

Alternatives

  • On Linux, you can use the ucontext API.
  • Visual Studio 2015 also has another, newer, implementation.
  • Coroutines can be implemented using threads and condition-variables (see the sketch after this list).
  • There’s also Boost.Coroutine, if you need an independent implementation of the concept. From what I gather, they only use Fibers optionally, and otherwise do the required “trickery” themselves. Maybe this even keeps the caller-stack visible – it is certainly worth exploring.
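
As a hedged illustration of the threads-and-condition-variables approach, here is a minimal sketch that implements the same Coroutine interface portably. The class name ThreadCoroutine and the turn-based hand-over protocol are assumptions of this sketch, not code from any of the libraries mentioned above:

#include <condition_variable>
#include <mutex>
#include <thread>

class ThreadCoroutine : public Coroutine
{
public:
  ~ThreadCoroutine()
  {
    // Caveat of this sketch: joining only works if the coroutine has run to
    // completion; a real implementation needs a cancellation hand-shake here.
    if (mThread.joinable())
      mThread.join();
  }

  void setup(Run f) override
  {
    mThread = std::thread([this, f]
    {
      waitForTurn(Turn::Callee);
      f([this] { handOver(Turn::Caller); }); // yield: give control back to step()
      mRunning = false;
      giveTurn(Turn::Caller);
    });
  }

  bool step() override
  {
    handOver(Turn::Callee); // run the coroutine until it yields or finishes
    return mRunning;
  }

private:
  enum class Turn { Caller, Callee };

  void giveTurn(Turn next)
  {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mTurn = next;
    }
    mSignal.notify_one();
  }

  void waitForTurn(Turn mine)
  {
    std::unique_lock<std::mutex> lock(mMutex);
    mSignal.wait(lock, [this, mine] { return mTurn == mine; });
  }

  void handOver(Turn next)
  {
    Turn mine = (next == Turn::Callee) ? Turn::Caller : Turn::Callee;
    giveTurn(next);
    waitForTurn(mine);
  }

  std::thread mThread;
  std::mutex mMutex;
  std::condition_variable mSignal;
  Turn mTurn = Turn::Caller;
  bool mRunning = true;
};

Only one of the two threads runs at any given time, so the cooperative semantics are the same as with fibers; you just pay for a real thread and two context switches per step.
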
    Generating a spherified cube in C++

    In my last post, I showed how to generate an icosphere, a subdivided icosahedron, without any fancy data-structures like the half-edge data-structure. Someone in the reddit discussion on my post mentioned that a spherified cube is also nice, especially since it naturally lends itself to a relatively nice UV-map.

    The old algorithm

    The exact same algorithm from my last post can easily be adapted to generate a spherified cube, just by starting on different data.

    [image: the initial cube mesh]
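
    The post does not list the starting data for the cube. Here is a minimal sketch of what such an initial mesh could look like, reusing the Triangle/VertexList types from the icosphere post; the corner coordinates and the counter-clockwise (seen from outside) winding are assumptions of this sketch:

    namespace cube
    {
    // Corners of a cube, already pushed onto the unit sphere (1/sqrt(3) per component).
    const float N=0.577350269f;
    
    static const VertexList vertices=
    {
      {-N,-N,-N}, {N,-N,-N}, {N,N,-N}, {-N,N,-N},
      {-N,-N,N}, {N,-N,N}, {N,N,N}, {-N,N,N}
    };
    
    // Two triangles per face, wound counter-clockwise as seen from outside the cube.
    static const TriangleList triangles=
    {
      {0,2,1},{0,3,2}, // -Z
      {4,5,6},{4,6,7}, // +Z
      {0,1,5},{0,5,4}, // -Y
      {3,7,6},{3,6,2}, // +Y
      {0,4,7},{0,7,3}, // -X
      {1,2,6},{1,6,5}  // +X
    };
    }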

    After 3 steps of subdivision with the old algorithm, that cube will be transformed into this:

    [image: the cube after three subdivision steps with the old 4-way split]

    Slightly adapted

    If you look closely, you will see that the triangles in this mesh are a bit uneven. The vertical lines on the yellow side seem to curve around a bit. This is because, unlike in the icosahedron, the triangles in the initial box mesh are far from equilateral. The four-way split does not work very well with this.

    One way to improve the situation is to use an adaptive two-way split instead:
    [image: the adaptive 2-way split]

    Instead of splitting all three edges, we’ll only split one. The adaptive part here is that the edge we’ll split is always the longest that appears in the triangle, therefore avoiding very long edges.

    Here’s the code for that. The only tricky part is the modulo-counting to get the indices right. The vertex_for_edge function does the same thing as last time: providing a vertex for subdivision while keeping the mesh connected in its index structure.

    TriangleList
    subdivide_2(ColorVertexList& vertices,
      TriangleList triangles)
    {
      Lookup lookup;
      TriangleList result;
    
      for (auto&& each:triangles)
      {
        auto edge=longest_edge(vertices, each);
        Index mid=vertex_for_edge(lookup, vertices,
          each.vertex[edge], each.vertex[(edge+1)%3]);
    
        result.push_back({each.vertex[edge],
          mid, each.vertex[(edge+2)%3]});
    
        result.push_back({each.vertex[(edge+2)%3],
          mid, each.vertex[(edge+1)%3]});
      }
    
      return result;
    }
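
    The longest_edge helper is not shown in the post. Here is a minimal sketch of what it could look like, assuming the entries of ColorVertexList behave like the v3 from the icosphere post (they can be subtracted) and that a length_squared helper exists; both assumptions are mine, not the post’s:

    int longest_edge(ColorVertexList& vertices, Triangle const& triangle)
    {
      int longest=0;
      float longest_length=0.f;
      for (int edge=0; edge<3; ++edge)
      {
        auto& from=vertices[triangle.vertex[edge]];
        auto& to=vertices[triangle.vertex[(edge+1)%3]];
        float length=length_squared(to-from); // assumed helper: squared edge length
        if (length>longest_length)
        {
          longest_length=length;
          longest=edge;
        }
      }
      return longest;
    }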
    

    Now the result looks a lot more even:
    [image: the spherified cube generated with the 2-way split]

    Note that this algorithm only doubles the triangle count per iteration, so you might want to execute it twice as often as the four-way split.

    Alternatives

    Instead of using this generic triangle-based subdivision, it is also possible to generate the six sides as subdivided patches, as suggested in this article. That approach works naturally if you want to have seams between your six sides. However, it is more specialized towards this particular geometry and will require extra “stitching” if you do not want seams.

    Code

    The code for both the icosphere and the spherified cube is now on github: github.com/softwareschneiderei/meshing-samples.

    Generating an Icosphere in C++

    If you want to render a sphere in 3D, for example in OpenGL or DirectX, it is often a good idea to use a subdivided icosahedron. That usually works better than the “UVSphere”, which simply tessellates the sphere by longitude and latitude: the triangles of an icosphere are distributed a lot more evenly over the final sphere. Unfortunately, the easiest way to generate such a sphere, it seems, is to do it in a 3D editing program. But loading that into your application requires a 3D file format parser. That’s a lot of overhead if you really just need the sphere, so doing it programmatically is preferable.

    At this point, many people will just settle for the UVSphere since it is easy to generate programmatically. Especially since generating the sphere as an indexed mesh without vertex-duplicates further complicates the problem. But it is actually not much harder to generate the icosphere!
    Here I’ll show some C++ code that does just that.

    C++ Implementation

    We start with a hard-coded indexed-mesh representation of the icosahedron:

    struct Triangle
    {
      Index vertex[3];
    };
    
    using TriangleList=std::vector<Triangle>;
    using VertexList=std::vector<v3>;
    
    namespace icosahedron
    {
    const float X=.525731112119133606f;
    const float Z=.850650808352039932f;
    const float N=0.f;
    
    static const VertexList vertices=
    {
      {-X,N,Z}, {X,N,Z}, {-X,N,-Z}, {X,N,-Z},
      {N,Z,X}, {N,Z,-X}, {N,-Z,X}, {N,-Z,-X},
      {Z,X,N}, {-Z,X, N}, {Z,-X,N}, {-Z,-X, N}
    };
    
    static const TriangleList triangles=
    {
      {0,4,1},{0,9,4},{9,5,4},{4,5,8},{4,8,1},
      {8,10,1},{8,3,10},{5,3,8},{5,2,3},{2,7,3},
      {7,10,3},{7,6,10},{7,11,6},{11,0,6},{0,1,6},
      {6,1,10},{9,0,11},{9,11,2},{9,2,5},{7,2,11}
    };
    }
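
    The Index and v3 types and the normalize function are used throughout but not defined in the post. A minimal sketch of definitions that make the snippets compile, assuming a plain three-component vector (the original may well use a maths library instead):

    #include <cstdint>
    #include <cmath>
    
    using Index=std::uint32_t;
    
    struct v3
    {
      float x, y, z;
    };
    
    inline v3 operator+(v3 lhs, v3 rhs)
    {
      return {lhs.x+rhs.x, lhs.y+rhs.y, lhs.z+rhs.z};
    }
    
    inline v3 normalize(v3 value)
    {
      float length=std::sqrt(value.x*value.x+value.y*value.y+value.z*value.z);
      return {value.x/length, value.y/length, value.z/length};
    }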
    

    [image: the initial icosahedron]
    Now we iteratively replace each triangle in this icosahedron by four new triangles:

    [image: one 4-way subdivision step]

    Each edge in the old model is subdivided, and the resulting vertex is moved onto the unit sphere by normalization. The key here is not to duplicate the newly created vertices. This is done by keeping a lookup from each edge to the new vertex it generates. Note that the orientation of the edge does not matter here, so we need to normalize the edge direction for the lookup. We do this by forcing the lower index first. Here’s the code that either creates or reuses the vertex for a single edge:

    using Lookup=std::map<std::pair<Index, Index>, Index>;
    
    Index vertex_for_edge(Lookup& lookup,
      VertexList& vertices, Index first, Index second)
    {
      Lookup::key_type key(first, second);
      if (key.first>key.second)
        std::swap(key.first, key.second);
    
      auto inserted=lookup.insert({key, vertices.size()});
      if (inserted.second)
      {
        auto& edge0=vertices[first];
        auto& edge1=vertices[second];
        auto point=normalize(edge0+edge1);
        vertices.push_back(point);
      }
    
      return inserted.first->second;
    }
    

    Now you just need to do this for all the edges of all the triangles in the model from the previous iteration:

    TriangleList subdivide(VertexList& vertices,
      TriangleList triangles)
    {
      Lookup lookup;
      TriangleList result;
    
      for (auto&& each:triangles)
      {
        std::array<Index, 3> mid;
        for (int edge=0; edge<3; ++edge)
        {
          mid[edge]=vertex_for_edge(lookup, vertices,
            each.vertex[edge], each.vertex[(edge+1)%3]);
        }
    
        result.push_back({each.vertex[0], mid[0], mid[2]});
        result.push_back({each.vertex[1], mid[1], mid[0]});
        result.push_back({each.vertex[2], mid[2], mid[1]});
        result.push_back({mid[0], mid[1], mid[2]});
      }
    
      return result;
    }
    
    using IndexedMesh=std::pair<VertexList, TriangleList>;
    
    IndexedMesh make_icosphere(int subdivisions)
    {
      VertexList vertices=icosahedron::vertices;
      TriangleList triangles=icosahedron::triangles;
    
      for (int i=0; i<subdivisions; ++i)
      {
        triangles=subdivide(vertices, triangles);
      }
    
      return{vertices, triangles};
    }
    

    There you go, a custom-subdivided icosphere!
    [image: the resulting icosphere]

    Performance

    Of course, this implementation is not the most runtime-efficient way to get the icosphere. But it is decent and very simple. Its performance depends mainly on the type of lookup used. I used a map instead of an unordered_map here for brevity, only because there’s no premade hash function for a std::pair of indices. In practice, you would almost always use a hash-map or some kind of spatial structure, such as a grid, which makes this method a lot harder to beat. And it is certainly fast enough for most applications!
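
    If you do want the hash-map, a minimal sketch of a suitable hash for the index pair could look like this; the PairHash name and the bit-packing scheme are assumptions of this sketch (it assumes 32-bit indices and a 64-bit std::size_t):

    #include <unordered_map>
    
    struct PairHash
    {
      std::size_t operator()(std::pair<Index, Index> const& key) const
      {
        // The pair is already normalized to "lower index first" by vertex_for_edge,
        // so packing both indices into one 64-bit value gives a collision-free hash.
        return (static_cast<std::size_t>(key.first)<<32)
          | static_cast<std::size_t>(key.second);
      }
    };
    
    using Lookup=std::unordered_map<std::pair<Index, Index>, Index, PairHash>;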

    The general pattern

    The lookup-or-create pattern used in this code is very useful when creating indexed-meshes programmatically. I’m certainly not the only one who discovered it, but I think it needs to be more widely known. For example, I’ve used it when extracting voxel-membranes and isosurfaces from volumes. It works very well whenever you are creating your vertices from some well-defined parameters. Usually, it’s some tuple that describes the edge you are creating the vertex on. This is the case with marching cubes or marching tetrahedrons. It can, however, also be grid coordinates if you sparsely generate vertices on a grid, for example when meshing heightmaps.
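
    As a hedged illustration of that last point, a sparse heightmap mesher could key the same lookup-or-create pattern on grid coordinates instead of edges; the names and the vertex layout (height in the y component) are assumptions of this sketch:

    using GridLookup=std::map<std::pair<int, int>, Index>;
    
    // Returns the index of the vertex at grid cell (x, y), creating it on first use.
    Index vertex_for_cell(GridLookup& lookup, VertexList& vertices,
      int x, int y, float height)
    {
      auto inserted=lookup.insert({{x, y}, Index(vertices.size())});
      if (inserted.second)
        vertices.push_back({float(x), height, float(y)});
      return inserted.first->second;
    }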

    Explicit types – and when to use them

    Many modern programming languages offer a way to declare variables without an explicit type if the type can be inferred, either dynamically or statically. Many also allow variables to be explicitly defined with a type. For example, Scala and C# let you omit the explicit variable type via the var keyword, but both also allow defining variables with explicit types. I’m coming from the C++ world, where “auto” has been available for this purpose since the relatively recent C++11. However, people are still debating whether you should actually use it.

    Pros

    Herb Sutter popularised the almost-always-auto style. He advocates that using more type inference is good because it is roughly equivalent to programming against interfaces instead of implementations. He says that “Overcommitting to explicit types makes code less generic and more interdependent, and therefore more brittle and limited.” However, he also mentions that you might sometimes want to use explicit types.

    Now what exactly is overcommitting here? When is the right time to use explicit types?

    Cons

    Opponents to implicit typing, many of them experienced veterans, often state that they want the actual type visible in the source code. They don’t want to rely on type inference being right. They want the code to explicitly state what’s going on.

    At first, I figured that was just conservatism in the face of a new “scary” feature that they did not fully understand. After all, IDEs can usually infer the type on-the-fly and you can hover on a variable to let it show you the type.

    For C++, the function signature is a natural boundary where you often insert explicit types, unless you want to commit to the compile-time and physical-dependency cost that comes with templates. Other languages, such as Groovy, do not have this trade-off and let you skip explicit types almost everywhere. After working with Groovy/Grails for a while, where the dominant style seems to be to omit types wherever possible, it dawned on me that the opponents of implicit typing have a point. Not only does the IDE often fail to show me the inferred type (even though it still works far more often than I would have anticipated), but I also found it harder to follow and modify code that did not mention explicit types. Seemingly contrary to Herb Sutter’s argument, that code felt more brittle than I would have liked.

    Middle-ground

    As usual, the truth seems to be somewhere in the middle. I propose the following rule for when to use explicit types:

    • Explicit typing for domain-types
    • Implicit typing everywhere else

    Code using types from the problem domain should be as specific as possible. There’s no need for it to be generic – that is actually counter-productive, as the code model would otherwise be inconsistent with the model of the problem domain. This is also the most important aspect to grok when reading code, so it should be explicit. The type is as important as the action on it.

    On the other hand, for pure-fabrication types that do not represent a concept in the domain, the action is important, while the type is merely a means to achieve this action. Typically, most of the elements from a language’s standard library fall into this category. All your containers, iterators, callables. Their types are merely implementation details: an associative container could be an array, a hash-map or a tree structure. Exchanging it rarely changes the meaning of the code in the problem domain – it just changes its performance characteristics.

    Containers will occasionally contain domain-types in their type. What do you do about those? I think they belong in the “everywhere else” category, but you should take extra care to name the contained type when working with it – for example when declaring the variable of the for-each loop over it, or when inserting something into it. This way, the “collection of domain-type” aspect becomes clear, but the specific container implementation stays implicit – like it should.
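
    A small, hedged C++ sketch of this rule; the Invoice and LineItem domain types, and the billing object, are made up for this illustration:

    // Domain type: spell it out, it carries the meaning of the code.
    Invoice invoice = billing.invoice_for(customer);
    
    // Pure fabrication: the container type is an implementation detail.
    auto line_items = invoice.line_items();
    
    // Name the contained domain type in the for-each loop, keep the container implicit.
    for (LineItem const& item : line_items)
      total += item.amount();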

    What do you think? Is this a useful proposition for your code?

    Recap of the Schneide Dev Brunch 2016-04-10

    If you couldn’t attend the Schneide Dev Brunch on the 10th of April 2016, here is a summary of the main topics.

    Last Sunday, we held another Schneide Dev Brunch, a regular brunch on the second Sunday of every other (even) month, except that all attendees want to talk about software development and various other topics. In case you missed the recap article about the February brunch: it didn’t happen. We all took a break, but are back on track again. So if you bring a software-related topic along with your food, everyone has something to share. Quite a lot of developers attended this time, so we had enough to talk about. As usual, a lot of topics and chatter were exchanged. This recapitulation tries to highlight the main topics of the brunch, but cannot reiterate everything that was spoken. If you were there, you will probably find this list incomplete:

    Why software development conferences?

    We began with a curious question: Why are there even conferences about software development? You can read most of the content for free on the internet and even watch the talks afterwards. So why attend one for a lot of money? We discussed the topic a bit and came up with an analysis:
    There are (at least) four different interested groups in a conference:

    • The organizer or commercial host is mostly interested in a positive revenue. As long as there’s a possibility for some net gain, somebody will host a conference. The actual topic is a secondary matter for them (this might explain some of the weirder conferences out there, like the boring conference).
    • The developers that actually attend a conference are a small subset of all developers. They all have their own personal motives to pay money, invest time and put up with inconveniences to be there in person. Some might rely on the quality filter of a conference board, some look forward to meeting their peers in an annual ritual. There might be those who learn best if somebody talk-feeds them the topic. Whatever the reason, a lot of developers enjoy participating in conferences. And if it happens to be paid by the employer and booked as worktime, who would say no?
    • Then there are the speakers. They have the additional burden of convincing a committee of their topic, preparing a talk of high quality and being able to perform on stage (something that is harder than it looks). The speakers seek reputation and credible proof of expertise. Their resumes will probably profit, too.
    • And at last, the companies that sponsor the conference, maintain a booth with big roll-ups and smiling employees and give their developers a chance to attend are in the game to represent, to recruit and to build their brand. A lot of traditional marketing effort goes into trade fairs, so why not treat the developer market like any other and be present at the developer fairs?

    We can conclude that software development conferences can provide value for every associated stakeholder. As long as this sentence holds true, conferences will be held.
    The question didn’t come out of the blue: one of our attendees got accepted as a speaker at the Karlsruher Entwicklertag 2016 and wanted to learn about the different expectations he needs to address. He will give his talk at the next Dev Brunch to practice the flow and to face the hardest critics. The topic: git internals. We are thrilled!

    Stratagems and strategies

    The next topic contained another talk, not at a conference, but in the context of a “general topics” series at a local university (the Duale Hochschule in Karlsruhe). The talk introduces the concept of the 36 stratagems and of modern strategies to the audience. We talked a bit about the concept itself and found that the list of logical fallacies is somewhat similar. We even found an application of the stratagems in local history (sorry, only a German source found): The Bretten’s Hundle
    The talk itself is this monday, so you’ll need to hurry if you want to attend.

    Psychology of deception

    As so often during the Dev Brunch, one topic led to the other, and we soon talked about morale and ethics. The concept of micro-expressions to reveal the hidden agenda of others came up, as well as the TV series “Lie to Me”, which is inspired by the work of Paul Ekman, a professor of psychology. There is even a commercial training program to improve your skill at “spotting the liar”.

    Games with morale aspects

    Well, we are nerds. While crime investigation is thrilling, there is the even more enthralling topic of games with psychological and moral aspects. We soon exchanged our experiences with games like “Haze” or “Spec Ops: The Line”. But it doesn’t stop at shooter games: you can have similar insights by playing “Papers, Please” (a strong favorite for one of our next Schneide game nights) or “This War of Mine”. You can even try some multiplayer games specifically designed for social insights, like “The Ship: Murder Party”.
    And if you haven’t got much time but still want to learn something about yourself, little games like “60 Seconds!” are a great start.
    This topic led to some ideas for upcoming Schneide game nights in 2016.

    Book review: A tour of C++

    One attendee of the brunch provided a summary of the book “A Tour of C++” by Bjarne Stroustrup, which was recently updated to cover the language possibilities of C++11. In his words, the book is a rather incomplete introduction to the language, with way too many aspects described in way too short a manner. It’s more of a reading list for really grasping the concepts, so it may serve as a source of inspiration. For example, the notion of “move semantics” is explained, but discovering the consequences is left to the reader. The part about template programming was well done, and every chapter has a suitable list of advice in the tradition of “Effective XYZ” at the end. So it’s not a bad book, but too short to be satisfying. It’s like a tourist’s tour around C++11, so the title keeps its promise.

    The left-pad incident

    When we finished the “official” agenda, the topic of the recent left-pad incident came up and left us laughing. We really live in glorious times when the happiness of the (Javascript) world depends on a few lines of code. Not that this couldn’t happen in any other ecosystem.

    Epilogue

    As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The number of attendees makes for a unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei. And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.

    Simple C++11 – Part III – Best friends

    Now that we have got the whole rigid setup of how to create a compile unit and a class declaration out of the way, we can finally start to write some code. What separates simple modern C++ code from the old ways is the degree of abstraction you can use to write your code. Previously, you had to think in memory and instructions. Now, powerful abstractions and language mechanisms help you to think in values and operations, and still get down to the bare metal of the machine when you need to. Here’s my personal set of “best friend” language and library features that help me be as expressive as possible in lower-level application code while still leveraging the raw power of C++.

    std::vector<T>

    With all its simplicity, it is still powerful enough to handle the greater part of all memory management issues. Better yet, it maps excellently to modern hardware and even when used naively, it is often extremely efficient. And in the rare cases when it is not, the performance can usually be easily improved by using std::vector::reserve.
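
    A minimal sketch of that last point; sample_count and read_sample are made-up names for this illustration:

    std::vector<float> samples;
    samples.reserve(sample_count); // one allocation up front instead of repeated growth
    for (std::size_t i = 0; i < sample_count; ++i)
      samples.push_back(read_sample(i));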

    With C++11, you can now even toss vectors around, nest them and return huge ones from functions without any performance problems. Also, initializer lists make it easy to fill them with data.

    std::vector<int> my_special_numbers() {
      return {4, 8, 15, 16, 23, 42};
    }
    

    Such code is no longer a subtle performance problem, but actually encouraged.

    There’s no doubt that whenever you need a container, std::vector should be your first candidate.

    for-each

    Printing a range like that is now easy. No need to even know about the existence of iterators or use counters:

    for (auto&& number : my_special_numbers()) {
      std::cout << number << std::endl;
    }
    

    std::unordered_map<K,V>

    For the rare cases when a flat vector will just not suffice, this neat hash-map will make your life easier. C++11’s initializer syntax makes it a lot cleaner to fill these than before:

    std::unordered_map<std::string, int>
    my_icecream_ratings() {
      return {
        {"vanilla", 3},
        {"chocolate", 9},
        {"strawberry", 8},
        {"raspberry", 7},
        {"lemon", 3}
      };
    }
    

    auto

    And now working with them becomes nice and easy too:

    auto ratings = my_icecream_ratings();
    ratings.insert({"caramel", 2});
    std::cout << "Chocolate was a "
      << ratings["chocolate"];
    

    You can even change the result type to an unordered_multimap or something similar and the code will still work.

    std::shared_ptr<T>

    In a perfect or, should I say, functional world, shared ownership would not be a thing. Pointers or even references would not exist. They just make things a lot more complex than clear, exclusive ownership. But it so happens that when requirements change, this or that object is no longer exclusively owned by that other object. Or the lifetime of an object cannot easily be scoped in the presence of multithreading. When this happens, std::shared_ptr will make your tasks bearable. This is as close as you usually get to completely automatic lifetime management in C++.

    void save_image_in_background(
      std::shared_ptr<image const> raw_image) {
      auto thread = std::thread([raw_image]{
        raw_image->save("raw.png");
      });

      thread.detach();
    }
    

    I like to think of pointers as a necessary evil. Sometimes, the alternative just makes things even more confusing, and when that happens, you at least don’t want manual resource management in the way.

    Of course, std::unique_ptr seems to be a powerful competitor for shared_ptr’s tasks, but in my experience, you very rarely need a single-ownership pointer in application code. Why not use a movable type instead? unique_ptr can be useful as a helper to implement library primitives, but you should rarely encounter one in application-level code.

    Less is more

    Note how many fancy C++11 features did not make my list. For example, lambdas are very useful – and I even used one in my shared_ptr example. But they should be used in moderation. They allow you to define code out of place, to be executed whenever, which makes it harder to reason about.
    Likewise, things like variadic templates are great for library code, but rarely help at the application level.

    This ends my small series on C++ for now. I hope I have shown how concentrating on a few simple features helps you write more maintainable and less obscure C++ code, on a level of abstraction that is not lower than most comparable languages. Do you have other methods to achieve this? Or do you even want to have this? I’d like to hear!

    Simple C++11 – Part II – Class declarations

    In the previous part, I’ve shown my guidelines for setting up compilation units. When writing simple application code with C++11, either classes or free functions should be your main building blocks. Therefore, in this part, I will focus on what to look out for when writing class declarations.

    While templates can be very useful, they do not scale well as the code base gets larger. Metaprogramming or other niche styles have their places, too, but I like to look at those as a means to create language extensions rather than principal implementation tools.

    Avoid inline implementations

    …especially in header files. It can be tempting to write classes solely in the header file. In fact, it has almost become a sign of quality for parts of C++ code to be header-only. But this scales badly in most cases, and evolving such a code-base will result in a dramatic explosion of compile times. Always splitting classes into a declaration and a definition acts as a first-level compile-time firewall and dependency-breaker. Users of your class no longer need to worry about changes in the implementation of the member functions of that class. Note that those changes are often indirect: a change only affects a class that is used in the implementation of your class’ member functions. By splitting the declaration and definition, users of your class do not have to be recompiled.

    But why stop at the compiler? The same argument holds for programmers. If you start to split interface and implementation on this level, you automatically provide ‘reader-firewalls’ as well. By just providing a clean header file, you are giving readers sort of a manual for your class. No need to look at the implementation at all, if the interface is well-defined.

    Inline code definition is also the main reason against excessive use of templates. Yes, they grant a lot of flexibility, but you pay a hefty price which needs to be justified by an enormous reduction of complexity elsewhere. In general, templates are a bit too powerful for their own good, which is why they need extra moderation.

    Always declare implicit functions

    Implicitly declared functions seem convenient, but they have a few implications that are hard to understand. First off, if an implicit function gets generated for your class, it will be generated as inline. This means that the implementation becomes a dependency of all users of your class. This can have very subtle effects such as this:

    #include <string>
    #include <vector>
    
    class Entry;
    class EntryGenerator;
    
    class EntryManager {
    public:
      EntryManager(EntryGenerator& generator);
      int getEntryCount() const;
      std::string getIDForEntry(int index) const;
    private:
      std::vector<Entry> mData;
    };
    

    On the surface, it looks like there should be no dependency (other than the name) on Entry when including this header. But there is!
    The destructor is not declared, so it will get generated – as inline. Because deletion of a vector requires the held type to be complete, any place that needs to be able to destruct an EntryManager also needs to know how to destruct an Entry, which is not intended at all. Remember that there’s a total of six functions that can be implicitly generated! Because of that, there are analogous problems for copy-construction, assignment, move-construction and move-assignment.

    To avoid these problems, either delete the function explicitly in the header, default it in the implementation file, or actually implement it. You rarely need to do the latter, so I advise to default all the ones you need, and delete the rest:

    #include <string>
    #include <vector>
    
    class Entry;
    class EntryGenerator;
    
    class EntryManager {
    public:
      EntryManager(EntryGenerator& generator);
      EntryManager(EntryManager const&)=delete;
      EntryManager& operator=(EntryManager const&)=delete;
      EntryManager(EntryManager&& rhs);
      EntryManager& operator=(EntryManager&& rhs);
      ~EntryManager();
      int getEntryCount() const;
      std::string getIDForEntry(int index) const;
    private:
      std::vector<Entry> mData;
    };
    

    And somewhere in the implementation file:

    EntryManager::EntryManager(EntryManager&& rhs) = default;
    EntryManager::~EntryManager() = default;
    EntryManager& EntryManager::operator=(EntryManager&& rhs) = default;
    

    This has another nice side effect: the vector template gets instantiated into that one object file and does not “bloat” all use-sites.

    Exactly one public section and one private data section per class

    …starting with the public section. This is where you address the next programmer that has to read your class. And it should be the only place for him to look.

    I avoid private member functions because they cannot be tested easily and can add hidden compile-time dependencies to a project. Why should a user of your class recompile if you change an implementation detail? For small and trivial implementation helpers, the unnamed-namespace in the implementation file is a much better place. If those helpers become larger or more complex, it is a better idea to implement them in a collaborating class, which can be tested and reused.

    Protected member functions split your interface into two parts, one exclusively for derived classes and one for everyone (including derived classes). This is very rarely needed, and in almost all of those cases, a separate interface will scale better (although it is slightly harder to implement).

    Either an interface or an implementation

    So far, I have left inheritance out of the picture and only talked about concrete classes. Inheritance is actually rarely needed; composition often suffices. But if it is needed, make sure that a class is either concrete and final (an implementation), or has a complete and minimal set of pure-virtual member functions (an interface). This will result in shallow hierarchies and easily understood interfaces. Remember that inheritance is not a tool for sharing code between the classes you implement, but for the code using those classes – i.e. where the Liskov Substitution Principle holds.

    Now it gets really easy to implement new classes in the hierarchy: just implement all the functions in the interface. No more questioning whether to leave the default behaviour or to override it. You will also automatically tend towards a clearer separation of components – things that need to be polymorphic move to the interface, other functionality merely uses it.

    This pattern is useful even when polymorphism is not needed. Such small interfaces, devoid of any implementation detail, can act as another compiler firewall. Collaborators can work with just the interface and do not have to be recompiled when the implementation changes. Also, the interface can be implemented for mock or fake objects in testing.
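
    A small, hedged sketch of this rule; the Renderer and GlRenderer names are made up for this illustration:

    // Interface: a complete and minimal set of pure-virtual member functions.
    class Renderer
    {
    public:
      virtual ~Renderer() = default;
      virtual void drawCircle(float x, float y, float radius) = 0;
    };
    
    // Implementation: concrete and final, implementing every function of the interface.
    class GlRenderer final : public Renderer
    {
    public:
      void drawCircle(float x, float y, float radius) override;
    };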

    Conclusion

    This concludes the second part of the series. I originally intended it to be about how to write a whole class, but that would have been too much to digest for one post. I am well aware that some of these guidelines can stir quite the controversy in the C++ community. For example, declaring the implicit functions seems to be in conflict with the recently popular rule of zero. Scott Meyers had similar concerns, but does not quite touch the inline aspect.

    For me personally, these guidelines have helped tremendously, especially when scaling to bigger code-bases. But as before, I am curious what others are thinking about this!

    Simple C++11 – Part I – Unit Structure

    C++ has long had the stigma of an overly complex and unproductive language. Lately, with the advent of C++11, things have brightened a bit, but there are still a lot of misconceptions about the language. I think this is mostly because C++ has been taught the wrong way. This series aims to show my, hopefully somewhat simpler, way of using C++11.

    Since it is typically the first thing I do when starting a new project, I will start with how I set up a new compile unit, i.e. a header and implementation file pair.

    Note that I will try not to focus on a specific paradigm, such as object-oriented or imperative programming. This structure seems to work well for all kinds of paradigms. But without much further ado, here’s the header file for my imaginary “MyUnit” unit:

    MyUnit.hpp

    #pragma once
    
    #include <vector>
    #include "MyStuff.hpp"
    
    namespace MyModule { namespace MyUnit {
    
    /** Does something only a good bar could.
    */
    std::vector<float> bar(int fooCount);
    
    /** Foo is an integral part of any program.
        Be sure to call it frequently.
    */
    void foo(MyStuff::BestType somethingGood);
    
    }}
    

    I prefer the .hpp file ending for headers. While I’m perfectly fine with .h, I think it is helpful to differentiate pure C headers from C++ headers.

    #pragma once

    I’m using #pragma once here instead of include guards. It is not an official part of the standard, but all the big compilers (Visual C++, g++ and clang) support it, making it a de-facto standard. Unlike include guards, you only have to add one line, which says exactly what you want to achieve with it. You do not have to find a unique identifier for your include guard that will most certainly break if you rename the file/unit. It’s more readable, more resilient to change and easier to set up.
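
    For comparison, a minimal sketch of the classic include guard this replaces; the guard name is made up and illustrates exactly the renaming problem mentioned above:

    #ifndef MYMODULE_MYUNIT_HPP
    #define MYMODULE_MYUNIT_HPP
    
    // ... header contents ...
    
    #endif // MYMODULE_MYUNIT_HPP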

    Namespaces

    I like to have all the contents of a unit in a single namespace. The actual structure of the namespaces – i.e. per unit or per module or something else entirely depends on the specifics of the project, but filling more than one namespace is a guarantee for chaos. It’s usually a sign that the unit should be broken up into smaller pieces. An exception to this would be the infamous “detail” namespace, as seen in many of the Boost libraries. In that case, the namespace is not used to structure the API, but to explicitly omit things from the API that have to be visible for technical reasons.

    Documentation

    Documentation goes into the header, not into the implementation. The header describes the API, not only to the compiler, but also to humans. It is by no means an implementation detail, but part of the seam that isolates it from the rest of the code. Note that this part of the documentation concerns the API contract only, never the implementation. That part goes into the .cpp file.

    But now to the implementation file:

    MyUnit.cpp

    #include "MyUnit.hpp"
    
    #include "CoolFunctionality.hpp"
    
    using namespace MyModule;
    using namespace MyUnit;
    
    namespace {
    
    int helperFunction(float rhs)
    {
      /* ... */
    }
    
    }// namespace
    
    std::vector<float> MyUnit::bar(int fooCount)
    {
      /* ... */
    }
    
    void MyUnit::foo(MyStuff::BestType somethingGood)
    {
      /* ... */
    }
    
    

    Own #include first

    The only rule I have for includes is that the unit’s own include always comes first. This is to test whether the header is self-sufficient, i.e. that it will compile without being in the context of other headers or, even worse, code from an implementation file. Some people like to order the rest of their includes according to their “origin”, e.g. sections for system headers or library headers. I think imposing any extra order here is not needed. If anything, I prefer not to waste time sorting include directives and just append an include when I need it.

    Using namespace

    I choose using-directives of my unit’s namespaces over explicitly accessing the namespaces each time. Unlike the headers, the implementation file lives in a locally defined context. Therefore, it is not a problem to use a very specific view onto the unit. In fact, it would be a problem to be overly generic. The same argument also holds for other “local” modules that this unit is only using, as long as there are no collisions. I avoid using namespaces from external libraries to mark the library boundary (such as std, boost etc.).

    Unnamed namespace

    The unnamed namespace contains all the implementation helpers specific to this unit. It is quite common for this to contain a lot of the “meat” of an actual unit, while the unit’s visible functions merely wrap and canonize the functionality implemented here. I try to keep only one unnamed namespace in each file, to have a clear separation of what is supposed to be visible to the outside – and what is not.

    Visible implementation

    The implementation of the visible API of the module is the most obvious part of the .cpp file. For consistency reasons, the order of the functions should be the same as in the header.

    I’d advise against implementing in a file-wide open namespace. That means balancing an unnecessary pair of braces over the whole implementation file. Also, inside an open namespace you can not only define functions and types, but also declare new ones – which can lead to a function further down in the implementation file seeing a different namespace than one above it (see the sketch below).
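
    A hedged sketch of that last point, reusing the MyUnit example from above; the helper() declaration is made up for this illustration:

    namespace MyModule { namespace MyUnit {
    
    std::vector<float> bar(int fooCount)
    {
      /* ... */ // helper() is not visible here...
    }
    
    void helper(); // ...because this declaration adds a new member to MyUnit mid-file...
    
    void foo(MyStuff::BestType somethingGood)
    {
      helper(); // ...but it is visible here: foo and bar see different namespaces.
    }
    
    }}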

    Conclusion

    This concludes the first part. I’ve played with the thought of using a 3-piece setup instead, extending the header/implementation with a unit-test file, but have not gathered any sharable experience yet. This setup, however, has worked for me for a long time and with many different projects. Have you had similar – or completely different – setups that worked for you? Do tell!