Integrating conan, CMake and Jenkins

In my last posts on conan, I explained how to start migrating your project to use a few simple conan libraries and then how to integrate a somewhat more complicated library with custom build steps.

Of course, you still want your library in CI. We previously advocated simply adding some dependencies to your source tree, but in other cases, we provisioned our build-systems with the right libraries on a system-level (alternatively, using docker). Now using conan, this is all totally different – we want to avoid setting up too many dependencies on our build-system. The fewer dependencies they have, the less likely they will accidentally be used during compilation. This is crucial to implement portability of your artifacts.

Setting up the build-systems

The build systems still have to be provisioned. You will at least need conan and your compiler-suite installed. Whether to install CMake is a point of contention – since the CMake-Plugin for Jenkins can do that.

Setting up the build job

The first thing you usually need is to configure your remotes properly. One way to do this is to use conan config install command, which can synchronize remotes (or the whole of the conan config) from either a folder, a zip file or a git repository. Since I like to have stuff readable in plain text in my repository, I opt to store my remotes in a specific folder. Create a new folder in your repository. I use ci/conan_config in this example. In it, place a remotes.txt like this:

bincrafters https://api.bintray.com/conan/bincrafters/public-conan True
conan-center https://conan.bintray.com True

Note that conan needs a whole folder, you cannot read just this file. Your first command should then be to install these remotes:

conan config install ci/conan_config

Jenkins’ CMake for conan

The next step prepares for installing our dependencies. Depending on whether you’re building some of those dependencies (the --build option), you might want to have CMake available for conan to call. This is a problem when using the Jenkins CMake Plugin, because that only gives you cmake for its specific build steps, while conan simply uses the cmake executable by default. If you’re provisioning your build-systems with conan or not building any dependencies, you can skip this step.
One way to give conan access to the Jenkins CMake installation is to run a small CMake script via a “CMake/CPack/CTest execution” step and have it configure conan appropriatly. Create a file ci/configure_for_conan.cmake:

execute_process(COMMAND conan config set general.conan_cmake_program=\"${CMAKE_COMMAND}\")

Create a new “CMake/CPack/CTest execution” step with tool “CMake” and arguments “-P ci/configure_for_conan.cmake”. This will setup conan with the given cmake installation.

Install dependencies and build

Next run the conan install command:

mkdir build && cd build
conan install .. --build missing

After that, you’re ready to invoke cmake and the build tool with an additional “CMake Build” step. The build should now be up and running. But who am I kidding, the build is always red on first try 😉

Using protobuf with conan and CMake

In my last post, I showed how I got my feet wet while migrating the dependencies of my existing code-base to conan. The first major hurdle I saw coming when I started was adding something with a “special” build step, e.g. something like source-preprocessing. In my case, this was protobuf, where a special build-step converts .proto files to sources and headers.

In my previous solution, my devenv build scripts would install the protobuf converter binary to my devenv’s bin/ folder, which I then used to run my preprocessing. At first, it was not obvious how to do this with conan. It turns out that the lovely people and bincrafters made this pretty comfortable. conan_basic_setup() will add all required package paths to your CMAKE_MODULE_PATH, which you can use to include() some bundled CMake scripts that will either let you execute the protobuf-compiler via a target or run protobuf_generate to automagically handle the preprocessing. It’s probably worth noting, that this really depends on how the package is made. Conan does not really have an official way on how to handle this.

Let’s start with some sample code – Person.proto, like the sample from the protobuf website:

message Person {
  required string name = 1;
  required int32 id = 2;
  optional string email = 3;
}

And some sample code that uses it:

#include "Person.pb.h"

int main(int argn, char** argv)
{
  Person message;
  message.set_name("Hello Protobuf");
  std::cout << message.name() << std::endl;
}

Again, we’re using the bincrafters repository for our dependencies in a conanfile.txt:

[requires]
protobuf/3.6.1@bincrafters/stable
protoc_installer/3.6.1@bincrafters/stable

[options]

[generators]
cmake

Now we just need to wire it all up in the CMakeLists.txt

cmake_minimum_required(VERSION 3.0)
project(ProtobufTest)

include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake)
conan_basic_setup(TARGETS KEEP_RPATHS)

# This loads the cmake/protoc-config.cmake file
# from the protoc_installer dependency
include(cmake/protoc-config)

set(TARGET_NAME ProtobufSample)

# Just add the .proto files to the target
add_executable(${TARGET_NAME}
  Person.proto
  ProtobufTest.cpp
)

# Let this function to the magic
protobuf_generate(TARGET ${TARGET_NAME})

# Need to use protobuf, of course
target_link_libraries(${TARGET_NAME}
  PUBLIC CONAN_PKG::protobuf
)

# Make sure we can find the generated headers
target_include_directories(${TARGET_NAME}
  PUBLIC ${CMAKE_CURRENT_BINARY_DIR}
)

There you have it! Pretty neat, and all without a brittle find_package call.

Migrating an existing C++ codebase to conan

This is a bit of a battle report of migrating the dependencies in my C++ projects to use the conan package manager.
In the past weeks I have started to use conan in half a dozen both work and personal projects. Here’s my experiences so far.

Before

The first real project I started with was my personal game project. The “before” setup used a mixture if techniques to handle dependencies and uses CMake to do most of the heavy lifting.
Most dependencies reside in the “devenv”, which is a separate CMake project that I use to build and bundle the dependencies in a specific installation folder. It uses ExternalProject_Add for most parts (e.g. Boost, SDL, Lua, curl and OpenSSL), add_subdirectory for a few others (pugixml and lz4) and just install(FILES...) for a few header only libs like JSON for Modern C++, Catch2 and spdlog. It should be noted that there are relatively few interdependencies between the projects in there.
Because it is more convenient to update, I keep a few dependencies that I control myself directly in the source tree, either as git externals or just copies of the source files.
I try to keep usage of system dependencies to a minimum so that the resulting binary is more portable to the average gamer who does not want to know about libraries and dependencies and such nonsense. This setup has been has been mostly painless and working for my three platforms Windows, Linux and Mac – at least as long as I did not try to change it significantly.

Baby steps

Since not all my dependencies are available on conan and small iterations are usually more successful, I decided to proceed by changing only a single dependency to conan. For this dependency, it’s a good idea to pick something that does not have many compile-time options and is more or less platform agnostic. So I opted for boost over, e.g. SDL or wxWidgets. Boost was also one of the most painful dependencies to build, if only for the insane amount of files it produces and the time it takes to copy those ten-thousands of files to the install location.

Getting started..

There are currently two popular variants of boost available through conan. The “normal” variant on conan’s main repository/remote “conan-center” and a modular version that splits boost into its component libraries on the bincrafters remote, e.g. Boost.Filesystem. The modular version is more appealing conceptually, and I also had a better time getting it to work in my first tests, so I picked that. I did a quick grep for #include <boost/ through my code for an initial guess which boost libraries I needed to get and created a corresponding conanfile.txt in my project root.

[requires]
boost_filesystem/1.69.0@bincrafters/stable
boost_math/1.69.0@bincrafters/stable
boost_random/1.69.0@bincrafters/stable
boost_property_tree/1.69.0@bincrafters/stable
boost_assign/1.69.0@bincrafters/stable
boost_heap/1.69.0@bincrafters/stable
boost_optional/1.69.0@bincrafters/stable
boost_program_options/1.69.0@bincrafters/stable
boost_iostreams/1.69.0@bincrafters/stable
boost_system/1.69.0@bincrafters/stable

[options]
boost:shared=False

[generators]
cmake

Now conan plays really nice with “single configuration generators” like the new CMake/Ninja support in VS2017 and onward. Basically, just cd into your build dir and call something like conan install -s build_type=Debug -s build_type=x86 whenever you want to update dependencies. More info can be found in the official documentation. The workflow for CLion is essentially the same.

Using it in your build

After the last command, conan will download (or build) the dependencies and generate a file with all the corresponding paths.
To use it, include it from cmake like this:

include(${CMAKE_BINARY_DIR}/conanbuildinfo.cmake)
conan_basic_setup(TARGETS KEEP_RPATHS)

It will then provide targets for all the requested boost libraries that you can link to like this:

target_link_libraries(myTarget
  PUBLIC CONAN_PKG::boost_filesystem
)

I wanted to make sure that the compiler build using the new boost files and not the old ones. Because I have a generic include into my devenv that was still going to be in my compilers include-paths for all the other dependencies, so I just renamed boost’s header include folder on disc. After my first successful compile I felt confident enough to delete them.

First problems

There was one major problem: some of my in-source dependencies had their own claim on using boost via passed CMake variables, Boost_LIBRARY_DIRS and Boost_INCLUDE_DIR. I adapted their CMakeLists.txt to allow for injecting appropriate targets instead. Not the cleanest solution, but it got my builds green again fast.

There’s a still a lot to cover on this: The other platforms had their own quirks and I migrated way more than just this first project. Also, there is still ways to go for a full migration with my game project. But more on that in my next blog post…

Game Optimization Resolved

In my last blog post, I explained a performance problem in my game abstractanks but not how I solved it.

So I had not done any optimization work in a while, so the first thing I did turned out to be an error. And not only in hindsight – I actually knew how to tackle a problem like that – I just temporarily forgot at that point.

Going down the rabbit hole

Where we left off, my profiler showed FriendlyUnitOccupies as the culprit. That function basically does circle/circle collision detection using a quad-tree as the spatial acceleration structure. Looking at the samples from my profiler, I could see that that it was descending into the tree quite deeply. Like all tree structures, a quad-tree does pointer-chasing which is very bad for modern CPUs. So I figured I should look at how to optimize that. The data structure was implemented in a hurry, so there seemed plenty to do:

  • Instead of recursing into each node, use tail-call optimization and early culling to speed up traversal.
  • Pre-cache the query with the max-search radius and the other requirements to the units, e.g. not dead, same team, etc.. and then use that to build a new tree for the actual queries.

Because the data structure was pretty non-generic, I started to basically rewrite it to use it in this scenario. While I was about half way through with that, it dawned on me that I was barking at the wrong tree.

Taking a step back

The excellent book Video Game Optimization has some great advice on which level to attack an optimization problem.

  1. System-level. Can you change the system to do something differently and still solve your problem?
  2. Algorithm-level. Are you using the most efficient right algorithm for the data you have?
  3. Micro-level. Are you not wasting any processing power on the lower levels?

I was already on the algorithm level. So I went back to the systemic level: What if the AI did not try to change the target position that often, maybe just every few seconds? That effectively meant lowering the AIs APM. It’s not a bad solution, especially since that makes the AI behave more human. But on the other hand, real-time games, as the name implies, have a soft real-time requirement. So you generally like to avoid huge workloads that go over your frame budget. With how slow the algorithm was, that could easily be the case. The solution is then to do the work concurrently, either by splitting it up or doing it in the background. Both solutions seemed difficult, since the AI code does currently not allow for easy concurrency. So that idea was out.

What if the parking-positions where cached? Subsequent calls to get parking positions could probably reuse a lot of the positions that were computed in previous frames, given that the target point only moves by a little bit each frame. I figured that might work, but it requires more housekeeping and data-dependencies – the result of the previous query needs to be used for the next. That seemed complex and therefore brittle.

A Solution?

Temporal coherency was a pretty good idea though, but not the scale was to big this time. What if I exploited it within a single frame? Now the original code did obscure this, but maybe it gets a little more clear if I write it like this:

optional<v2> GameWorld::FindFreePosition(v2 Center, std::vector<v2> const& Occupied)
{
  auto CheckPosition = [&](v2 Candiate)
  {
    if (!IsPassable(Candidate))
      return false;

    if (OverlapsWith(Occupied))
      return false;

    return !FriendlyUnitOccupies(Candidate);  
  };
  auto Samples = SampledPositions(Center, SomeRandomness());
  auto Found = find_if(SampledPositions.begin(), SampledPositions.end(), CheckPosition(Position));
  
  return (Found != SampledPositions.end()) ? *Found : none;
}

Now as I explained in the previous post, this was called in a loop for each unit to be parked.

std::vector<v2> GameWorld::FindParkingPositions(v2 Center, std::size_t N)
{
  std::vector<v2> Results;
  for (std::size_t i = 0; i < N; ++i)
  {
    auto MaybePosition = FindFreePosition(Center, Results);
    if (!MaybePosition) // No more free space?
      break;
    Results.push_back(*MaybePosition);
  }
  return Results;
}

Easy to see: counting the number of CheckPosition calls, this algorithm is O(n) in number of sampled positions. The number of sampled positions depends linearly on the number of units to be parked, because more units obviously need more parking positions, essentially making this O(n²) for the unit count! But the positions get resampled for each unit – with the only change being the little bit of randomness that is injected everytime. In other words, each call would just test false for sampled positions roughly corresponding to the units that are already placed.

So what I did was a very small change: only inject the randomness once and merge the loops:

auto Samples = SampledPositions(Center, SomeRandomness());
std::vector<v2> Results;

for (auto const& Sample : Samples)
{
  if (CheckPosition(Sample))
    Results.push_back(Sample);

  if (Result.size() >= N)
    break;
}
return Results;

And this did the trick! The algorithm’s run-time when below the 1ms range, and the smaller variation in randomness is not really visible.

Conslusions

I was thrown off-track be the false conclusion that CheckPositions was too slow when it was in fact just called too often. Context is key! Always approach these things outside-in.
Using less-than-optimal abstractions obscured the opportunity to hoist out the sample generation from me. Iteration is always a separate concern, even when it is not on containers!

Advanced deb-packaging with CMake

CMake has become our C/C++ build tool of choice because it provides good cross-platform support and very reasonable IDE (Visual Studio, CLion, QtCreator) integration. Another very nice feature is the included packaging support using the CPack module. It allows to create native deployable artifacts for a plethora of systems including NSIS-Installer for Windows, RPM and Deb for Linux, DMG for Mac OS X and a couple more.

While all these binary generators share some CPACK-variables there are specific variables for each generator to use exclusive packaging system features or requirements.

Deb-packaging features

The debian package management system used not only by Debian but also by Ubuntu, Raspbian and many other Linux distributions. In addition to dependency handling and versioning packagers can use several other features, namely:

  • Specifying a section for the packaged software, e.g. Development, Games, Science etc.
  • Specifying package priorities like optional, required, important, standard
  • Specifying the relation to other packages like breaks, enhances, conflicts, replaces and so on
  • Using maintainer scripts to customize the installation and removal process like pre- and post-install, pre- and post-removal
  • Dealing with configuration files to protect end user customizations
  • Installing and linking files and much more without writing shell scripts using ${project-name}.{install | links | ...} files

All these make the software easier to package or easier to manage by your end users.

Using deb-features with CMake

Many of the mentioned features are directly available as appropriately named CMake-variables all starting with CPACK_DEBIAN_.  I would like to specifically mention the CPACK_DEBIAN_PACKAGE_CONTROL_EXTRA variable where you can set the maintainer scripts and one of my favorite features: conffiles.

Deb protects files under /etc from accidental overwriting by default. If you want to protect files located somewhere else you specify them in a file called conffiles each on a separate line:

/opt/myproject/myproject.conf
/opt/myproject/myproject.properties

If the user made changes to these files she will be asked what to do when updating the package:

  • keep the own version
  • use the maintainer version
  • review the situation and merge manually.

For extra security files like myproject.conf.dpkg-dist and myproject.conf.dpkg-old are created so no changes are lost.

Unfortunately, I did not get the linking feature working without using maintainer scripts. Nevertheless I advise you to use CMake for your packaging work instead of packaging using the native debhelper way.

It is much more natural for a CMake-based project and you can reuse much of your metadata for other target platforms. It also shields you from a lot of the gory details of debian packaging without removing too much of the power of deb-packages.

Keeping connections alive with libcurl

libcurl is quite a comfortable option to transfer files across a variety of network protocols, e.g. HTTP, FTP and SFTP.

It’s really easy to get started: downloading a single file via http or ftp takes only a couple of lines.

Drip, drip..

But as with most powerful abstractions, it is a bit leaky. While it does an excellent job of hiding such steps as name resolution and authentication, these steps still “leak out” by increasing the overall run-time.

In our case, we had five dozen FTP servers and we needed to repeatedly download small files from all of them. To make matters worse, we only had a small time window of 200ms for each transfer.

Now FTP is not the most simple protocol. Essentially, it requires the client to establish a TCP control connection, that it uses negotiate a second data connection and initiate file transfers.

This initial setup phase needs a lot of back and forth between server and client. Naturally, this is quite slow. Ideally, you would want to do the connection setup once and keep both the control and the data connection open for subsequent transfers.

libcurl does not explicitly expose the concept of an active connection. Hence you cannot explicitly tell the library not to disconnect it. In a naive implementation, you would download multiple files by simply creating an easy session object for each file transfer:

for (auto file : FILE_LIST)
{
  std::vector<uint8_t> buffer;
  auto curl = curl_easy_init();
  if (!curl)
    return -1;
  auto url = (SERVER+file);
  curl_easy_setopt(curl, CURLOPT_URL,
    url.c_str());
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION,
    appendToVector);
  curl_easy_setopt(curl, CURLOPT_WRITEDATA,
    &buffer);
  if (curl_easy_perform(curl) != CURLE_OK)
    return -1;

  process(buffer);
  curl_easy_cleanup(curl);
}

That does indeed reset the connection for every single file.

Re-use!

However, libcurl can actually keep the connection open as part of a connection re-use mechanism in the session object. This is documented with the function curl_easy_perform. If you simply hoist the easy session object out of the loop, it will no longer disconnect between file transfers:

auto curl = curl_easy_init();
if (!curl)
  return -1;

for (auto file : FILE_LIST)
{
  std::vector<uint8_t> buffer;
  auto url = (SERVER+file);
  curl_easy_setopt(curl, CURLOPT_URL, 
    url.c_str());
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, 
    appendToVector);
  curl_easy_setopt(curl, CURLOPT_WRITEDATA, 
    &buffer);
  if (curl_easy_perform(curl) != CURLE_OK)
    return -1;

  process(buffer);
}
curl_easy_cleanup(curl);

libcurl will now cache the active connection in the session object, provided the files are actually on the same server. This improved the download timings of our bulk transfers from 130ms-260ms down to 30ms-40ms, quite the enormous gain. The timings now fit into our 200ms time window comfortably.

Do most language make false promises?

Some years ago I stumbled over this interesting article about C being the most effective of programming language and one making the least false promises. Essentially Damien Katz argues that the simplicity of C and its flaws lead to simple, fast and easy to reason about code.

C is the total package. It is the only language that’s highly productive, extremely fast, has great tooling everywhere, a large community, a highly professional culture, and is truly honest about its tradeoffs.

-Damien Katz about the C Programming language

I am Java developer most of the time but I also have reasonable experience in C, C++, C#, Groovy and Python and some other languages to a lesser extent. Damien’s article really made me think for quite some time about the languages I have been using. I think he is right in many aspects and has really good points about the tools and communities around the languages.

After quite some thought I do not completely agree with him.

My take on C

At a time I really liked the simplicity of C. I wrote gtk2hack in my spare time as an exercise and definitely see interoperability and a quick “build, run, debug”-cycle as big wins for C. On the other hand I think while it has a place in hardware and systems programming many other applications have completely different requirements.

  • A standardized ABI means nothing to me if I am writing a service with a REST/JSON interface or a standalone GUI application.
  • Portability means nothing to me if the target system(s) are well defined and/or covered by the runtime of choice.
  • Startup times mean nothing to me if the system is only started once every few months and development is still fast because of hot-code replacement or other means.
  • etc.

But I am really missing more powerful abstractions and better error handling or ressource management features. Data structures and memory management are a lot more painful than in other languages. And this is not (only) about garbage collection!

Especially C++ is making big steps in the right direction in the last few years. Each new standard release provides additional features making code more readable and less error prone. With zero cost abstractions at the core of language evolution and the secondary aim of ease of use I really like what will come to C++ in the future. And it has a very professional community, too.

Aims for the C++11 effort:

  • Make C++ a better language for systems programming and library building
  • Make C++ easier to teach and learn

-Bjarne Stroustup, A Tour of C++

What we can learn from C

Instead of looking down at C and pointing at its flaws we should look at its strengths and our own weaknesses/flaws. All languages and environments I have used to date have their own set of annoyances and gotchas.

Java people should try building simple things and having a keen eye on dependencies especially because the eco system is so rich and crowded. Also take care of ressource management – the garbage collector is only half the deal.

Scala and C++ people should take a look at ABI stability and interoperability in general. Their compile times and “build, run, debug”-cycle has much room for improvement to say the least.

C# may look at simplicity instead of wildly adding new features creating a language without opinion. A plethora of ways implementing the same stuff. Either you ban features or you have to know them all to understand code in a larger project.

Conclusion

My personal answer to the title of this blog: Yes, they make false promises. But they have a lot to offer, too.

So do not settle with the status quo of your language environment or code style of choice. Try to maintain an objective perspective and be aware of the weaknesses of the tools you are using. Most platforms improve over time and sometimes you have to re-evaluate your opinion regarding some technology.

I prefer C++ to C for some time now and did not look back yet. But I also constantly try different languages, platforms and frameworks and try to maintain a balanced view. There are often good reasons to choose one over the other for a particular project.