4 Tips for better CMake

We are doing one of those list posts again! This time, I will share some tips and insights on better CMake. Number four will surprise you! Let’s hop right in:

Tip #1

model dependencies with target_link_libraries

I have written about this before, and this is still my number one tip on CMake. In short: Do not use the old functions that force properties down the directory hierarchy, such as include_directories. Instead, set properties on the targets via target_link_libraries and its siblings target_compile_definitions, target_include_directories and target_compile_options, and “inherit” those properties from other targets via target_link_libraries.
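
A minimal sketch of what that looks like in practice (the target, path and definition names are made up):

# bar/CMakeLists.txt: a library that declares its own usage requirements
add_library(bar STATIC bar.cpp)
target_include_directories(bar PUBLIC ${CMAKE_CURRENT_SOURCE_DIR}/include)
target_compile_definitions(bar PUBLIC BAR_WITH_LOGGING)

# app/CMakeLists.txt: linking against bar "inherits" its include dirs and definitions
add_executable(app main.cpp)
target_link_libraries(app PRIVATE bar)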

Tip #2

always use find_package with REQUIRED

Sure, having optional dependencies is nice, but omitting REQUIRED is not the way you want to do it. In the worst case, some of your features will just not work if those packages are not found, with no explanation whatsoever. Instead, use explicit feature toggles (e.g. using option()) that either skip the find_package call or use it with REQUIRED, so the user will know that another library is needed for this feature.
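
A sketch of such a feature toggle; the option, target and package names are just examples:

option(WITH_PNG_EXPORT "Enable PNG export (requires libpng)" OFF)

if(WITH_PNG_EXPORT)
  # fail loudly if the feature was requested but the dependency is missing
  find_package(PNG REQUIRED)
  target_link_libraries(exporter PRIVATE PNG::PNG)
  target_compile_definitions(exporter PRIVATE HAVE_PNG_EXPORT)
endif()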

Tip #3

follow the physical project structure

You want your build setup to be as straightforward as possible. One way to simplify it is to follow the file system and the artifact structure of your code. That way, you only have one structure to maintain. Use one “top level” file that does your global configuration, e.g. find_package calls and CPack configuration, and then only defers to subdirectories via add_subdirectory. Only for direct subdirectories though: if you need extra levels, those levels should have their own CMake files. Then build exactly one artifact (e.g. add_executable or add_library) per leaf folder.
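
For illustration, a top-level file along those lines could look like this (project and directory names are placeholders):

# ./CMakeLists.txt: global configuration only
cmake_minimum_required(VERSION 3.10)
project(example)

find_package(Threads REQUIRED)

# defer to direct subdirectories, which build one artifact per leaf folder
add_subdirectory(src)
add_subdirectory(test)

include(CPack)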

Tip #4

make install() an option()

It is often desirable to include other libraries directly into your build process. For example, we usually do this with googletest for our unit tests. However, if you do that and use your install target, it will also install the googletest headers. That is usually not what you want! Some libraries handle this automagically by only doing the install() calls when they are the top-level project. Similar to the find_package tip above, I like to do this with an option() for explicit user control!
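
A sketch of that pattern, again with made-up names:

option(FOO_INSTALL "Generate install rules for foo" ON)

add_library(foo STATIC foo.cpp)
target_include_directories(foo PUBLIC include)

if(FOO_INSTALL)
  install(TARGETS foo DESTINATION lib)
  install(DIRECTORY include/ DESTINATION include)
endif()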

Generating done

That is it for today! I hope this helps and we will all see better CMake code in the future.

OPC Basics

OPC (Open Platform Communications) is a machine-to-machine communication protocol for industrial automation. In its simplest form it works like this:

An OPC server defines a set of variables within a directory-tree-like hierarchy forming namespaces. Each variable has a data type (like integer, boolean, real or string) and a default value.

One or many OPC clients connect to the OPC server via a TCP-based binary protocol, usually on port 4840. The clients can read and write the OPC variables provided by the server. Clients can also monitor OPC variables for changes so that you don’t have to poll them. In code this is usually done by registering a callback function that gets executed when the monitored variable changes.

Handshaking

A simple communication pattern between two OPC clients that we have used in OPC based interfaces is a handshake. This can either be a two-way handshake or a three-way handshake. The two-way handshake is in fact just an acknowledgement: One OPC client sets the value of a variable, the other client reads the variable and resets the value to the default value to confirm that it has read the variable. If you do not want to use the default value to indicate a read confirmation you can also use another variable as a confirmation flag. In a three-way handshake the first client also confirms the confirmation.
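
To make the two-way handshake more tangible, here is a minimal sketch of the acknowledging client, written against the open62541 client API (assuming version 1.x); the endpoint URL and the node id of the handshake variable are made up:

#include <open62541/client.h>
#include <open62541/client_config_default.h>
#include <open62541/client_highlevel.h>

int main()
{
  UA_Client* client = UA_Client_new();
  UA_ClientConfig_setDefault(UA_Client_getConfig(client));
  if (UA_Client_connect(client, "opc.tcp://localhost:4840") != UA_STATUSCODE_GOOD)
    return -1;

  // hypothetical handshake variable in namespace 2
  UA_NodeId flag = UA_NODEID_STRING(2, (char*)"Handshake.Request");

  // read the value that the first client has set ...
  UA_Variant value;
  UA_Variant_init(&value);
  if (UA_Client_readValueAttribute(client, flag, &value) == UA_STATUSCODE_GOOD)
  {
    // ... and reset it to its default value to confirm that it was read
    UA_Boolean defaultValue = false;
    UA_Variant ack;
    UA_Variant_init(&ack);
    UA_Variant_setScalar(&ack, &defaultValue, &UA_TYPES[UA_TYPES_BOOLEAN]);
    UA_Client_writeValueAttribute(client, flag, &ack);
  }
  UA_Variant_clear(&value);

  UA_Client_disconnect(client);
  UA_Client_delete(client);
  return 0;
}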

OPC UA

The current specification of OPC is OPC UA (Unified Architecture) by the OPC Foundation. It covers a lot more functionality than what is described above. It’s a unified successor to various OPC Classic specifications like OPC DA, A&E and HDA. If you want to get started with OPC UA development you can use one of the many client and server SDKs and toolkits for various programming languages.

Functional tests for Grails with Geb and geckodriver

Previously we had many functional tests using the selenium-rc plugin for Grails. Many were initially recorded using Selenium IDE, then refactored to be more maintainable. These refactorings introduced “driver” objects used to interact with common elements on the pages and runners which improved the API for walking through a multipage process.

Selenium-rc was deprecated quite a while ago and its Firefox support broke every once in a while. Finally, we were forced to migrate to the current state of the art in Grails functional testing: Geb.

Generally I can say it is really a major improvement over the old Selenium API. The page concept is similar to our own drivers, with some nice features:

  • At-Checkers provide a standardized way of checking if we are at the expected page
  • Default and custom per page timeouts using atCheckWaiting
  • Specification of relevant content elements using a jQuery-like syntax and support for CSS selectors (see the page sketch after this list)
  • The so-called modules ease the interaction with form elements and the like
  • Much better error messages
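
To give an impression of the page concept, here is a minimal sketch of a Geb page; the page name, URL and selectors are made up:

import geb.Page

class LoginPage extends Page {
  static url = "login"

  // at-checker: verifies that the browser really is on this page
  static at = { title == "Login" }

  // content elements declared with the jQuery-like $() syntax
  static content = {
    userField { $("input", name: "username") }
    passwordField { $("input", name: "password") }
    loginButton { $("button", type: "submit") }
  }
}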

While Geb is a real improvement over Selenium, it comes with some quirks. Here is some advice that may help you use Geb successfully in the context of your (Grails) web application.

Cross-platform testing in Grails

Geb (or more specifically the underlying WebDriver component) requires a geckodriver binary to work correctly with Firefox. This binary is naturally platform-dependent. We have a setup with mostly Windows machines for the developers and Linux build slaves and target systems. So we need binaries for all required platforms and have to configure them accordingly. We simply put them into a folder in our project and added the following configuration to the test environment in Config.groovy:

environments {
  test {
    def basedir = new File(new File('.', 'infrastructure'), 'testing')
    def geckodriver = 'geckodriver'
    if (System.properties['os.name'].toLowerCase().contains('windows')) {
      geckodriver += '.exe'
    }
    System.setProperty('webdriver.gecko.driver', new File(basedir, geckodriver).canonicalPath)
  }
}

Problems with File-Uploads

If you are plagued with file uploads not working, it may be a problem with certain Firefox versions. Even though the fix landed in Firefox 56, I want to mention the workaround in case you still experience problems. Add the following to your GebConfig.groovy:

driver = {
  FirefoxProfile profile = new FirefoxProfile()
  // Workaround for issue https://github.com/mozilla/geckodriver/issues/858
  profile.setPreference('dom.file.createInChild', true)
  new FirefoxDriver(profile)
}

Minor drawbacks

While the Geb DSL is quite readable and allows concise tests, the IDE support is not there yet. You do not get much code assist when writing the tests and calling functions of the page objects, unlike in our own, code-based solution.

Conclusion

After taking the first few hurdles, writing new functional tests with Geb really feels good and is a breeze compared to our old Selenium tests. Converting the old tests will be a lot of work and will only happen on a case-by-case basis, but our coverage with Geb is ever increasing.

How I start a project – the next steps

After the initial meetings with the project stakeholders, my next step is to get a big picture of the processes the project tries to improve. Every (enterprise) software system implements processes or workflows stemming from the business side. Since I want to improve the work for the people using the software, I take the users’ perspective and try to understand and describe the processes from their point of view.
For this task my go-to tool is the user journey map. My first draft starts with a handful of steps outlining the main functions performed by the major actors. These are normally just 5–7 steps and serve as an overview and communication starter:

Often several users or other systems interact within one process. It is important to concentrate on the big picture, the big steps of the process. These are the ones you need to get right; details and deviations from the process come after that. The point of all the drawings and documents I use is to foster a shared understanding between the stakeholders (including the users) and the team. We as a whole need to talk about the same things with the same language. If one of my steps has a different name than the one the stakeholders use, we have to rename it to match theirs. This is crucial. The same applies to the names used in the user interface. These must be the ones used by the users, not some internal words, not even synonyms.
From the big picture I iterate through the assumptions, getting more detailed along the way. Assumptions can make or break a project. Even if I am pretty sure, I need to verify them, which often means asking or observing the users. The method for getting a good answer depends on the kind of assumption. But I need to verify. I usually use a wiki to record assumptions in the form of open questions like:

  • which device will be used for this process?
  • will step 3 always be the next step?
  • how does the user hold the device?

These questions get more detailed over the course of the project. Even when implementing the solution in code, questions will arise. These need to be verified. Sometimes a question can be answered by the defined goals: which solution helps the users reach their goals? If I cannot answer the question from the collected information, I need to ask, test it with prototypes, observe the users or do some research.
Usually along the way I use a prototype, mockup or demo to show how I understand the problem of the users. Again, this is a great point for building a shared understanding. Sometimes I need a picture, and often a simulation of the steps and actions involved. These interactive prototypes help to spark discussions about details I didn’t think of in the first place. Details that matter. Details that, when overlooked or seen too late, can delay or break the project. While things are still in demo or prototype form, the stakeholders find it easier to discuss them and the amount of work done is minimal. These demos evolve over time and are the basis for the solution later on. This does not mean that I reuse the code, but I reuse the insights gained from them.
Prototypes are also great for spikes: evaluating a solution, checking the feasibility of a technology or testing assumptions. Prototypes are not only for the start of the project but can be used throughout. Sometimes something has to be tested quickly without compromising the application as a whole. Here small, separate prototypes can be used.

Recap of the Schneide Dev Brunch 2017-10-08

Last Sunday, we held another Schneide Dev Brunch, a regular brunch on the second Sunday of every other (even) month, only that all attendees want to talk about software development and various other topics. This brunch was smaller, which enabled us to use the meeting table with some comfort. Sometimes, with many attendees and bad weather, this table can get a little bit crowded. As usual, the main theme was that if you bring a software-related topic along with your food, everyone has something to share. Because we were a small group, we discussed without an agenda. As usual, a lot of topics and chatter were exchanged. This recapitulation tries to highlight the main topics of the brunch, but cannot reiterate everything that was spoken. If you were there, you will probably find this list inconclusive:

Area of training

We shared some stories about top-notch video game players and how they keep up with the demands of staying competitive. Similar stories can be told about every topic imaginable: What did the “king of the hill” do to rise to such levels? The answer is always: training. Excessive, brutal training. They are first in the gym and lock the door late at night because they are the last ones out, too. They use every waking second for practice and repetition. They are obsessed with being “the best”. If you want to follow such a story in movie format, you might enjoy “Whiplash”, a movie about an aspiring expert drummer that also highlights the delicate trainer/trainee dynamics. If you are more interested in the strategies of obtaining mastery, the book “Mastery” by Robert Greene will give you a lot of insights.

With this background, we asked around what our area of training (not expertise, not mastery – just training) is. The answers varied wildly, from the obvious “programming” to “whisky” (as in whisky tasting and collecting whisky). It’s an interesting question: what goal are you actively pursuing at the moment?

Hacking challenges

Evolving from the first topic, we talked about coding challenges and “capture the flag” hacking contests. If you aren’t the grandmaster in the area of the contest, you’ll get the most out of it by following the participating teams and trying to understand their approaches. The local security capture-the-flag team of the KIT is especially open with their approach, their failures and successes. You might want to check out their website.

One challenge included trying to break a whitebox encryption scheme, which is an interesting topic in itself. Maybe somebody can read up on it and give a little presentation in the future. Another challenge seemed to call for an elaborate buffer overflow attack, when in reality it could be solved with a “simple” use-after-free attack.

A useful starting point for aspiring security hackers might be the CTF (capture the flag) field guide. There are also some online challenges for basic training purposes, like the cryptopals or the bandit wargame. Thanks Tobias for the links!

If you are more interested in playful challenges and don’t want to show up on somebody’s radar, programming/hacking games like TIS-100 are perfect for you. Our game night with TIS-100 is still in vivid memory.

Software Architecture training

There are a lot of programming contests and hacking challenges out there. But what about dry-run training for software architects? On a related note, there are hundreds of training simulators for the foot soldier (first-person shooters), but few games for the aspiring officer/general. The website Armchair General lists a few and even has some contests, but they lack the depth of real experience. Similarly, training for software architects will probably remain a clean-room exercise, when in reality the customer needs, the team mood, the latest fad in technology and even the weather will influence the architecture just as much as textbook knowledge.

We couldn’t discuss this topic to its full potential, so it will re-appear on the agenda of the next Dev Brunch. And it’s open for discussion in the comments: What are good books and trainings for software architects?

Thomas pointed us to the Architectural Katas by Ted Neward. Perhaps we should schedule a Schneide Event to try them out?

Epilogue

As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The number of attendees makes for a unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei in December. We even have some topics still on the agenda (like a report about first-hand experiences with the programming language Rust). And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.

Keeping connections alive with libcurl

libcurl is quite a comfortable option to transfer files across a variety of network protocols, e.g. HTTP, FTP and SFTP.

It’s really easy to get started: downloading a single file via HTTP or FTP takes only a couple of lines.

Drip, drip..

But as with most powerful abstractions, it is a bit leaky. While it does an excellent job of hiding such steps as name resolution and authentication, these steps still “leak out” by increasing the overall run-time.

In our case, we had five dozen FTP servers and we needed to repeatedly download small files from all of them. To make matters worse, we only had a small time window of 200ms for each transfer.

Now, FTP is not the simplest of protocols. Essentially, it requires the client to establish a TCP control connection, which it uses to negotiate a second data connection and to initiate file transfers.

This initial setup phase needs a lot of back and forth between server and client. Naturally, this is quite slow. Ideally, you would want to do the connection setup once and keep both the control and the data connection open for subsequent transfers.

libcurl does not explicitly expose the concept of an active connection. Hence you cannot tell the library not to disconnect it. In a naive implementation, you would download multiple files by simply creating an easy session object for each file transfer:

for (auto file : FILE_LIST)
{
  std::vector<uint8_t> buffer;
  auto curl = curl_easy_init();
  if (!curl)
    return -1;
  auto url = (SERVER+file);
  curl_easy_setopt(curl, CURLOPT_URL,
    url.c_str());
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION,
    appendToVector);
  curl_easy_setopt(curl, CURLOPT_WRITEDATA,
    &buffer);
  if (curl_easy_perform(curl) != CURLE_OK)
    return -1;

  process(buffer);
  curl_easy_cleanup(curl);
}

That does indeed reset the connection for every single file.
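
The appendToVector callback used above is not part of libcurl; a possible implementation matching the CURLOPT_WRITEFUNCTION signature could look like this:

size_t appendToVector(char* contents, size_t size, size_t nmemb, void* userData)
{
  auto& buffer = *static_cast<std::vector<uint8_t>*>(userData);
  auto const* begin = reinterpret_cast<uint8_t const*>(contents);
  buffer.insert(buffer.end(), begin, begin + size * nmemb);
  return size * nmemb; // tell libcurl that all bytes were consumed
}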

Re-use!

However, libcurl can actually keep the connection open as part of a connection re-use mechanism in the session object. This is documented with the function curl_easy_perform. If you simply hoist the easy session object out of the loop, it will no longer disconnect between file transfers:

auto curl = curl_easy_init();
if (!curl)
  return -1;

for (auto file : FILE_LIST)
{
  std::vector<uint8_t> buffer;
  auto url = (SERVER+file);
  curl_easy_setopt(curl, CURLOPT_URL, 
    url.c_str());
  curl_easy_setopt(curl, CURLOPT_WRITEFUNCTION, 
    appendToVector);
  curl_easy_setopt(curl, CURLOPT_WRITEDATA, 
    &buffer);
  if (curl_easy_perform(curl) != CURLE_OK)
    return -1;

  process(buffer);
}
curl_easy_cleanup(curl);

libcurl will now cache the active connection in the session object, provided the files are actually on the same server. This improved the download timings of our bulk transfers from 130–260 ms down to 30–40 ms, quite an enormous gain. The timings now fit into our 200 ms time window comfortably.

Using PostgreSQL with Entity Framework

The most widespread O/R (object-relational) mapper for the .NET platform is the Entity Framework. It is most often used in combination with Microsoft SQL Server as the database. But the architecture of the Entity Framework allows it to be used with other databases as well. A popular and reliable open-source SQL database is PostgreSQL. This article shows how to use a PostgreSQL database with the Entity Framework.

Installing the Data Provider

First you need an Entity Framework data provider for PostgreSQL. It is called Npgsql. You can install it via NuGet. If you use Entity Framework 6, the package is called EntityFramework6.Npgsql:

> Install-Package EntityFramework6.Npgsql

If you use Entity Framework Core for the new .NET Core platform, you have to install a different package:

> Install-Package Npgsql.EntityFrameworkCore.PostgreSQL

Configuring the Data Provider

The next step is to configure the data provider and the database connection string in the App.config file of your project, for example:

<configuration>
  <!-- ... -->

  <entityFramework>
    <providers>
      <provider invariantName="Npgsql"
         type="Npgsql.NpgsqlServices, EntityFramework6.Npgsql" />
    </providers>
  </entityFramework>

  <system.data>
    <DbProviderFactories>
      <add name="Npgsql Data Provider"
           invariant="Npgsql"
           description="Data Provider for PostgreSQL"
           type="Npgsql.NpgsqlFactory, Npgsql"
           support="FF" />
    </DbProviderFactories>
  </system.data>

  <connectionStrings>
    <add name="AppDatabaseConnectionString"
         connectionString="Server=localhost;Database=postgres"
         providerName="Npgsql" />
  </connectionStrings>

</configuration>

Possible parameters in the connection string are Server, Port, Database, User Id and Password. Here’s an example connection string using all parameters:

Server=192.168.0.42;Port=5432;Database=mydatabase;User Id=postgres;Password=topsecret

The database context class

To use the configured database you create a database context class in the application code:

class AppDatabase : DbContext
{
  private readonly string schema;

  public AppDatabase(string schema)
    : base("AppDatabaseConnectionString")
  {
    this.schema = schema;
  }

  public DbSet<User> Users { get; set; }

  protected override void OnModelCreating(DbModelBuilder builder)
  {
    builder.HasDefaultSchema(this.schema);
    base.OnModelCreating(builder);
  }
}

The parameter to the base constructor call is the name of the connection string configured in App.config. In this example the method OnModelCreating is overridden to set the name of the used schema. Here the schema name is injected via the constructor. For PostgreSQL the default schema is called “public”:

using (var db = new AppDatabase("public"))
{
  var admin = db.Users.First(user => user.UserName == "admin");
  // ...
}

The Entity Framework mapping of entity names and properties is case-sensitive. To make the mapping work, you have to preserve the case when creating the tables by putting the table and column names in double quotes:

create table public."Users" ("Id" bigserial primary key, "UserName" text not null);
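
For reference, a matching entity class could look like this; the property names simply mirror the quoted column names above:

public class User
{
  public long Id { get; set; }         // "Id" bigserial primary key
  public string UserName { get; set; } // "UserName" text not null
}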

With these basics you’re now set up to use PostgreSQL in combination with the Entity Framework.