C++ Coroutines on Windows with the Fiber API

Last week, I had the chance to try out coroutines as a way to cooperatively interleave long tasks with event-processing. Unlike threads, where interaction between threads can happen at any time, coroutines have to yield control explicitly, which arguably makes synchronisation a little simpler, especially in (legacy) systems that are not designed for concurrency. Of course, since coroutines do not run at the same time, you do not get the perks of concurrency either.

If you don’t know coroutines, think of them as functions that can be paused and resumed.

Unlike many other languages, C++ does not have built-in support for coroutines just yet. There are, however, several alternatives. On Windows, you can use the Fiber API to implement coroutines easily.

Here’s some example code of how that works:

auto coroutine = std::make_shared<FiberCoroutine>();
coroutine->setup([](Coroutine::Yield yield)
{
  for (int i=0; i<3; ++i)
  {
    std::cout << "Coroutine "
              << i << std::endl;
    yield();
  }
});

int stepCount = 0;
while (coroutine->step())
{
  std::cout << "Main "
            << stepCount++ << std::endl;
}

Somewhat surprisingly, at least if you have never seen coroutines before, this will interleave the two outputs:

Coroutine 0
Main 0
Coroutine 1
Main 1
Coroutine 2
Main 2

Interface

Since fibers are not the only way to implement coroutines and since we want to keep our client code nicely insulated from the Windows API, there’s a pure-virtual base class as an interface:

class Coroutine
{
public:
  using Yield = std::function<void()>;
  using Run = std::function<void(Yield)>;

  virtual ~Coroutine() = default;
  virtual void setup(Run f) = 0;
  virtual bool step() = 0;
};

Typically, creation of a Coroutine type allocates all the resources it needs, while setup “primes” it with an inner “Run” function that can use an implementation-specific “Yield” function to pass control back to the caller, which is whoever calls step.

Implementation

The implementation using fibers is fairly straightforward:

class FiberCoroutine
  : public Coroutine
{
public:
  FiberCoroutine()
  : mCurrent(nullptr), mRunning(false)
  {
  }

  ~FiberCoroutine()
  {
    if (mCurrent)
      DeleteFiber(mCurrent);
  }

  void setup(Run f) override
  {
    if (!mMain)
    {
      mMain = ConvertThreadToFiber(NULL);
    }
    mRunning = true;
    mFunction = std::move(f);

    if (!mCurrent)
    {
      mCurrent = CreateFiber(0,
        &FiberCoroutine::proc, this);
    }
  }

  bool step() override
  {
    SwitchToFiber(mCurrent);
    return mRunning;
  }

  void yield()
  {
    SwitchToFiber(mMain);
  }

private:
  void run()
  {
    while (true)
    {
      mFunction([this]
                { yield(); });
      mRunning = false;
      yield();
    }
  }

  static VOID WINAPI proc(LPVOID data)
  {
    reinterpret_cast<FiberCoroutine*>(data)->run();
  }

  static LPVOID mMain;
  LPVOID mCurrent;
  bool mRunning;
  Run mFunction;
};

LPVOID FiberCoroutine::mMain = nullptr;

The idea here is that the caller and the callee are both fibers: lightweight threads without concurrency. Running the coroutine switches to the callee’s fiber (the one running the run function), and yielding switches back to the caller. Note that it is currently assumed that all callers are on the same thread, since each thread that participates in the switching needs to be converted to a fiber initially, even the caller. The current version only keeps the fiber for the initial thread in a single static variable. However, it should be possible to support multiple threads by replacing that single static fiber pointer with a map from each thread to its associated fiber.
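
A minimal sketch of that idea, assuming the registry guards itself with a mutex; the function name mainFiberForCurrentThread is made up for this illustration, while GetCurrentThreadId and ConvertThreadToFiber are the real Win32 calls:

#include <windows.h>
#include <mutex>
#include <unordered_map>

// Hedged sketch: one "main" fiber per calling thread instead of the single static mMain.
static LPVOID mainFiberForCurrentThread()
{
  static std::mutex mutex;
  static std::unordered_map<DWORD, LPVOID> mainFibers;

  std::lock_guard<std::mutex> lock(mutex);
  auto& fiber = mainFibers[GetCurrentThreadId()];
  if (!fiber)
    fiber = ConvertThreadToFiber(NULL); // convert each calling thread exactly once
  return fiber;
}

step would then record the result of mainFiberForCurrentThread() before switching, and yield would switch back to that recorded fiber instead of the static mMain.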

Note that you cannot return from the fiber procedure – that would just terminate the whole thread! Instead, yield back to the caller and either re-use or destroy the fiber.

Assessment

Fiber-based coroutines are a nice and efficient way to model non-linear control-flow explicitly, but they do not come without downsides. For example, while this example worked flawlessly when compiled with Visual Studio, Cygwin just terminates without even an error. If you’re used to working with the Visual Studio debugger, it may surprise you that the caller gets hidden completely while you’re in the run function. The run function’s stack completely replaces the caller’s stack until you call yield(). This means that you cannot find out who called step(). On the other hand, if you’re actually doing a lot of processing in the run function, this is quite nice for profiling, as the “processing” call tree seemingly gets its own root in the call-tree.

I just wish the Visual Studio debugger had a way to view the states of the different fibers like it has for threads.

Alternatives

  • On Linux, you can use the ucontext API.
  • Visual Studio 2015 also has another, newer, implementation.
  • Coroutines can be implemented using threads and condition-variables (see the sketch after this list).
  • There’s also Boost.Coroutine, if you need an independent implementation of the concept. From what I gather, they only use Fibers optionally, and otherwise do the required “trickery” themselves. Maybe this even keeps the caller-stack visible – it is certainly worth exploring.
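
As a hedged illustration of the threads-and-condition-variables approach, here is a minimal sketch that implements the same Coroutine interface portably. The class name ThreadCoroutine and the turn-based hand-over protocol are assumptions of this sketch, not code from any of the libraries mentioned above:

#include <condition_variable>
#include <mutex>
#include <thread>

class ThreadCoroutine : public Coroutine
{
public:
  ~ThreadCoroutine()
  {
    // Caveat of this sketch: joining only works if the coroutine has run to
    // completion; a real implementation needs a cancellation hand-shake here.
    if (mThread.joinable())
      mThread.join();
  }

  void setup(Run f) override
  {
    mThread = std::thread([this, f]
    {
      waitForTurn(Turn::Callee);
      f([this] { handOver(Turn::Caller); }); // yield: give control back to step()
      mRunning = false;
      giveTurn(Turn::Caller);
    });
  }

  bool step() override
  {
    handOver(Turn::Callee); // run the coroutine until it yields or finishes
    return mRunning;
  }

private:
  enum class Turn { Caller, Callee };

  void giveTurn(Turn next)
  {
    {
      std::lock_guard<std::mutex> lock(mMutex);
      mTurn = next;
    }
    mSignal.notify_one();
  }

  void waitForTurn(Turn mine)
  {
    std::unique_lock<std::mutex> lock(mMutex);
    mSignal.wait(lock, [this, mine] { return mTurn == mine; });
  }

  void handOver(Turn next)
  {
    Turn mine = (next == Turn::Callee) ? Turn::Caller : Turn::Callee;
    giveTurn(next);
    waitForTurn(mine);
  }

  std::thread mThread;
  std::mutex mMutex;
  std::condition_variable mSignal;
  Turn mTurn = Turn::Caller;
  bool mRunning = true;
};

Only one of the two threads runs at any given time, so the cooperative semantics are the same as with fibers; you just pay for a real thread and two context switches per step.
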
    Generating a spherified cube in C++

    In my last post, I showed how to generate an icosphere, a subdivided icosahedron, without any fancy data-structures like the half-edge data-structure. Someone in the reddit discussion on my post mentioned that a spherified cube is also nice, especially since it naturally lends itself to a relatively nice UV-map.

    The old algorithm

    The exact same algorithm from my last post can easily be adapted to generate a spherified cube, just by starting on different data.

    [image: the initial cube mesh]
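
    The post does not list the starting data for the cube. Here is a minimal sketch of what such an initial mesh could look like, reusing the Triangle/VertexList types from the icosphere post; the corner coordinates and the counter-clockwise (seen from outside) winding are assumptions of this sketch:

    namespace cube
    {
    // Corners of a cube, already pushed onto the unit sphere (1/sqrt(3) per component).
    const float N=0.577350269f;
    
    static const VertexList vertices=
    {
      {-N,-N,-N}, {N,-N,-N}, {N,N,-N}, {-N,N,-N},
      {-N,-N,N}, {N,-N,N}, {N,N,N}, {-N,N,N}
    };
    
    // Two triangles per face, wound counter-clockwise as seen from outside the cube.
    static const TriangleList triangles=
    {
      {0,2,1},{0,3,2}, // -Z
      {4,5,6},{4,6,7}, // +Z
      {0,1,5},{0,5,4}, // -Y
      {3,7,6},{3,6,2}, // +Y
      {0,4,7},{0,7,3}, // -X
      {1,2,6},{1,6,5}  // +X
    };
    }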

    After 3 steps of subdivision with the old algorithm, that cube will be transformed into this:

    [image: the cube after three subdivision steps with the old 4-way split]

    Slightly adapted

    If you look closely, you will see that the triangles in this mesh are a bit uneven. The vertical lines on the yellow side seem to curve around a bit. This is because, unlike in the icosahedron, the triangles in the initial box mesh are far from equilateral. The four-way split does not work very well with this.

    One way to improve the situation is to use an adaptive two-way split instead:
    [image: the adaptive 2-way split]

    Instead of splitting all three edges, we’ll only split one. The adaptive part here is that the edge we’ll split is always the longest that appears in the triangle, therefore avoiding very long edges.

    Here’s the code for that. The only tricky part is the modulo-counting to get the indices right. The vertex_for_edge function does the same thing as last time: providing a vertex for subdivision while keeping the mesh connected in its index structure.

    TriangleList
    subdivide_2(ColorVertexList& vertices,
      TriangleList triangles)
    {
      Lookup lookup;
      TriangleList result;
    
      for (auto&& each:triangles)
      {
        auto edge=longest_edge(vertices, each);
        Index mid=vertex_for_edge(lookup, vertices,
          each.vertex[edge], each.vertex[(edge+1)%3]);
    
        result.push_back({each.vertex[edge],
          mid, each.vertex[(edge+2)%3]});
    
        result.push_back({each.vertex[(edge+2)%3],
          mid, each.vertex[(edge+1)%3]});
      }
    
      return result;
    }
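
    The longest_edge helper is not shown in the post. Here is a minimal sketch of what it could look like, assuming the entries of ColorVertexList behave like the v3 from the icosphere post (they can be subtracted) and that a length_squared helper exists; both assumptions are mine, not the post’s:

    int longest_edge(ColorVertexList& vertices, Triangle const& triangle)
    {
      int longest=0;
      float longest_length=0.f;
      for (int edge=0; edge<3; ++edge)
      {
        auto& from=vertices[triangle.vertex[edge]];
        auto& to=vertices[triangle.vertex[(edge+1)%3]];
        float length=length_squared(to-from); // assumed helper: squared edge length
        if (length>longest_length)
        {
          longest_length=length;
          longest=edge;
        }
      }
      return longest;
    }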
    

    Now the result looks a lot more even:
    [image: the spherified cube generated with the 2-way split]

    Note that this algorithm only doubles the triangle count per iteration, so you might want to execute it twice as often as the four-way split.

    Alternatives

    Instead of using this generic triangle-based subdivision, it is also possible to generate the six sides as subdivided patches, as suggested in this article. That approach works naturally if you want to have seams between your six sides. However, it is more specialized towards this particular geometry and will require extra “stitching” if you do not want seams.

    Code

    The code for both the icosphere and the spherified cube is now on github: github.com/softwareschneiderei/meshing-samples.

    Generating an Icosphere in C++

    If you want to render a sphere in 3D, for example in OpenGL or DirectX, it is often a good idea to use a subdivided icosahedron. That usually works better than the “UVSphere”, which simply tessellates the sphere by longitude and latitude: the triangles of an icosphere are distributed a lot more evenly over the final sphere. Unfortunately, the easiest way to generate such a sphere, it seems, is to do it in a 3D editing program. But loading that into your application requires a 3D file format parser. That’s a lot of overhead if you really just need the sphere, so doing it programmatically is preferable.

    At this point, many people will just settle for the UVSphere since it is easy to generate programmatically. Especially since generating the sphere as an indexed mesh without vertex-duplicates further complicates the problem. But it is actually not much harder to generate the icosphere!
    Here I’ll show some C++ code that does just that.

    C++ Implementation

    We start with a hard-coded indexed-mesh representation of the icosahedron:

    struct Triangle
    {
      Index vertex[3];
    };
    
    using TriangleList=std::vector<Triangle>;
    using VertexList=std::vector<v3>;
    
    namespace icosahedron
    {
    const float X=.525731112119133606f;
    const float Z=.850650808352039932f;
    const float N=0.f;
    
    static const VertexList vertices=
    {
      {-X,N,Z}, {X,N,Z}, {-X,N,-Z}, {X,N,-Z},
      {N,Z,X}, {N,Z,-X}, {N,-Z,X}, {N,-Z,-X},
      {Z,X,N}, {-Z,X, N}, {Z,-X,N}, {-Z,-X, N}
    };
    
    static const TriangleList triangles=
    {
      {0,4,1},{0,9,4},{9,5,4},{4,5,8},{4,8,1},
      {8,10,1},{8,3,10},{5,3,8},{5,2,3},{2,7,3},
      {7,10,3},{7,6,10},{7,11,6},{11,0,6},{0,1,6},
      {6,1,10},{9,0,11},{9,11,2},{9,2,5},{7,2,11}
    };
    }
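
    The Index and v3 types and the normalize function are used throughout but not defined in the post. A minimal sketch of definitions that make the snippets compile, assuming a plain three-component vector (the original may well use a maths library instead):

    #include <cstdint>
    #include <cmath>
    
    using Index=std::uint32_t;
    
    struct v3
    {
      float x, y, z;
    };
    
    inline v3 operator+(v3 lhs, v3 rhs)
    {
      return {lhs.x+rhs.x, lhs.y+rhs.y, lhs.z+rhs.z};
    }
    
    inline v3 normalize(v3 value)
    {
      float length=std::sqrt(value.x*value.x+value.y*value.y+value.z*value.z);
      return {value.x/length, value.y/length, value.z/length};
    }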
    

    [image: the initial icosahedron]
    Now we iteratively replace each triangle in this icosahedron by four new triangles:

    [image: one 4-way subdivision step]

    Each edge in the old model is subdivided, and the resulting vertex is moved onto the unit sphere by normalization. The key here is not to duplicate the newly created vertices. This is done by keeping a lookup from each edge to the new vertex it generates. Note that the orientation of the edge does not matter here, so we need to normalize the edge direction for the lookup. We do this by forcing the lower index first. Here’s the code that either creates or reuses the vertex for a single edge:

    using Lookup=std::map<std::pair<Index, Index>, Index>;
    
    Index vertex_for_edge(Lookup& lookup,
      VertexList& vertices, Index first, Index second)
    {
      Lookup::key_type key(first, second);
      if (key.first>key.second)
        std::swap(key.first, key.second);
    
      auto inserted=lookup.insert({key, vertices.size()});
      if (inserted.second)
      {
        auto& edge0=vertices[first];
        auto& edge1=vertices[second];
        auto point=normalize(edge0+edge1);
        vertices.push_back(point);
      }
    
      return inserted.first->second;
    }
    

    Now you just need to do this for all the edges of all the triangles in the model from the previous iteration:

    TriangleList subdivide(VertexList& vertices,
      TriangleList triangles)
    {
      Lookup lookup;
      TriangleList result;
    
      for (auto&& each:triangles)
      {
        std::array<Index, 3> mid;
        for (int edge=0; edge<3; ++edge)
        {
          mid[edge]=vertex_for_edge(lookup, vertices,
            each.vertex[edge], each.vertex[(edge+1)%3]);
        }
    
        result.push_back({each.vertex[0], mid[0], mid[2]});
        result.push_back({each.vertex[1], mid[1], mid[0]});
        result.push_back({each.vertex[2], mid[2], mid[1]});
        result.push_back({mid[0], mid[1], mid[2]});
      }
    
      return result;
    }
    
    using IndexedMesh=std::pair<VertexList, TriangleList>;
    
    IndexedMesh make_icosphere(int subdivisions)
    {
      VertexList vertices=icosahedron::vertices;
      TriangleList triangles=icosahedron::triangles;
    
      for (int i=0; i<subdivisions; ++i)
      {
        triangles=subdivide(vertices, triangles);
      }
    
      return{vertices, triangles};
    }
    

    There you go, a custom-subdivided icosphere!
    [image: the resulting icosphere]

    Performance

    Of course, this implementation is not the most runtime-efficient way to get the icosphere. But it is decent and very simple. Its performance depends mainly on the type of lookup used. I used a map instead of an unordered_map here for brevity, only because there’s no premade hash function for a std::pair of indices. In practice, you would almost always use a hash-map or some kind of spatial structure, such as a grid, which makes this method a lot harder to beat. And it is certainly fast enough for most applications!
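
    If you do want the hash-map, a minimal sketch of a suitable hash for the index pair could look like this; the PairHash name and the bit-packing scheme are assumptions of this sketch (it assumes 32-bit indices and a 64-bit std::size_t):

    #include <unordered_map>
    
    struct PairHash
    {
      std::size_t operator()(std::pair<Index, Index> const& key) const
      {
        // The pair is already normalized to "lower index first" by vertex_for_edge,
        // so packing both indices into one 64-bit value gives a collision-free hash.
        return (static_cast<std::size_t>(key.first)<<32)
          | static_cast<std::size_t>(key.second);
      }
    };
    
    using Lookup=std::unordered_map<std::pair<Index, Index>, Index, PairHash>;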

    The general pattern

    The lookup-or-create pattern used in this code is very useful when creating indexed-meshes programmatically. I’m certainly not the only one who discovered it, but I think it needs to be more widely known. For example, I’ve used it when extracting voxel-membranes and isosurfaces from volumes. It works very well whenever you are creating your vertices from some well-defined parameters. Usually, it’s some tuple that describes the edge you are creating the vertex on. This is the case with marching cubes or marching tetrahedrons. It can, however, also be grid coordinates if you sparsely generate vertices on a grid, for example when meshing heightmaps.
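
    As a hedged illustration of that last point, a sparse heightmap mesher could key the same lookup-or-create pattern on grid coordinates instead of edges; the names and the vertex layout (height in the y component) are assumptions of this sketch:

    using GridLookup=std::map<std::pair<int, int>, Index>;
    
    // Returns the index of the vertex at grid cell (x, y), creating it on first use.
    Index vertex_for_cell(GridLookup& lookup, VertexList& vertices,
      int x, int y, float height)
    {
      auto inserted=lookup.insert({{x, y}, Index(vertices.size())});
      if (inserted.second)
        vertices.push_back({float(x), height, float(y)});
      return inserted.first->second;
    }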

    Explicit types – and when to use them

    Many modern programming languages offer a way to declare variables without an explicit type if the type can be inferred, either dynamically or statically. Many also allow variables to be explicitly defined with a type. For example, Scala and C# let you omit the explicit variable type via the var keyword, but both also allow defining variables with explicit types. I’m coming from the C++ world, where “auto” has been available for this purpose since the relatively recent C++11. However, people are still debating whether you should actually use it.

    Pros

    Herb Sutter popularised the almost-always-auto style. He advocates that using more type inference is good because it is roughly equivalent to programming against interfaces instead of implementations. He says that “Overcommitting to explicit types makes code less generic and more interdependent, and therefore more brittle and limited.” However, he also mentions that you might sometimes want to use explicit types.

    Now what exactly is overcommitting here? When is the right time to use explicit types?

    Cons

    Opponents to implicit typing, many of them experienced veterans, often state that they want the actual type visible in the source code. They don’t want to rely on type inference being right. They want the code to explicitly state what’s going on.

    At first, I figured that was just conservatism in the face of a new “scary” feature that they did not fully understand. After all, IDEs can usually infer the type on-the-fly and you can hover on a variable to let it show you the type.

    For C++, the function signature is a natural boundary where you often insert explicit types, unless you want to commit to the compile-time and physical-dependency cost that comes with templates. Other languages, such as Groovy, do not have this trade-off and let you skip explicit types almost everywhere. After working with Groovy/Grails for a while, where the dominant style seems to be to omit types wherever possible, it dawned on me that the opponents of implicit typing have a point. Not only does the IDE often fail to show me the inferred type (even though it still works far more often than I would have anticipated), but I also found it harder to follow and modify code that did not mention explicit types. Seemingly contrary to Herb Sutter’s argument, that code felt more brittle than I would have liked.

    Middle-ground

    As usual, the truth seems to be somewhere in the middle. I propose the following rule for when to use explicit types:

    • Explicit typing for domain-types
    • Implicit typing everywhere else

    Code using types from the problem domain should be as specific as possible. There’s no need for it to be generic – that is actually counter-productive, as the code model would otherwise be inconsistent with the model of the problem domain. This is also the most important aspect to grok when reading code, so it should be explicit. The type is as important as the action on it.

    On the other hand, for pure-fabrication types that do not represent a concept in the domain, the action is important, while the type is merely a means to achieve this action. Typically, most of the elements from a language’s standard library fall into this category. All your containers, iterators, callables. Their types are merely implementation details: an associative container could be an array, a hash-map or a tree structure. Exchanging it rarely changes the meaning of the code in the problem domain – it just changes its performance characteristics.

    Containers will occasionally contain domain-types in their type. What do you do about those? I think they belong in the “everywhere else” category, but you should take extra care to name the contained type when working with it – for example when declaring the variable of the for-each loop over it, or when inserting something into it. This way, the “collection of domain-type” aspect becomes clear, but the specific container implementation stays implicit – like it should.
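
    A small, hedged C++ sketch of this rule; the Invoice and LineItem domain types, and the billing object, are made up for this illustration:

    // Domain type: spell it out, it carries the meaning of the code.
    Invoice invoice = billing.invoice_for(customer);
    
    // Pure fabrication: the container type is an implementation detail.
    auto line_items = invoice.line_items();
    
    // Name the contained domain type in the for-each loop, keep the container implicit.
    for (LineItem const& item : line_items)
      total += item.amount();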

    What do you think? Is this a useful proposition for your code?

    Recap of the Schneide Dev Brunch 2016-04-10

    If you couldn’t attend the Schneide Dev Brunch on the 10th of April 2016, here is a summary of the main topics.

    Last Sunday, we held another Schneide Dev Brunch, a regular brunch on the second Sunday of every other (even) month, except that all attendees want to talk about software development and various other topics. In case you missed the recap article about the February brunch: it didn’t happen. We all took a break, but are back on track again. So if you bring a software-related topic along with your food, everyone has something to share. Quite a lot of developers attended this time, so we had enough to talk about. As usual, a lot of topics and chatter were exchanged. This recapitulation tries to highlight the main topics of the brunch, but cannot reiterate everything that was spoken. If you were there, you will probably find this list incomplete:

    Why software development conferences?

    We began with a curious question: Why are there even conferences about software development? You can read most of the content for free on the internet and even watch the talks afterwards. So why attend one for a lot of money? We discussed the topic a bit and came up with an analysis:
    There are (at least) four different interested groups in a conference:

    • The organizer or commercial host is mostly interested in a positive revenue. As long as there’s a possibility for some net gain, somebody will host a conference. The actual topic is a secondary matter for them (this might explain some of the weirder conferences out there, like the boring conference).
    • The developers that actually attend a conference are a small subset of all developers. They all have their own personal motives to pay money, invest time and put up with inconveniences to be there in person. Some might rely on the quality filter of a conference board, some look forward to meeting their peers in an annual ritual. There might be those who learn best if somebody talk-feeds them the topic. Whatever the reason, a lot of developers enjoy participating in conferences. And if it happens to be paid by the employer and booked as worktime, who would say no?
    • Then there are the speakers. They have the additional burden of convincing a committee of their topic, preparing a talk of high quality and being able to perform on stage (something that is harder than it looks). The speakers seek reputation and credible proof of expertise. Their resumes will probably profit, too.
    • And at last, the companies that sponsor the conference, maintain a booth with big roll-ups and smiling employees and give their developers a chance to attend are in the game to represent, to recruit and to build their brand. A lot of traditional marketing effort goes into trade fairs, so why not treat the developer market like any other and be present at the developer fairs?

    We can conclude that software development conferences can provide value for every associated stakeholder. As long as this sentence holds true, conferences will be held.
    The question didn’t come out of the blue: one of our attendees got accepted as a speaker at the Karlsruher Entwicklertag 2016 and wanted to learn about the different expectations he needs to address. He will give his talk at the next Dev Brunch to practice the flow and to face the hardest critics. The topic: git internals. We are thrilled!

    Stratagems and strategies

    The next topic contained another talk, not at a conference, but in the context of a “general topics” series at a local university (the Duale Hochschule in Karlsruhe). The talk introduces the concept of the 36 stratagems and of modern strategies to the audience. We talked a bit about the concept itself and found that the list of logical fallacies is somewhat similar. We even found an application of the stratagems in local history (sorry, only a German source found): The Bretten’s Hundle
    The talk itself is this monday, so you’ll need to hurry if you want to attend.

    Psychology of deception

    As so often during the Dev Brunch, one topic led to the other, and we soon talked about morale and ethics. The concept of micro-expressions to reveal the hidden agenda of others came up, as well as the TV series “Lie to Me”, which is inspired by the work of Paul Ekman, a professor of psychology. There is even a commercial training program to improve your skill at “spotting the liar”.

    Games with morale aspects

    Well, we are nerds. While crime investigation is thrilling, there is the even more enthralling topic of games with psychological and moral aspects. We soon exchanged our experiences with games like “Haze” or “Spec Ops: The Line”. But it doesn’t stop at shooter games: you can have similar insights by playing “Papers, Please” (a strong favorite for one of our next Schneide game nights) or “This War of Mine”. You can even try some multiplayer games specifically designed for social insights, like “The Ship: Murder Party”.
    And if you haven’t got much time but still want to learn something about yourself, little games like “60 Seconds!” are a great start.
    This topic led to some ideas for upcoming Schneide game nights in 2016.

    Book review: A tour of C++

    One attendee of the brunch provided a summary of the book “A Tour of C++” by Bjarne Stroustrup, which was recently updated to cover the language possibilities of C++11. In his words, the book is a rather incomplete introduction to the language, with way too many aspects described in way too short a manner. It’s more of a reading list for really grasping the concepts, so it may serve as a source of inspiration. For example, the notion of “move semantics” is explained, but discovering the consequences is left to the reader. The part about template programming was well done, and every chapter has a suitable list of advice in the tradition of “Effective XYZ” at the end. So it’s not a bad book, but too short to be satisfying. It’s like a tourist’s tour around C++11, so the title keeps its promise.

    The left-pad incident

    When we finished the “official” agenda, the topic of the recent left-pad incident came up and left us laughing. We really live in glorious times when the happiness of the (Javascript) world depends on a few lines of code. Not that this couldn’t happen in any other ecosystem.

    Epilogue

    As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The number of attendees makes for a unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei. And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.

    Simple C++11 – Part III – Best friends

    Now that we have got the whole rigid setup of how to create a compile unit and a class declaration out of the way, we can finally start to write some code. What separates simple modern C++ code from the old ways is the degree of abstraction you can use to write your code. Previously, you had to think in memory and instructions. Now, powerful abstractions and language mechanisms help you to think in values and operations, and still get down to the bare metal of the machine when you need to. Here’s my personal set of “best friend” language and library features that help me be as expressive as possible in lower-level application code while still leveraging the raw power of C++.

    std::vector<T>

    With all its simplicity, it is still powerful enough to handle the greater part of all memory management issues. Better yet, it maps excellently to modern hardware and even when used naively, it is often extremely efficient. And in the rare cases when it is not, the performance can usually be easily improved by using std::vector::reserve.
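
    A minimal sketch of that last point; sample_count and read_sample are made-up names for this illustration:

    std::vector<float> samples;
    samples.reserve(sample_count); // one allocation up front instead of repeated growth
    for (std::size_t i = 0; i < sample_count; ++i)
      samples.push_back(read_sample(i));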

    With C++11, you can now even toss vectors around, nest them and return huge ones from functions without any performance problems. Also, initializer lists make it easy to fill them with data.

    std::vector<int> my_special_numbers() {
      return {4, 8, 15, 16, 23, 42};
    }
    

    Such code is no longer a subtle performance problem, but actually encouraged.

    There’s no doubt that whenever you need a container, std::vector should be your first candidate.

    for-each

    Printing a range like that is now easy. No need to even know about the existence of iterators or use counters:

    for (auto&& number : my_special_numbers()) {
      std::cout << number << std::endl;
    }
    

    std::unordered_map<K,V>

    For the rare cases when a flat vector will just not suffice, this neat hash-map will make your life easier. C++11’s initializer syntax makes it a lot cleaner to fill these than before:

    std::unordered_map<std::string, int>
    my_icecream_ratings() {
      return {
        {"vanilla", 3},
        {"chocolate", 9},
        {"strawberry", 8},
        {"raspberry", 7},
        {"lemon", 3}
      };
    }
    

    auto

    And now working with them becomes nice and easy too:

    auto ratings = my_icecream_ratings();
    ratings.insert({"caramel", 2});
    std::cout << "Chocolate was a "
      << ratings["chocolate"];
    

    You can even change the result type to an unordered_multimap or something similar and the code will still work.

    std::shared_ptr<T>

    In a perfect or, should I say, functional world, shared ownership would not be a thing. Pointers or even references would not exist. They just make things a lot more complex than clear, exclusive ownership. But it so happens that when requirements change, this or that object is no longer exclusively owned by that other object. Or the lifetime of an object cannot easily be scoped in the presence of multithreading. When this happens, std::shared_ptr will make your tasks bearable. This is as close as you usually get to completely automatic lifetime management in C++.

    void save_image_in_background(
      std::shared_ptr<image const> raw_image) {
      auto thread = std::thread([raw_image]{
        raw_image->save("raw.png");
      });

      thread.detach();
    }
    

    I like to think of pointers as a necessary evil. Sometimes, the alternative just makes things even more confusing, and when that happens, you at least don’t want manual resource management in the way.

    Of course, std::unique_ptr seems to be a powerful competitor for shared_ptr’s tasks, but in my experience, you very rarely need a single-ownership pointer in application code. Why not use a movable type instead? unique_ptr can be useful as a helper to implement library primitives, but you should rarely encounter one in application-level code.

    Less is more

    Note how many fancy C++11 features did not make my list. For example, lambdas are very useful – and I even used one in my shared_ptr example. But they should be used in moderation. They allow you to define code out of place, to be executed whenever, which makes it harder to reason about.
    Likewise, things like variadic templates are great for library code, but rarely help at the application level.

    This ends my small series on C++ for now. I hope I have shown how concentrating on a few simple features helps you write more maintainable and less obscure C++ code, on a level of abstraction that is not lower than most comparable languages. Do you have other methods to achieve this? Or do you even want to have this? I’d like to hear!

    Simple C++11 – Part II – Class declarations

    In the previous part, I’ve shown my guidelines for setting up compilation units. When writing simple application code with C++11, either classes or free functions should be your main building blocks. Therefore, in this part, I will focus on what to look out for when writing class declarations.

    While templates can be very useful, they do not scale well as the code base gets larger. Metaprogramming or other niche styles have their places, too, but I like to look at those as a means to create language extensions rather than principal implementation tools.

    Avoid inline implementations

    …especially in header files. It can be tempting to write classes solely in the header file. In fact, it has almost become a sign of quality for parts of C++ code to be header-only. But this scales badly in most cases, and evolving such a code-base will result in a dramatic explosion of compile times. Always splitting classes into a declaration and a definition acts as a first-level compile-time firewall and dependency-breaker. Users of your class no longer need to worry about changes in the implementation of the member functions of that class. Note that those changes are often indirect: a change only affects a class that is used in the implementation of your class’ member functions. By splitting the declaration and definition, users of your class do not have to be recompiled.

    But why stop at the compiler? The same argument holds for programmers. If you start to split interface and implementation on this level, you automatically provide ‘reader-firewalls’ as well. By just providing a clean header file, you are giving readers sort of a manual for your class. No need to look at the implementation at all, if the interface is well-defined.

    Inline code definition is also the main reason against excessive use of templates. Yes, they grant a lot of flexibility, but you pay a hefty price which needs to be justified by an enormous reduction of complexity elsewhere. In general, templates are a bit too powerful for their own good, which is why they need extra moderation.

    Always declare implicit functions

    Implicitly declared functions seem convenient, but they have a few implications that are hard to understand. First off, if an implicit function gets generated for your class, it will be generated as inline. This means that the implementation becomes a dependency of all users of your class. This can have very subtle effects such as this:

    #include <string>
    #include <vector>
    
    class Entry;
    class EntryGenerator;
    
    class EntryManager {
    public:
      EntryManager(EntryGenerator& generator);
      int getEntryCount() const;
      std::string getIDForEntry(int index) const;
    private:
      std::vector<Entry> mData;
    };
    

    On the surface, it looks like there should be no dependency (other than the name) on Entry when including this header. But there is!
    The destructor is not declared, so it will get generated – as inline. Because deletion of a vector requires the held type to be complete, any place that needs to be able to destruct an EntryManager also needs to know how to destruct an Entry, which is not intended at all. Remember that there’s a total of six functions that can be implicitly generated! Because of that, there are analogous problems for copy-construction, assignment, move-construction and move-assignment.

    To avoid these problems, either delete the function explicitly in the header, default it in the implementation file, or actually implement it. You rarely need to do the latter, so I advise to default all the ones you need, and delete the rest:

    #include <string>
    #include <vector>
    
    class Entry;
    class EntryGenerator;
    
    class EntryManager {
    public:
      EntryManager(EntryGenerator& generator);
      EntryManager(EntryManager const&)=delete;
      EntryManager& operator=(EntryManager const&)=delete;
      EntryManager(EntryManager&& rhs);
      EntryManager& operator=(EntryManager&& rhs);
      ~EntryManager();
      int getEntryCount() const;
      std::string getIDForEntry(int index) const;
    private:
      std::vector<Entry> mData;
    };
    

    And somewhere in the implementation file:

    EntryManager::EntryManager(EntryManager&& rhs) = default;
    EntryManager::~EntryManager() = default;
    EntryManager& EntryManager::operator=(EntryManager&& rhs) = default;
    

    This has another nice side effect: the vector template gets instantiated into that one object file and does not “bloat” all use-sites.

    Exactly one public section and one private data section per class

    …starting with the public section. This is where you address the next programmer that has to read your class. And it should be the only place for him to look.

    I avoid private member functions because they cannot be tested easily and can add hidden compile-time dependencies to a project. Why should a user of your class recompile if you change an implementation detail? For small and trivial implementation helpers, the unnamed-namespace in the implementation file is a much better place. If those helpers become larger or more complex, it is a better idea to implement them in a collaborating class, which can be tested and reused.

    Protected member functions split your interface into two parts, one exclusively for derived classes and one for everyone (including derived classes). This is very rarely needed, and in almost all of those cases, a separate interface will scale better (although it is slightly harder to implement).

    Either an interface or an implementation

    So far, I have left inheritance out of the picture and only talked about concrete classes. Inheritance is actually rarely needed; composition often suffices. But if it is needed, make sure that a class is either concrete and final (an implementation), or has a complete and minimal set of pure-virtual member functions (an interface). This will result in shallow hierarchies and easily understood interfaces. Remember that inheritance is not a tool for sharing code between the classes you implement, but for the code using those classes – i.e. where the Liskov Substitution Principle holds.

    Now it gets really easy to implement new classes in the hierarchy: just implement all the functions in the interface. No more questioning whether to leave the default behaviour or to override it. You will also automatically tend towards a clearer separation of components – things that need to be polymorphic move to the interface, other functionality merely uses it.

    This pattern is useful even when polymorphism is not needed. Such small interfaces, devoid of any implementation detail, can act as another compiler firewall. Collaborators can work with just the interface and do not have to be recompiled when the implementation changes. Also, the interface can be implemented for mock or fake objects in testing.
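
    A small, hedged sketch of this rule; the Renderer and GlRenderer names are made up for this illustration:

    // Interface: a complete and minimal set of pure-virtual member functions.
    class Renderer
    {
    public:
      virtual ~Renderer() = default;
      virtual void drawCircle(float x, float y, float radius) = 0;
    };
    
    // Implementation: concrete and final, implementing every function of the interface.
    class GlRenderer final : public Renderer
    {
    public:
      void drawCircle(float x, float y, float radius) override;
    };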

    Conclusion

    This concludes the second part of the series. I originally intended it to be about how to write a whole class, but that would have been too much to digest for one post. I am well aware that some of these guidelines can stir quite the controversy in the C++ community. For example, declaring the implicit functions seems to be in conflict with the recently popular rule of zero. Scott Meyers had similar concerns, but does not quite touch the inline aspect.

    For me personally, these guidelines have helped tremendously, especially when scaling to bigger code-bases. But as before, I am curious what others are thinking about this!

    Simple C++11 – Part I – Unit Structure

    C++ has long had the stigma of an overly complex and unproductive language. Lately, with the advent of C++11, things have brightened a bit, but there are still a lot of misconceptions about the language. I think this is mostly because C++ has been taught the wrong way. This series aims to show my, hopefully somewhat simpler, way of using C++11.

    Since it is typically the first thing I do when starting a new project, I will start with how I set up a new compile unit, i.e. a header and implementation file pair.

    Note that I will try not to focus on a specific paradigm, such as object-oriented or imperative programming. This structure seems to work well for all kinds of paradigms. But without much further ado, here’s the header file for my imaginary “MyUnit” unit:

    MyUnit.hpp

    #pragma once
    
    #include <vector>
    #include "MyStuff.hpp"
    
    namespace MyModule { namespace MyUnit {
    
    /** Does something only a good bar could.
    */
    std::vector<float> bar(int fooCount);
    
    /** Foo is an integral part of any program.
        Be sure to call it frequently.
    */
    void foo(MyStuff::BestType somethingGood);
    
    }}
    

    I prefer the .hpp file ending for headers. While I’m perfectly fine with .h, I think it is helpful to differentiate pure C headers from C++ headers.

    #pragma once

    I’m using #pragma once here instead of include guards. It is not an official part of the standard, but all the big compilers (Visual C++, g++ and clang) support it, making it a de-facto standard. Unlike include guards, you only have to add one line, which says exactly what you want to achieve with it. You do not have to find a unique identifier for your include guard that will most certainly break if you rename the file/unit. It’s more readable, more resilient to change and easier to set up.
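
    For comparison, a minimal sketch of the classic include guard this replaces; the guard name is made up and illustrates exactly the renaming problem mentioned above:

    #ifndef MYMODULE_MYUNIT_HPP
    #define MYMODULE_MYUNIT_HPP
    
    // ... header contents ...
    
    #endif // MYMODULE_MYUNIT_HPP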

    Namespaces

    I like to have all the contents of a unit in a single namespace. The actual structure of the namespaces – i.e. per unit or per module or something else entirely depends on the specifics of the project, but filling more than one namespace is a guarantee for chaos. It’s usually a sign that the unit should be broken up into smaller pieces. An exception to this would be the infamous “detail” namespace, as seen in many of the Boost libraries. In that case, the namespace is not used to structure the API, but to explicitly omit things from the API that have to be visible for technical reasons.

    Documentation

    Documentation goes into the header, not into the implementation. The header describes the API, not only to the compiler, but also to humans. It is by no means an implementation detail, but part of the seam that isolates it from the rest of the code. Note that this part of the documentation concerns the API contract only, never the implementation. That part goes into the .cpp file.

    But now to the implementation file:

    MyUnit.cpp

    #include "MyUnit.hpp"
    
    #include "CoolFunctionality.hpp"
    
    using namespace MyModule;
    using namespace MyUnit;
    
    namespace {
    
    int helperFunction(float rhs)
    {
      /* ... */
    }
    
    }// namespace
    
    std::vector<float> MyUnit::bar(int fooCount)
    {
      /* ... */
    }
    
    void MyUnit::foo(MyStuff::BestType somethingGood)
    {
      /* ... */
    }
    
    

    Own #include first

    The only rule I have for includes is that the unit’s own include always comes first. This is to test whether the header is self-sufficient, i.e. that it will compile without being in the context of other headers or, even worse, code from an implementation file. Some people like to order the rest of their includes according to their “origin”, e.g. sections for system headers or library headers. I think imposing any extra order here is not needed. If anything, I prefer not to waste time sorting include directives and just append an include when I need it.

    Using namespace

    I choose using-directives of my unit’s namespaces over explicitly accessing the namespaces each time. Unlike the headers, the implementation file lives in a locally defined context. Therefore, it is not a problem to use a very specific view onto the unit. In fact, it would be a problem to be overly generic. The same argument also holds for other “local” modules that this unit is only using, as long as there are no collisions. I avoid using namespaces from external libraries to mark the library boundary (such as std, boost etc.).

    Unnamed namespace

    The unnamed namespace contains all the implementation helpers specific to this unit. It is quite common for this to contain a lot of the “meat” of an actual unit, while the unit’s visible functions merely wrap and canonize the functionality implemented here. I try to keep only one unnamed namespace in each file, to have a clear separation of what is supposed to be visible to the outside – and what is not.

    Visible implementation

    The implementation of the visible API of the module is the most obvious part of the .cpp file. For consistency reasons, the order of the functions should be the same as in the header.

    I’d advise against implementing in a file-wide open namespace. That means balancing an unnecessary pair of braces over the whole implementation file. Also, inside an open namespace you can not only define functions and types, but also declare new ones – which can lead to a function further down in the implementation file seeing a different namespace than one above it (see the sketch below).
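
    A hedged sketch of that last point, reusing the MyUnit example from above; the helper() declaration is made up for this illustration:

    namespace MyModule { namespace MyUnit {
    
    std::vector<float> bar(int fooCount)
    {
      /* ... */ // helper() is not visible here...
    }
    
    void helper(); // ...because this declaration adds a new member to MyUnit mid-file...
    
    void foo(MyStuff::BestType somethingGood)
    {
      helper(); // ...but it is visible here: foo and bar see different namespaces.
    }
    
    }}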

    Conclusion

    This concludes the first part. I’ve played with the thought of using a 3-piece setup instead, extending the header/implementation with a unit-test file, but have not gathered any sharable experience yet. This setup, however, has worked for me for a long time and with many different projects. Have you had similar – or completely different – setups that worked for you? Do tell!