Simple C++11 – Part II – Class declarations

In the previous part, I showed my guidelines for setting up compilation units. When writing simple application code with C++11, either classes or free functions should be your main building blocks. Therefore, in this part, I will focus on what to look out for when writing class declarations.

While templates can be very useful, they do not scale well as the code base gets larger. Metaprogramming or other niche styles have their places, too, but I like to look at those as a means to create language extensions rather than principal implementation tools.

Avoid inline implementations

…especially in header files. It can be tempting to write classes solely in the header file. In fact, being header-only has almost become a sign of quality for C++ libraries. But this scales badly in most cases, and evolving such a code-base will result in a dramatic explosion of compile times. Always splitting classes into a declaration and a definition acts as a first-level compile firewall and dependency-breaker. Users of your class no longer need to worry about changes in the implementation of its member functions. Note that those changes are often indirect: the change happens in another class that is only used inside the implementation of your class’ member functions. With the declaration and definition split, users of your class do not have to be recompiled in either case.
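A minimal sketch of that split, assuming a hypothetical Counter class (the name and its members are made up for illustration). Both files are shown in one listing so that it stays self-contained:

```cpp
#include <cassert>

// --- Counter.hpp (hypothetical): declaration only ---
class Counter {
public:
  explicit Counter(int start);
  void increment();
  int value() const;
private:
  int mValue;
};

// --- Counter.cpp (hypothetical): definitions, invisible to users of the header ---
Counter::Counter(int start)
: mValue(start)
{
  // Assumption of this sketch: counting starts at a non-negative value.
  assert(start >= 0);
}

void Counter::increment()
{
  ++mValue;
}

int Counter::value() const
{
  return mValue;
}
```

Changing the body of increment(), or any header that only Counter.cpp includes, leaves Counter.hpp untouched, so code that includes the header does not need to be recompiled.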

But why stop at the compiler? The same argument holds for programmers. If you start to split interface and implementation on this level, you automatically provide ‘reader-firewalls’ as well. By just providing a clean header file, you are giving readers a sort of manual for your class. If the interface is well-defined, there is no need to look at the implementation at all.

The need for inline definitions is also my main argument against excessive use of templates. Yes, they grant a lot of flexibility, but you pay a hefty price which needs to be justified by an enormous reduction of complexity elsewhere. In general, templates are a bit too powerful for their own good, which is why they need extra moderation.

Always declare implicit functions

Implicitly declared functions seem comfortable, but they have a few implications that are hard to understand. First off, if an implicit function gets generated for your class, it will be generated as inline. This means that its implementation becomes a dependency of all users of your class. This can have very subtle effects such as this:

#include <string>
#include <vector>
class Entry;
class EntryGenerator;

class EntryManager {
public:
  EntryManager(EntryGenerator& generator);
  int getEntryCount() const;
  std::string getIDForEntry(int index) const;
private:
  std::vector<Entry> mData;
};

On the surface, it looks like there should be no dependency (other than the name) on Entry when including this header. But there is!
The destructor is not declared, so it will get generated – as inline. Because destroying a vector requires the held type to be complete, any place that needs to be able to destruct an EntryManager also needs to know how to destruct an Entry, which is not intended at all. Remember there’s a total of six functions that can be implicitly generated! Because of that, there are analogous problems for copy-construction, copy-assignment, move-construction and move-assignment.

To avoid these problems, either delete the function explicitly in the header, default it in the implementation file, or actually implement it. You rarely need to do the last, so I advise defaulting all the ones you need and deleting the rest:

#include <string>
#include <vector>
class Entry;
class EntryGenerator;

class EntryManager {
public:
  EntryManager(EntryGenerator& generator);
  EntryManager(EntryManager const&)=delete;
  EntryManager& operator=(EntryManager const&)=delete;
  EntryManager(EntryManager&& rhs);
  EntryManager& operator=(EntryManager&& rhs);
  ~EntryManager();
  int getEntryCount() const;
  std::string getIDForEntry(int index) const;
private:
  std::vector<Entry> mData;
};

And somewhere in the implementation file, where Entry is a complete type:

EntryManager::EntryManager(EntryManager&& rhs) = default;
EntryManager::~EntryManager() = default;
EntryManager& EntryManager::operator=(EntryManager&& rhs) = default;

This has another nice side effect: the vector’s member functions needed by these special members get instantiated in that one object file and do not “bloat” every use site.
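The same technique can be sketched as a single self-contained listing. Note the swap: it holds the forward-declared type in a std::unique_ptr instead of a std::vector, because unique_ptr is specified to accept an incomplete type in C++11, whereas std::vector only gained that guarantee in C++17 (common standard libraries tend to accept it earlier in practice). EntryHolder and the contents of Entry are hypothetical.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// --- EntryHolder.hpp (hypothetical) ---
class Entry; // forward declaration; users of the header never see the definition

class EntryHolder {
public:
  EntryHolder();
  EntryHolder(EntryHolder const&) = delete;
  EntryHolder& operator=(EntryHolder const&) = delete;
  EntryHolder(EntryHolder&& rhs);
  EntryHolder& operator=(EntryHolder&& rhs);
  ~EntryHolder();
  int getID() const;
private:
  std::unique_ptr<Entry> mEntry; // fine with an incomplete Entry
};

// --- EntryHolder.cpp (hypothetical): Entry is complete from here on ---
class Entry {
public:
  int id = 42; // placeholder content for this sketch
};

EntryHolder::EntryHolder() : mEntry(new Entry()) {}
EntryHolder::EntryHolder(EntryHolder&& rhs) = default;
EntryHolder& EntryHolder::operator=(EntryHolder&& rhs) = default;
EntryHolder::~EntryHolder() = default;

int EntryHolder::getID() const
{
  // A moved-from holder no longer owns an entry.
  assert(mEntry);
  return mEntry->id;
}
```

If the destructor were left undeclared, it would be generated inline wherever an EntryHolder is destroyed, and every such place would need the full definition of Entry.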

Exactly one public function section and one private data section per class

…starting with the public section. This is where you address the next programmer that has to read your class. And it should be the only place for them to look.

I avoid private member functions because they cannot be tested easily and can add hidden compile-time dependencies to a project. Why should a user of your class recompile if you change an implementation detail? For small and trivial implementation helpers, the unnamed namespace in the implementation file is a much better place. If those helpers become larger or more complex, it is a better idea to implement them in a collaborating class, which can be tested and reused.
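As a rough sketch of this layout, assume a hypothetical Polyline class (Point, Polyline and segmentLength are invented names). The class has one public section with functions and one private section with data; the helper that a private member function would normally provide sits in the unnamed namespace of the implementation file instead:

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// --- Polyline.hpp (hypothetical) ---
struct Point {
  double x;
  double y;
};

class Polyline {
public:
  explicit Polyline(std::vector<Point> points);
  double length() const;
private:
  std::vector<Point> mPoints; // data only, no private member functions
};

// --- Polyline.cpp (hypothetical) ---
namespace {

// Implementation helper: changing or removing it never touches the header.
double segmentLength(Point const& a, Point const& b)
{
  double const dx = b.x - a.x;
  double const dy = b.y - a.y;
  return std::sqrt(dx * dx + dy * dy);
}

} // namespace

Polyline::Polyline(std::vector<Point> points)
: mPoints(std::move(points))
{
  // Assumption of this sketch: a polyline has at least one point.
  assert(!mPoints.empty());
}

double Polyline::length() const
{
  double result = 0.0;
  for (std::size_t i = 1; i < mPoints.size(); ++i)
    result += segmentLength(mPoints[i - 1], mPoints[i]);
  return result;
}
```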

Protected member functions split your interface into two parts, one exclusively for derived classes and one for everyone (including derived classes). This is very rarely needed, and in almost all of those cases, a separate interface will scale better (although it is slightly harder to implement).

Either an interface or an implementation

So far, I have left inheritance out of the picture and only talked about concrete classes. Inheritance is actually rarely needed; composition often suffices. But if it is needed, make sure that a class is either concrete and final (an implementation), or has a complete and minimal set of pure-virtual member functions (an interface). This will result in shallow hierarchies and easily understood interfaces. Remember that inheritance is not a tool for sharing code between the classes you implement, but for the code using those classes – i.e. where the Liskov Substitution Principle holds.

Now it gets really easy to implement new classes in the hierarchy: just implement all the functions in the interface. No more questioning whether to leave the default behaviour or override. You will also automatically tend towards clearer separation of components – things that need to be polymorphic move to the interface, other functionality merely uses it.
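A minimal sketch of such a two-level hierarchy, using the hypothetical names Shape, Rectangle and totalArea. The interface’s defaulted destructor is kept inline here only for brevity:

```cpp
#include <cassert>

// Interface: a complete and minimal set of pure-virtual member functions.
class Shape {
public:
  virtual ~Shape() = default;
  virtual double area() const = 0;
};

// Implementation: concrete and final, implements everything in the interface.
class Rectangle final : public Shape {
public:
  Rectangle(double width, double height);
  double area() const override;
private:
  double mWidth;
  double mHeight;
};

Rectangle::Rectangle(double width, double height)
: mWidth(width), mHeight(height)
{
  // Assumption of this sketch: side lengths are never negative.
  assert(width >= 0.0 && height >= 0.0);
}

double Rectangle::area() const
{
  return mWidth * mHeight;
}

// Code that uses the hierarchy depends on the interface only.
double totalArea(Shape const& first, Shape const& second)
{
  return first.area() + second.area();
}
```

Because Rectangle is final and Shape has no default behaviour, there is nothing to decide when adding another shape: it implements area() and that is all.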

This pattern is useful even when polymorphism is not needed. Such small interfaces devoid of any implementation detail can act as another compiler firewall. Collaborators can work with just the interface and do not have to be recompiled when the implementation changes. Also, the interface can be implemented by mock or fake objects in testing.
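To illustrate the testing aspect, here is a small sketch with an invented Clock interface, a collaborator isExpired that only knows the interface, and a FakeClock that a test can set to any time it likes. None of these names come from a real library:

```cpp
#include <cassert>

// Interface the collaborator depends on (hypothetical).
class Clock {
public:
  virtual ~Clock() = default;
  virtual long secondsSinceStart() const = 0;
};

// Collaborator: only needs the interface, never a concrete clock.
bool isExpired(Clock const& clock, long timeoutSeconds)
{
  assert(timeoutSeconds >= 0);
  return clock.secondsSinceStart() >= timeoutSeconds;
}

// Fake for tests: reports whatever time the test dictates.
class FakeClock final : public Clock {
public:
  explicit FakeClock(long seconds);
  long secondsSinceStart() const override;
private:
  long mSeconds;
};

FakeClock::FakeClock(long seconds)
: mSeconds(seconds)
{
}

long FakeClock::secondsSinceStart() const
{
  return mSeconds;
}
```

A production implementation backed by a real timer could be swapped in later without touching or recompiling isExpired.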

Conclusion

This concludes the second part of the series. I originally intended it to be about how to write a whole class, but that would have been too much to digest for one post. I am well aware that some of these guidelines can stir quite the controversy in the C++ community. For example, declaring the implicit functions seems to be in conflict with the recently popular rule of zero. Scott Meyers voiced similar concerns, but did not quite touch on the inline aspect.

For me personally, these guidelines have helped tremendously, especially when scaling to bigger code-bases. But as before, I am curious what others are thinking about this!

Simple C++11 – Part I – Unit Structure

C++ has long had the stigma of an overly complex and unproductive language. Lately, with the advent of C++11, things have brightened a bit, but there are still a lot of misconceptions about the language. I think this is mostly because C++ was taught the wrong way. This series aims to show my, hopefully somewhat simpler, way of using C++11.

Since it is typically the first thing I do when starting a new project, I will start with how I set up a new compilation unit, i.e. a header and implementation file pair.

Note that I will try not to focus on a specific C++11 paradigm, such as object-oriented or imperative. This structure seems to work well for all kinds of paradigms. But without much further ado, here’s the header file for my imaginary “MyUnit” unit:

MyUnit.hpp

#pragma once

#include <vector>
#include "MyStuff.hpp"

namespace MyModule { namespace MyUnit {

/** Does something only a good bar could.
*/
std::vector<float> bar(int fooCount);

/** Foo is an integral part of any program.
    Be sure to call it frequently.
*/
void foo(MyStuff::BestType somethingGood);

}}

I prefer the .hpp file ending for headers. While I’m perfectly fine with .h, I think it is helpful to differentiate pure C headers from C++ headers.

#pragma once

I’m using #pragma once here instead of include guards. It is not an official part of the standard, but all the big compilers (Visual C++, g++ and clang) support it, making it a de-facto standard. Unlike include guards, you have to add only one line, which says exactly what you want to achieve with it. You do not have to find a unique identifier for your include guard that will most certainly break if you rename the file/unit. It’s more readable, more resilient to change and easier to set up.
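For comparison, this is roughly what a header of the same unit needs with a classic include guard. The macro name MYMODULE_MYUNIT_HPP_INCLUDED and the constant defaultFooCount are made up for this sketch:

```cpp
#ifndef MYMODULE_MYUNIT_HPP_INCLUDED
#define MYMODULE_MYUNIT_HPP_INCLUDED

namespace MyModule { namespace MyUnit {

/** Number of foos a fresh unit starts with (hypothetical constant).
*/
constexpr int defaultFooCount = 3;

}}

#endif // MYMODULE_MYUNIT_HPP_INCLUDED
```

The guard takes three lines in two places, and its identifier has to be kept unique and in sync with the file name by hand.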

Namespaces

I like to have all the contents of a unit in a single namespace. The actual structure of the namespaces – i.e. per unit, per module or something else entirely – depends on the specifics of the project, but filling more than one namespace is a guarantee for chaos. It’s usually a sign that the unit should be broken up into smaller pieces. An exception to this would be the infamous “detail” namespace, as seen in many of the Boost libraries. In that case, the namespace is not used to structure the API, but to explicitly omit things from the API that have to be visible for technical reasons.
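A small sketch of that exception, with the invented names atClamped and detail::clampIndex. The helper has to be visible in the header only because the function template using it is instantiated at the call site:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

namespace MyModule { namespace MyUnit {

namespace detail {

// Not part of the API; visible in the header for technical reasons only.
inline std::size_t clampIndex(std::size_t index, std::size_t size)
{
  assert(size > 0);
  return index < size ? index : size - 1;
}

} // namespace detail

/** Returns the element at index, or the last element if index is too large.
    The vector must not be empty.
*/
template <typename T>
T const& atClamped(std::vector<T> const& values, std::size_t index)
{
  return values[detail::clampIndex(index, values.size())];
}

}}
```

Everything a user is meant to call still lives directly in MyModule::MyUnit; the nested namespace merely signals what to leave alone.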

Documentation

Documentation goes into the header, not into the implementation. The header describes the API, not only to the compiler, but also to humans. The documentation is by no means an implementation detail, but part of the seam that isolates the unit from the rest of the code. Note that this part of the documentation concerns the API contract only, never the implementation. Comments about the implementation go into the .cpp file.

But now to the implementation file:

MyUnit.cpp

#include "MyUnit.hpp"

#include "CoolFunctionality.hpp"

using namespace MyModule;
using namespace MyUnit;

namespace {

int helperFunction(float rhs)
{
  /* ... */
}

}// namespace

std::vector<float> MyUnit::bar(int fooCount)
{
  /* ... */
}

void MyUnit::foo(MyStuff::BestType somethingGood)
{
  /* ... */
}

Own #include first

The only rule I have for includes is that the unit’s own include is always the first. This is to test whether the header is self-sufficient, i.e. that it will compile without being in the context of other headers or, even worse, code from an implementation file. Some people like to order the rest of their includes according to their “origin”, e.g. sections for system headers or library headers. I think imposing any extra order here is not needed. If anything, I prefer not to waste time sorting include directives and just append an include when I need it.

Using namespace

I choose using-directives of my unit’s namespaces over explicitly accessing the namespaces each time. Unlike the headers, the implementation file lives in a locally defined context. Therefore, it is not a problem to use a very specific view onto the unit. In fact, it would be a problem to be overly generic. The same argument also holds for other “local” modules that this unit is only using, as long as there are no collisions. I avoid using-directives for namespaces from external libraries (such as std, boost etc.), so that the library boundary stays visible.
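Reduced to a single function, the pattern can be sketched like this. Header and implementation are shown in one listing, and the body of bar is a placeholder invented for the example:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// --- MyUnit.hpp, reduced to one declaration ---
namespace MyModule { namespace MyUnit {

std::vector<float> bar(int fooCount);

}}

// --- MyUnit.cpp ---
using namespace MyModule; // makes the name MyUnit visible at file scope
using namespace MyUnit;   // makes the unit's own names visible as well

// Qualified definition: MyUnit is found through the first using-directive.
std::vector<float> MyUnit::bar(int fooCount)
{
  assert(fooCount >= 0);
  // Placeholder behaviour for this sketch: fooCount zeros.
  return std::vector<float>(static_cast<std::size_t>(fooCount), 0.0f);
}
```

Names from the standard library stay fully qualified, which keeps the boundary to external code visible at every use.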

Unnamed namespace

The unnamed namespace contains all the implementation helpers specific to this unit. It is quite common for this to contain a lot of the “meat” of an actual unit, while the unit’s visible functions merely wrap and canonize the functionality implemented here. I try to keep only one unnamed namespace in each file, to have a clear separation of what is supposed to be visible to the outside – and what is not.
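A sketch of that division of labour, assuming a hypothetical Statistics unit: the helper in the unnamed namespace does the actual work under a simplifying precondition, while the visible function only handles the edge case and delegates.

```cpp
#include <cassert>
#include <vector>

// --- Statistics.hpp (hypothetical) ---
namespace MyModule { namespace Statistics {

/** Returns the arithmetic mean of values, or 0 for an empty input.
*/
float mean(std::vector<float> const& values);

}}

// --- Statistics.cpp (hypothetical) ---
using namespace MyModule;

namespace {

// The "meat": relies on a non-empty input, which the visible function guarantees.
float meanOfNonEmpty(std::vector<float> const& values)
{
  assert(!values.empty());
  float sum = 0.0f;
  for (float value : values)
    sum += value;
  return sum / static_cast<float>(values.size());
}

} // namespace

// The visible function handles the edge case and delegates the rest.
float Statistics::mean(std::vector<float> const& values)
{
  return values.empty() ? 0.0f : meanOfNonEmpty(values);
}
```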

Visible implementation

The implementation of the visible API of the module is the most obvious part of the .cpp file. For consistency reasons, the order of the functions should be the same as in the header.

I’d advise against implementing inside a namespace that is opened over the whole file. That means balancing an unnecessary pair of braces across the whole implementation file. Also, inside an open namespace you can not only define functions and types, but also declare new ones – so a function further down in the implementation file can see different namespace contents than one before it.
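One practical consequence can be sketched with a hypothetical Scaling unit: a qualified definition has to match a declaration from the header, so a mistyped signature is caught at compile time rather than at link time.

```cpp
#include <cassert>

// --- Scaling.hpp (hypothetical) ---
namespace MyModule { namespace Scaling {

int percentOf(int value, int percent);

}}

// --- Scaling.cpp (hypothetical) ---
using namespace MyModule;

// A qualified definition must match a declaration from the header. Writing
// `int Scaling::percentOf(long value, int percent)` here would be a compile
// error, because no such function was declared in the namespace.
int Scaling::percentOf(int value, int percent)
{
  assert(percent >= 0 && percent <= 100);
  return value * percent / 100;
}

// Inside a reopened `namespace MyModule { namespace Scaling { ... }}` block,
// the same typo would compile: it would silently declare and define a new
// overload and leave percentOf(int, int) undefined until link time.
```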

Conclusion

This concludes the first part. I’ve played with the thought of using a 3-piece setup instead, extending the header/implementation with a unit-test file, but have not gathered any sharable experience yet. This setup, however, has worked for me for a long time and with many different projects. Have you had similar – or completely different – setups that worked for you? Do tell!