Debug Output

Crafting debug output from std::istream data can be dangerous!

Writing a blog post sometimes can be useful to get some face-palm kind of programming error out of one’s system.

Putting such an error into written words then serves a couple of purposes:

  • it helps oneself remembering
  • it helps others who read it not to do the same thing
  • it serves as error log for future reference

So here it comes:

In one project we use JSON to serialize objects in order to send them over HTTP (we use the very nice JSON Spirit library, btw).

For each object we have serialize/deserialize methods which do the heavy lifting. After having developed a new deserialize method I wanted to test it together with the HTTP request handling. Using curl for this I issued a command like this:

curl -X PUT http://localhost:30222/some/url -d @datafile

This command issues a PUT request to the given URL and uses data in ./datafile, which contains the JSON, as request data.

The request came through but the deserializer wouldn’t do its work. WTF? Let’s see what goes on – let’s put some debug output in:

MyObject MyObjectSerializer::deserialize(std::istream& jsonIn)
{
   // debug output starts here
   std::string stringToDeserialize;
   Poco::StreamCopier::copyToString(jsonIn, stringToDeserialize);
   std::cout << "The String: " << stringToDeserialize << std::endl;
   // debug output ends here

   json_spirit::Value value;
   json_spirit::read(jsonIn, value);
   ...
}

I’ll give you some time to spot the bug…. 3..2..1..got it? Please check Poco::StreamCopier documentation if you are not familiar with POCO libraries.
What’s particularly misleading is the “Copier” part of the name StreamCopier, because it does not exactly copy the bytes from the stream into the string – it moves them. This means that after the debug output code, the istream is empty.

Unfortunately, I did not immediately recognize the change in the error outputs of the JSON parser. This might have given me a hint to the real problem. Instead, during the next half hour I searched for errors in the JSON I was sending.

When I finally realized it …

Embedding Python into C++

In one of our projects the requirement to run small user-defined Python scripts inside a C++ application arose. Thanks to Python’s C-API, nicknamed CPython, embedding (really) simple scripts is pretty straightforward:

Py_Initialize();
const char* pythonScript = "print 'Hello, world!'\n";
int result = PyRun_SimpleString(pythonScript);
Py_Finalize();

Yet, this approach does neither allow running extensive scripts, nor does it provide a way to exchange data between the application and the script. The result of this operation merely indicates whether the script was executed properly by returning 0, or -1 otherwise, e.g. if an exception was raised. To overcome these limitations, CPython offers another, more versatile way to execute scripts:

PyObject* PyRun_String(const char* pythonScript, int startToken, PyObject* globalDictionary, PyObject* localDictionary)

Besides the actual script, this function requires a start token, which should be set to Py_file_input for larger scripts, and two dictionaries containing the exchanged data:

PyObject* main = PyImport_AddModule("__main__");
PyObject* globalDictionary = PyModule_GetDict(main);
PyObject* localDictionary = PyDict_New();
PyObject* result = PyRun_String(pythonScript, Py_file_input, globalDictionary, localDictionary);

Communication between the application and the script is done by inserting entries to one of the dictionaries prior to running the script:

PyObject* value = PyString_FromString("some value");
PyDict_SetItemString(localDict, "someKey", value);

Doing so makes the variable “someKey” and its value available inside the Python script. Accessing the produced data after running the Python script is just as easy:

char* result = String_AsString(PyDict_GetItemString(localDict, "someKey"));

If a variable is created inside the Python script, this variable also becomes accessible from the application through PyDict_GetItemString (or PyDict_GetItem), even if it was not entered into the dictionary beforehand.

The following example shows the complete process of defining variables as dictionary entries, running a small script and retrieving the produced result in the C++ application:

Py_Initialize();
//create the dictionaries as shown above
const char* pythonScript = "result = multiplicand * multiplier\n";
PyDict_SetItemString(localDictionary, "multiplicand", PyInt_FromLong(2));
PyDict_SetItemString(localDictionary, "multiplier", PyInt_FromLong(5));
PyRun_String(pythonScript, Py_file_input, globalDictionary, localDictionary);
long result = PyInt_AsLong(PyDict_GetItemString(localDictionary, "result"));
cout << result << endl;
Py_Finalize();

The Great Divide

There is a great divide in the C++ developer community between “normal” developers that use only basic language features and very savvy ones that know every little corner of the language. The upcoming C++ standard deepens this divide even more.

Recently, I had two very contrary conversations about C++ which show very good the great divide in C++ developer community.

The first was with the technical lead of a team that writes and maintains drivers and control software for a scientific institution. These systems run 24/7 and have to be very stable and reliable.

I had discovered that they use a self-written toolbox library containing classes like SharedPtr<T>, and Thread and suspected immediately a classical NIH-syndrome. I asked him about it and why they don’t use well established libraries like boost. He told me that they indeed are only using the standard library and their own toolbox.

The reason he gave was that despite boost being most elegant C++ library out there, it required very good knowledge about the most advanced C++ mechanisms, and that his team was not on this level … I should probably mention here that his team does a very good job in running their systems. So, apparently, they get along very well with using only basic  C++ features and no “fancy” boost stuff.

The other conversation was with a friend of mine with whom I chat regularly about all sorts of programming related stuff. This time the topic was the upcoming  C++ standard and all its  exciting new stuff. He has lot’s of experience with C++ and knows the language very well. But even someone like him had a hard time to really understand what rvalue references are all about. I had not looked at them in detail, yet,  so he tried to explain them to me. During our discussion I was thinking about if teams like the one introduced before will ever use rvalue references, or other C++0X stuff in their production code, other than maybe the auto keyword for type inference, or constructor delegation.

Honestly, I don’t think stuff like  rvalue refs will become a feature that is often used by “standard industry” teams, because it adds a lot of complexity to an already complex language. Even easy-to-get stuff like the new keywords override, constexpr and final, or additional initialization means like std::initializer_list<T> will take a lot of time to get used regularly by most C++ teams.

Instead, most of C++0X will greatly increase the divide between “normal” C++ developers who get along well with using only basic language features, and experts that know every little corner of the language. And this is simply because there is so much more to know with C++0X.

But don’t let us paint this picture overly black. I, for one, am looking forward to the new standard and I will certainly spread the word about the new possibilities and features in every C++ team I work with.

Bogus Error Messages with Qt .ui Files

Name your Qt Forms correctly and you will save lots of debugging time.

Bogus errors together with their messages can have a large number of reasons – full hard drives being one of the classics. When it comes to programming and especially C++, the possibilities for cryptic, meaningless and misleading error message are infinite.

A nice one bit us at one of our customers the other day. The message was something like

QLayout can only have instances of QWidget as parent

and it appeared as standard error output during program start-up. Needless to say that the whole thing crashed with a segmentation fault after that. The only change that was made was a header file that was added to the Qt files list in the CMakeLists.txt file.  The Qt class in this header file was just in its beginnings and had not yet any QLayouts, or QWidgets in it. Even the  C++ standard measure of cleaning and recompiling everything didn’t help.

So how is it possible that an additional Qt header file that has not references to QLayout and QWidget can cause such an error message?

As all of you experienced C/C++ developers know, for the compiler, a code file is not only the stuff that it contains directly but also what is #included! The offending header file included a generated ui description file which you get when you design your windows – or Forms in Qt terminology – with the Qt designer and use the Compile-Time-Form-Processing-approach to incorporate the form into the code base.

But how can that effect anything?

The Qt designer saves the forms into .ui files. From that, the so-called User Interface Compiler (uic) generates a header file containing a C++ class together with inlined code that creates the form. Form components like line edits, or push buttons are generated as instance attributes. The name of the class is generated from the name of the form. You can even use namespaces.  By naming it e.g. myproject::BestFormEverDesigned the generated class is named BestFormEverDesigned   is put into namespace myproject.

So far, so nice, handy and easy to use.

When you create a new form in qt designer, the default name is Form. Maybe you can guess already where this leads to…

Two forms for which the respective developers forgot to set a proper name, existed in the same sub project and had been compiled and linked into the same shared library. The compiler has no chance to detect this, because it sees only one

class Form
{

at a time. The linker happily links all of this together since it thinks that all Forms are created equal. And then at run-time … Boom!

I will have to look into a little Jenkins helper which breaks the build when a Form form is checked in…

Looping in C++

What is “the best” way to loop over collections in C++?

One recurring discussion point in one of our customers C++ project team is the following:

What is “the best” way to loop over collections?

In a typical scenario there is a standard container like std::list, or some equivalent collection, and the task is to do something with every element in the collection. The straight forward way would be like this:

std::list<std::string> mylist;
for (std::list<std::string>::iterator iter = mylist.begin(); iter != mylist.end(); iter++)
{
   ...
}

This code is correct and readable. But my guess is that most of you instantly see at least two possible improvements:

  1. the call to mylist.end() occurs in every loop an can be expensive e.g. in case of long std::lists
  2. iter++ creates one unnecessary intermediate object on the stack

So this

for (std::list<std::string>::iterator iter = mylist.begin(), end = mylist.end(); iter != end; ++iter)
{
   ...
}

would be much better but can already be seen as a little less readable.

Using BOOST_FOREACH can save you much of this still tedious code but has one nasty pitfall when it comes to std::maps.

In some places of the code base std::for_each is used together with a function, or function object.  The downside of this is that the function/function object code is not located where the loop occurs. However, this can be made “readable enough” when the function, or function object does only one thing and has a telling name.

Looping is sometimes done to create other collections of objects for each element. What to do there? Define the new collection use a for-loop of BOOST_FOREACH like above, or use std::transform with the same downside as std::for_each?

The other day one team member suggested to use boost::lambda expressions in loops. The initial usage examples where very promising but let me tell you – readability can drop dramatically very fast if you don’t be careful. It is very easy to get carried away with boost’s lambdas. I happened that we found ourselves having spent the last hour to carve out a super crisp lambda expression that takes anybody else another hour to read.

So the initial question remains undecided and will most likely stay like that. As for everything else in programming, there doesn’t seem to be a silver bullet for this task.

How do you go about looping in C++? Do you have some kind of coding style in place? Do you use std::for_each, BOOST_FOREACH, or some other means?

Looking forward to some feedback.

SSL with POCO

A short introduction to using SSL support in POCO C++ libraries.

Admittedly, the topic of this post is very specific but I hope it will still be of some value for some people.The task for today is to setup SSL server and client with POCO framework classes. I will leave out the whole certificate managing issues and just assume that the right files are at hand.

The SSL related part of  the POCO libraries essentially wraps the OpenSSL library into a nice object-oriented interface. When you know OpenSSL, you can instantly relate to classes like Poco::Net::Context, or the …Handler classes (if you replace “handler” with “callback”).

“SSL” stands for Secure Socket Layer, so the first thing to discover is class Poco::Net::SecureServerSocket. As you would expect, this class is derived from Poco::Net::ServerSocket, extending it only with SSL related stuff. And sure enough, some constructors of Poco::Net::ServerSocket take a Poco::Net::ContextPtr as argument.

But why only some constructors? Since there is no setContext method, there must be some other mechanism in place by which SecureServerSockets get their SSL context.

Introducing Poco::Net::SSLManager. From the API docs:

SSLManager is a singleton for holding the default server/client Context and handling callbacks for certificate verification errors and private key passphrases.

Proper initialization of SSLManager is critical.

Aha! So all the constructors of SecureServerSocket that do not take Context pointers simply get it from the SSLManager singleton.

But how to initialize SSLManager?

1. The POCO Way:

If you developed your application with POCO from the ground up there probably exists a sub-class of Poco::Application, and all the configuration is handled by the built-in configuration classes.

With this in place, all you have to do is to add the proper ssl configuration elements:

openSSL.server.privateKeyFile = /path/to/key/file
openSSL.server.certificateFile = /path/to/certificate/file
openSSL.server.verificationMode = none
openSSL.server.verificationDepth = 9
openSSL.server.loadDefaultCAFile = false
openSSL.server.cypherList = ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
openSSL.server.privateKeyPassphraseHandler.name = KeyFileHandler
openSSL.server.privateKeyPassphraseHandler.options.password = securePassword
openSSL.server.invalidCertificateHandler = AcceptCertificateHandler

2. Manually:

Depending on which side you are – client or server – you have to call SSLManager::initializeClient or  SSLManager::initializeServer. Both methods take three arguments:

  1. PrivateKeyPassphraseHandler pointer
  2. InvalidCertificateHandler pointer
  3. Context pointer

This is where it becomes a little bit tricky: If you try to instantiate a Context with a privateKey file in order to provide it as argument to the initialize… method, a PrivateKeyPassphraseHandler might be needed. This handler is fetched from the SSLManager singleton – which you are just about to initialize!.

This circular dependency between Context and SSLManager can be overcome e.g. if you call SSLManager::initializeServer first only with a PrivateKeyPassphraseHandler, a InvalidCertificateHandler and null Context pointer. Then instantiate the Context and call SSLManager::initializeServer again.

Now that SSL Manager is initialized we can use Secure… prefixed classes as we would used their non-SSL counterparts. As with SecureServerSocket, other Secure… classes are derieved from corresponding non-secure base classes.

Conclusion: Once you got around the initialization of SSLManager singelton, using SSL POCO classes is very easy and straight forward. Check it out!

Active Object with POCO’s Active Methods

POCO’s ActiveMethods require minimal additional code to implement the Active Object design pattern

Active Object is a well known design pattern for synchronizing access to an object and/or resource. The basic idea is to separate method invocation from method execution which is done in a dedicated thread.

Instead of using the objects interface directly, a client of an Active Object uses some kind of  proxy which enqueues a so-called Method Request for later execution. The proxy finishes immediately and returns to the client some sort of callback, or variable, by which the client can receive the result. These intermeditate result variables are also known as Futures.

As always, there are lots of ways to implement this pattern. For example, if you had an interface like this

class MyObject
{
  public:
    int doStuff(const std::string& param) =0;
    std::string doSomeOtherThing(int i) =0;
};

applying a straight forward implementation, you would first transform this into a proxy and method request classes:

class MyObjectProxy
{
  public:
    MyObjectProxy(MyObject* theObject);
    // proxy methods
    Future<int> doStuff(const std::string& param);
    Future<std::string> doSomeOtherThing(int i);
  private:
    MyObject * _myObject;
};

class MethodRequest_DoStuff :
  public AbstractMethodRequest
{
  public:
    MethodRequest_DoStuff(const std::string& param);
    // all method request classes must implement execute()
    virtual void execute(MyObject* theObject);

  private:
    const std::string _param;
};

… and so on (for more details see this basic paper by Douglas C. Schmidt, or read chapter Concurrency Patterns in POSA2).

It’s easy to see that this implementation produces a lot of boilerplate code. To solve this, you could either cook up some code generation, or look for some language support to reduce the amount of characters you have to type. In C++, some sort of template solution can be the way to go, or…

Introducing Active Methods

With class ActiveMethod together with support classes ActiveDispatcher and ActiveResult the POCO C++ libraries provide very simple and elegant building blocks for implementing  the Active Object pattern.

ActiveMethod:  this is the core piece. When called, an ActiveMethod executes in its own thread.

ActiveResult: this is what I referred to earlier as a Future. Instances of ActiveResult are used to pass the result of an ActiveMethod call back to the client.

ActiveDispatcher: if you only use ActiveMethods, every ActiveMethod thread can execute in parallel.  With ActiveDispatcher as base class, ActiveMethod calls are serialized, thus implementing real™ Active Object behaviour.

Here my earlier example using ActiveMethods:

class MyObject
{
  public:
    // ActiveMethods are initialized in the ctors
    // initializer list
    MyObject()
      : doStuff(this, &MyObject::doStuffImpl),
        doSomeOtherThing(this, &MyObject::doSomeOtherThingImpl)
    {}

    ActiveMethod<int, std::string, MyObject> doStuff;
    ActiveMethod<std::string, int, MyObject> doSomeOtherThing;
  private:
    int doStuffImpl(const std::string& param);
    std::string doSomeOtherThingImpl(int i);
};

This is used as follows:

MyObject myObject;
ActiveResult<std::string> result = myObject.doSomeOtherThing(42);
...
result.wait();
std::cout << result.data() << std::endl;

This solution requires minimal amounts of additional code to transform your lame and boring normal object into a full-fledged Active Object. The only downside is that Active Methods currently can only have one parameter. If you need more, you have to use tuples or parameter objects.

Have fun!

Shrink your dependency list with POCO

POCO is a nice set of C++ libraries which provides elegant solutions for day-to-day tasks.

When you write C++ applications of any sort you are very likely to need support libraries in addition to what comes with C++ (which is not much, btw). Of course, this holds true for any other language. But with Java and its rich JDK for example this need is not so imminent.

Starting at the very beginning, let’s see how fast the need for support arises.

int main(int argc, char** argv)
{
// parsing command line arguments
...

How to parse those command line arguments in a simple and easy way? How about a little help output when the program is called with -h or –help? Ok, we got boost::program_options for this.

Going further in your program you may want to have some sort of logging capability. Unfortunately, as of boost version 1.45 there is nothing to be found there. So you add a nice logging library.

And so on.

But wait! You don’t want to depend on too many 3rd party libraries because, among other things, they add deployment complexity.

Not even Qt, as one of the major players in the C++ framework world, provides solutions to both previous examples. As of version 4.7, no logging and not much support with command line arguments. And you end-up having to use QString, one of many non-std::string classes in C++ frameworks, which can get annoying at times (of course there are reasons why those exist).

I could go on with the list of smaller or larger concerns for which you either roll your own implementation or include yet another library in your project.

Instead I would like to point you to POCO, a nice set of C++ libraries which provide easy solutions for many basic and/or advanced day-to-day tasks. From their website:

Modern, powerful open source C++ class libraries and frameworks for building network- and internet-based applications that run on desktop, server and embedded systems

Besides very basic stuff like logging, date/time handling, threads, memory management, UTF-8, etc. they also provide lots of higher level classes for things like SMTP, POP3, SQL database access and HTTP. They even have a so called C++ Server Page Compiler which is basically something like JSP or Active Server Pages.

And they have no own string class! Yay! Instead they provide lots of functions classes and streams to do string manuipulation on good old std::string.

One thing I like most about POCO, though, is its clean, well-documented and apparently very high quality code. Although it is not overly functional or template-heavy, like you see it in in boost very often, it still provides elegant solutions.

Check it out and shrink your dependency list.

Bug hunting fun with std::sort

Small errors in custom comparison functions used with std::sort can lead to hard-to-find bugs.

The other day I came across a nice little C++-shoot-yourself-in-the-foot at one of our customers. Let’s see how fast you can spot the problem. The following code crashes with segmentation fault sometime, somewhere in the sort call (line 31).

#include <iostream>
#include <vector>
#include <boost/shared_ptr.hpp>
#include <boost/bind.hpp>

using namespace std;
using namespace boost;

enum SORT_ORDER
{
  SORT_ORDER_ASCENDING,
  SORT_ORDER_DESCENDING
};

bool compareValues(const std::string& valueLeft,
                   const std::string& valueRight,
                   SORT_ORDER order)
{
  const bool compareResult = (valueLeft < valueRight);
  if (order == SORT_ORDER_DESCENDING) {
    return !compareResult;
  }
  return compareResult;
}

int main(int argc, char *argv[])
{
  std::vector<std::string> strValues(300);
  std::fill(strValues.begin(), strValues.end(),
            "Hallo");
  std::sort(strValues.begin(), strValues.end(),
            bind(compareValues, _1, _2, SORT_ORDER_DESCENDING));
  return EXIT_SUCCESS;
}

Any ideas? The tricky thing about this bug is that the stacktrace output in the debugger gives absolutely no hint at all about its cause. And this is a simplified version of the real code which has to sort boost::shared_ptrs instead of strings. Believe me, you don’t want to see that stacktrace. Because of the use of boost::bind together with boost::shared_ptrs it looks, well, let’s say intimidating.

Still no idea?

I’ll give you a hint. If the SORT_ORDER is set to SORT_ORDER_ASCENDING everything is fine. …

Ok, the problem is that std::sort algorithm must be given a comparison function (object) that defines a strict weak ordering on the elements that are to be sorted. In other words the comparison function object must implement the ‘<‘ (less than) relationship on the elements.

Unfortunately, lines 20 to 22 break this ordering when SORT_ORDER_DESCENDING is given. The initial idea of this code was that, well, if compareResult gets returned on ascending sort order, lets just return the negation of it when the “negation” of acscending order is requested. This, of course, destroys the strict weak ordering requirement because whenever valueLeft == valueRight, the function returns true, meaning instead that valueLeft < valueRight. And this somehow wreaks havoc inside std::sort.

A better version of the function could be:

...
bool compareValues(const std::string& valueLeft,
                   const std::string& valueRight,
                   SORT_ORDER order)
{
  // solution: return false independent of sort order
  // whenever valueLeft == valueRight
  if (valueLeft == valueRight) {
    return false;
  }
  const bool compareResult = (valueLeft < valueRight);
  if (order == SORT_ORDER_DESCENDING) {
    return !compareResult;
  }
  return compareResult;
}
...

The really annoying thing about this whole issue is that std::sort just randomly crashes with a stack trace that shows nothing but some weird memory corruption going on. After the initial shock, this sends you down the complete wrong bug hunting road where you start looking for spots where memory could be overwritten or the like.

So beware of custom comparison functions or function objects. They might look innocent and easy, but they can give you lot’s of headaches.

Responsive Qt GUIs – Threading with Qt

Qt4 used to have only primitive threading support. Starting with version 4.4 new classes and functions makes your threading life a lot easier. So in case you haven’t come around to look at those features, do it now!, it’s worth it.

If you have used Qt4 for some time now, specifically since pre 4.4 versions, you may or may not aware of the latest developments in the threading part of the library. This post shall be a reminder in case you didn’t follow the versions in detail or just didn’t get around to look closer and/or update.

In pre 4.4 versions, the only way to do threading was to use class QThread. Subclass QThread, implement the run method, and there you had your thread. This looks fine at first, but, taking the title of the post as example, it can get annoying very fast. Sometimes you have just few lines of code you want to keep away from the GUI thread because, e.g. they could potentially block on some communication socket. Subclassing QThread for every small little work package is not something you want to do, so I guess many users just wrote their own thread pool or the like.

Starting with version4.4. Qt gained two major threading features, for which, IMHO, the Qt people do not a very good job of advertising. The first is QThreadPool together with QRunnable. All Java programmers, which use java.lang.Runnable since the beginning, may have their laugh now, I’ll wait…

The second new threading feature is the QtConcurrent namespace (from the Qt documentation):

The QtConcurrent namespace provides high-level APIs that make it possible to write multi-threaded programs without using low-level threading primitives such as mutexes, read-write locks, wait conditions, or semaphores

Sounds great! What else?

QtConcurrent includes functional programming style APIs for parallel list prosessing, including a MapReduce and FilterReduce implementation for shared-memory (non-distributed) systems, and classes for managing asynchronous computations in GUI applications.

This is really great stuff. Functions like QtConcurrent::run together with QFuture<T> and QFutureWatcher<T> can simplify your threading code significantly. So, if you haven’t got around to look at those new classes by now, I can only advise you to do it immediately. Allocate a refactoring slot in your next Sprint to replace all those old QThread sub-classes by shiny new QRunnables or QtConcurrent functions. It’s worth it!

Let’s get back to the responsive GUIs example. In his Qt Quarterly article, Witold Wysota describes in detail every technical possibility to keep your GUI responsive. It’s a very good article which provides a lot of insights. He starts with manual event processing and mentions the QtConcurrent features only at the very end of the article. I suggest the following order of threading-solutions-to-consider:

  1. QtConcurrent functions
  2. QThreadPool + QRunnable
  3. rest

Stay responsive!