SSL with POCO

A short introduction to using SSL support in POCO C++ libraries.

Admittedly, the topic of this post is very specific but I hope it will still be of some value for some people.The task for today is to setup SSL server and client with POCO framework classes. I will leave out the whole certificate managing issues and just assume that the right files are at hand.

The SSL related part of  the POCO libraries essentially wraps the OpenSSL library into a nice object-oriented interface. When you know OpenSSL, you can instantly relate to classes like Poco::Net::Context, or the …Handler classes (if you replace “handler” with “callback”).

“SSL” stands for Secure Socket Layer, so the first thing to discover is class Poco::Net::SecureServerSocket. As you would expect, this class is derived from Poco::Net::ServerSocket, extending it only with SSL related stuff. And sure enough, some constructors of Poco::Net::ServerSocket take a Poco::Net::ContextPtr as argument.

But why only some constructors? Since there is no setContext method, there must be some other mechanism in place by which SecureServerSockets get their SSL context.

Introducing Poco::Net::SSLManager. From the API docs:

SSLManager is a singleton for holding the default server/client Context and handling callbacks for certificate verification errors and private key passphrases.

Proper initialization of SSLManager is critical.

Aha! So all the constructors of SecureServerSocket that do not take Context pointers simply get it from the SSLManager singleton.

But how to initialize SSLManager?

1. The POCO Way:

If you developed your application with POCO from the ground up there probably exists a sub-class of Poco::Application, and all the configuration is handled by the built-in configuration classes.

With this in place, all you have to do is to add the proper ssl configuration elements:

openSSL.server.privateKeyFile = /path/to/key/file
openSSL.server.certificateFile = /path/to/certificate/file
openSSL.server.verificationMode = none
openSSL.server.verificationDepth = 9
openSSL.server.loadDefaultCAFile = false
openSSL.server.cypherList = ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
openSSL.server.privateKeyPassphraseHandler.name = KeyFileHandler
openSSL.server.privateKeyPassphraseHandler.options.password = securePassword
openSSL.server.invalidCertificateHandler = AcceptCertificateHandler

2. Manually:

Depending on which side you are – client or server – you have to call SSLManager::initializeClient or  SSLManager::initializeServer. Both methods take three arguments:

  1. PrivateKeyPassphraseHandler pointer
  2. InvalidCertificateHandler pointer
  3. Context pointer

This is where it becomes a little bit tricky: If you try to instantiate a Context with a privateKey file in order to provide it as argument to the initialize… method, a PrivateKeyPassphraseHandler might be needed. This handler is fetched from the SSLManager singleton – which you are just about to initialize!.

This circular dependency between Context and SSLManager can be overcome e.g. if you call SSLManager::initializeServer first only with a PrivateKeyPassphraseHandler, a InvalidCertificateHandler and null Context pointer. Then instantiate the Context and call SSLManager::initializeServer again.

Now that SSL Manager is initialized we can use Secure… prefixed classes as we would used their non-SSL counterparts. As with SecureServerSocket, other Secure… classes are derieved from corresponding non-secure base classes.

Conclusion: Once you got around the initialization of SSLManager singelton, using SSL POCO classes is very easy and straight forward. Check it out!

Information Hiding in Source Code Comments

Use comments for additional details, not as an issue tracker.

Once in a while you come across a part of code needs to be fixed. You put in a comment saying something like ‘FIXME’ or ‘BROKEN’ or just another ‘TODO’. Sure you have no time, you need to ship. You don’t even know how much work it is to fix it or how many tests break after your modifications. A comment is safe and clear.
After a while the code around it changes. The software is shipped and used. The comment is long forgotten.
One day a customer calls and reports about an incident which sounds like a bug. After minutes or hours debugging and finally pinning the bug you rediscover a long lost information:

// FIXME: shouldn't we do this here?

So next time you find a code part which looks strange or seems to have a bug, create an issue and describe why you think it does not do what it should. And even better: write a test and fix it.

A tale of scrap metal code – Part II

The second part of a series about the analysis of a software product. This part investigates the source code and reveals some ugly practices therein.

In the first part of this tale about an examined software project, I described the initial situation and high-level observations about the project. This part will dive into the actual source code and hopefully reveal some insights. The third and last part will summarize everything and give some hints on how to avoid creating scrap metal code.

About the project

If you want to know more about the project, read the first part of this tale. In short, the project looked like a normal Java software, but unfolded into a nightmare, lacking basic requirements like tests, dependency management or continuity.

About the developer

The developer has a job title as a “senior developer”. He developed the whole project alone and wrote every line of code. From the code, you can tell his initial uncertainty, his learning progress, some adventurous experiments and throughout every file, a general uneasiness with the whole situation. The developer actively abandoned the project after three years of steady development. From what I’ve seen, I wouldn’t call him a “senior” developer at all.

About the code

The code didn’t look very repellent at first sight. But everywhere you looked, there was something to add on the “TODO list”. Let me show you our most prominent findings:

Unassigned constructors

The whole code was littered with constructor calls that don’t store the returned new object. What’s the point in constructing another instance of you throw it away in the next moment, without ever using it? After examining these constructors, it became apparent that they only exist to perform side effects each. The new object is registered with the global data model while it’s still under construction. It was the most dreadful application of the Monostate design pattern I’ve ever seen.

Global data model

Did I just mention the global data model? At the end of the investigation, we found that the whole application state lives in numerous public static arrays, collections or maps. These data fields are accessible from everywhere in the application and altered without any protection against concurrent modifications. These global variables were placed anywhere, without necessarily being semantically associated with their enclosing class. A data model in the sense of some objects being tied together to form an instance net with higher-level structures could not be found. Instead, different lookup structures like index-based arrays and key-based maps are associated by shared keys or obscure indices. The whole arrangement of the different data pieces is implicit, you have to parse the code for every usage. Mind you, these fields are globally accessible.

Manual loop unrolling

Some methods had several hundred lines of the exact same method call over and over again. This is what your compiler does when it unrolls your foreach loops. In this code, the compiler didn’t need to optimize. To add some myth, the n-th call usually had a slight deviation from the pattern without any explanation. Whenever something could easily be repeated, the developer pasted it all over the place. Just by winding up the direct repititions again, the code migth shrink by one quarter in length.

Least possible granularity

Just by skimming over the code, you’d discover plenty of opportunities to extract methods, raising the level of abstraction in the code. The developer chose to stick with the least possible granularity, making each non-trivial code a pain to read. The GUI-related classes, using Swing, were so bloated by trivialities that even a simple dialog with two text fields and one button was represented by a massive amount of code. Sadly, the code was clearly written by hand because of all the mistakes and pattern deviations. If the code had to deal with complex data types like dates, the developer always converted them to primitive data types like int, double or long and performed the necessary logic using basic math operators.

“This code is single-threaded, right?”

Despite being a Java Swing application, the code lacked any strategy to deal with multithreading other than ignoring the fact that at least two threads would access the code. We didn’t follow this investigation path down to its probably bitter end, but we wouldn’t be surprised if the GUI would freeze up occasionally.

“Exceptions don’t happen here”

If you would run a poll on the most popular exception handling strategy for this code, it would be the classic “local catch’n’ignore”. The developer dismissed the fact that exceptions might happen and just carried on. If he was forced to catch an exception, the catch block followed immediately and was empty in most cases. Of course, the only caught exception type was the Exception class itself.

“This might be null

One recurring pattern of the developer was a constructor call, stored in a local variable and immediate null check. Look at this code sample:

try {
    SomeObject object = new SomeObject();
    if (object != null) {
        object.callMethod();
    }
    [...]
} catch (Exception e) {
}

There is no possibility (that I know of) of object being null directly after the constructor call. If an exception is thrown in the constructor, the next lines won’t be executed. This code pattern was so prevalent in the code that it couldn’t be an accidental leftover of previous code. The accompanying effect were random null checks for used variables and return values.

Destabilized dependencies

If there is one thing that’s capable of derailing every code reader, analysis tool and justified guess, it’s wildcard use of Java’s reflection capabilities. The code for this project incorporates several dozens calls to Class.forName(), basically opening up the application for any code you want to dynamically include. The class names result from obscure string manipulation magic or straight from configuration. It’s like the evil brother of dependency injection.

Himalaya indentation

Looking at the indentation depths of the code, this wasn’t the worst I’ve ever seen. But that doesn’t mean it was pleasant. Like in Uncle Bob’s infamous “A crap odyssey”, you could navigate some classes by whitespace landscape. “Scroll down to the fifth crest, the vast valley afterwards contains the detail you want to know”.

Magic numbers

The code was impregnated with obscure numbers (like 9, 17, 23) and even more bizzare textual constants (like “V_TI_LB_GUE_AB”) that just appeared out of nowhere several times. This got so bad that the original author included lengthy comment sections on top of the biggest methods to list the most prominent numbers alongside their meanings. Converting the numbers to named constants would probably dispel the unicorns, as we all know that unicorns solely live on magic numbers(*). Any other explanation escapes my mind.

(*) On a side note, I call overly complex methods with magic numbers “unicorn traps”, as the unicorns will be attracted by the numbers and then inevitably tangled up in the complexity as they try to make their way out of the mess.

Summary

This was the list of the most dreaded findings in the source code. Given enough time, you can fix all of them. But it will be a long and painful process for the developer and an expensive investment for the stakeholder.

To give you an overall impression of the code quality, here is a picture of the project’s CrapMap. The red rectangles represent code areas (methods) that need improvements (the bigger and brighter, the more work it will take). The green areas are the “okay” areas of the project. Do you see the dark red cluster just right the middle? These are nearly a hundred complex methods with subtle differences waiting to be refactored.

Prospectus

In part three, I’ll try to extract some hints from this project on how to avoid similar code bases. Stay tuned.

A tale of scrap metal code – Part I

The first part of a series about the analysis of a software product. This part investigates some aspects of general importance and works out how they are failed.

This is the beginning of a long tale about an examined software project. It is too long to tell in one blog post, so I cut it in three parts. The first part will describe the initial situation and high-level observations. The second part will dive deep into the actual source code and reveal some insights from there. The third part wraps everything together and gives some hints on how to avoid being examined with such a negative result.

First contact

We made contact with a software product, lets call it “the application”, that was open for adoption. The original author wanted to get rid of it, yet it was a profitable asset. Some circumstances in this tale are altered to conceal and protect the affected parties, but everything else is real, especially on the technical level.

You can imagine the application as being the coded equivalent to a decommissioned aircraft carrier (coincidentally, the british Royal Navy tries to sell their HMS Invincible right now). It’s still impressive and has its price, but it will take effort and time to turn it around. This tale tells you about our journey to estimate the value that is buried in the coded equivalent of old rusted steel, hence the name “scrap metal code” (and this entry’s picture).

Basic fact

Some basic facts about the application: The software product is used by many customers that need it on a daily basis. It is developed in plain Java for at least three years by a single developer. The whole project is partitioned in 6 subprojects with references to each other. There are about 650 classes with a total of 4.5k methods, consisting of 85k lines of code. There are only a dozen third-party dependencies to mostly internal libraries. Each project has an ant build script to create a deployable artifact without IDE interference. On this level, the project seems rather nice and innocent. You’ll soon discover that this isn’t the truth.

Deeper look

Read the last paragraph again and look out for anything that might alert you about the fives major failures that I’m about to describe. In fact, the whole paragraph contains nothing else but a warning. We will look at five aspects of the project in detail: continuity, modularization, size, dependencies and build process. And we won’t discover much to keep us happy. The last paragraph is the upmost happiness you can get from that project.

Feature continuity

You’ve already guessed it: Not a single test. No unit test, zero integration tests and no acceptance test other than manually clicking through the application guided by the user manual (which we only hoped would exist somewhere). No persisted developer documentation other than generated APIDoc, in which the only human-written entries were abbreviated domain specific technical terms. We could also only hope that there is a bug tracker in use or else the whole project history would be documented in a few scrambled commit messages from the SCM (one thing done right!).
The whole project was an equally distributed change risk. The next part will describe some of the inherent design flaws that prohibited changes from having only local effects. Every feature could possibly interact with every other piece of code and would probably do so if you keep trying long enough.
It’s no use ranting about something that isn’t there. Safety measures to ensure the continuity of development on the application just weren’t there. FAIL!

Project modularization

The six modules are mostly independent, but have references to types in other modules (mostly through normal java imports). This would not cause any trouble, if the structure of the references was hierarchical, with one module on top and other modules only referencing moduls “higher” in the hierarchy. Sadly, this isn’t the case, as there is a direct circular dependency between two modules. You can almost see the clear hierarchical approach that got busted on a single incident, ruining the overall architecture. You cannot use Eclipse’s “project dependencies” anymore, but have to manually import “external class folders” for all projects now. The developer has forsaken the clean and well supported approach for a supposable short-term achievement, when he needed class A of module X in the context of module Y and didn’t mind the extra effort to think about a refactoring of the type and package structure. What could have been some clicks in your IDE (or an automatic configuration) will now take some time to figure out where to import which external folder and what to rebuild first because of the cycle. FAIL!

Code size

The project isn’t giantic. Let’s do some math to triangulate our expectations a bit. One developer worked for three years to pour out nearly 90k lines of code (with build scripts and the other stuff included). That’s about 30k lines per year, which is an impressive output. He managed to stuff these lines in 650 classes, so the average class has a line count of 130 lines of code. Doesn’t fit on a screen, but nothing scary yet. If you distribute the code evenly over the methods, it’s 19 lines of code per method (and 7 methods per class). Well, there I get nervous: twenty lines of code in every method of the system is a whole lot of complexity. If a third of them are getter and setter methods, the line count rises to an average of 26 lines per method. I don’t want every constructor i have to use to contain thirty lines of code!

To be sure what code complexity we are talking about, we ran some analysis tools like JDepend or Crap4j. The data from Crap4j is very explicit, as it categorizes each method into “crappy” or “not crappy”, based on complexity and test coverage (not given here). We had over 14 percent crappy methods, in absolute numbers roughly 650 crappy methods. That is one crappy method per class. The default percentage gamut of Crap4j ends at 15 percent, the bar turns red (bad!) over 5 percent. So this code is right at the edge of insanity in terms of accumulated complexity. If you want to know more about this, look forward to the next parts of this series.

Using the CrapMap, we could visualize the numerical data to get an overview if the complexity is restricted to certain parts of the application. You can review the result as a picture here. Every cell represents a method, the green ones are okay while the red ones are not. The cell size represent the actual complexity of the method. As you can see, the “overly complex code syndrome” is typically for virtually all the code. Whenever a method isn’t a getter or setter (the really tiny dark green square cells), it’s mostly too complex. Additional numbers we get from the Crap4j metric are “Crap” and “Crap Load”, stating the amount of “work” necessary to tame a code base. Both values are very high given the class and method count.

All the numbers indicate that the code base is bloated, therefore constantly using the wrong abstraction level. Applying non-local changes to this code will require a lot of effort and discipline from an experienced developer. FAIL!

Third-party dependencies

The project doesn’t use any advanced mechanism of dependency resolution (like maven or ant ivy). All libraries are provided alongside the source code. This isn’t the worst option, given the lack of documentation.
A quick search for “*.jar” retrieves only a dozen files in all six modules. That’s surprisingly less for a project of this size. Further investigation shows some inconvenient facts:

  • Some of these libraries are published under commercial licenses. This cannot always be avoided, but it’s an issue if the project should be adopted.
  • Most libraries provide no version information. At least a manifest entry or an appended version number in the filename would help a lot.
  • Some libraries are included multiple times. They are present for every module on their own, just waiting to get out of sync. With one library, this has already happened. It’s now up to the actual classpath entry order on the user’s machine how this software will behave. The (admittedly non-present) unit tests would not safeguard against the real dependency, but the local version of the library, which could be newer or older.

As there is no documentation about the dependencies, we can only guess about their scope: Maybe the classes are required at compile time but optional at runtime? The best bet is to start with the full set and accept another todo entry on the technical debt list. FAIL!

Build scripts

But wait, for every module, there is a build script. A quick glance shows that there are in fact four build scripts for every module. All of them are very similar with minor differences like which configuration file gets included and what directory to use for a specific fileset. Nothing some build script configuration files couldn’t have handled. Now we have two dozen build scripts that all look suspiciously copy&pasted. Running one reveals the next problem: All these files contain absolute paths, as if the “works on my machine award” was still looking for a winner. When we adjusted the entries, the build went successful. The build script we had to change was a messy collection of copied code snippets (if you want to call ant’s XML dialect “code”). You could tell by the different formatting, naming and solution finding styles. But besides being horribly mangled, the build included code obfuscation and other advanced topics. Applied to the project, it guaranteed that no stacktrace from any user would ever contain useful information for anybody, including the project’s developers. FAIL!

Summary

Lets face the facts: The project behind the application fails on every aspect except delivering value to the current customers. While the latter is the most important ingredient of a successful project, it cannot be the only one and is only sustainable for a short period of time. The project suffers from the lonely superhero syndrome: one programmer knows everything (and can defend every design decision, even the ridiculous ones) and has no incentive to persist this knowledge. And the project will soon suffer from the truck factor: The superhero programmer will not be available anymore soon.

Prospectus

There are a lot of take-away lessons from this project, but I have to delay them until part three. In the next part, we’ll discover the inner mechanics and flaws of the code base.

Groovy isn’t a superset of Java

Groovy is Java with sugar, right?

Coming from Java to Groovy and seeing that Groovy looks like Java with sugar, you are tempted to write code like this:

  private String take(List list) {
    return 'a list'
  }

  private String take(String s) {
    return 'a string'
  }

But when you call this method take with null you get strange results:

  public void testDispatch() {
    String s = null
    assertEquals('a string', take(s)) // fails!
  }

It fails because Groovy does not use the declared types. Since it is a dynamic typed language it uses the runtime type which is NullObject and calls the first found method!
So when using your old Java style to write code in Groovy beware that you are writing in a dynamic environment!
Lesson learned: learn the language, don’t assume it behaves in the same way like a language you know even when the syntax looks (almost) the same.

Avoid switch! Use enum!

Recently I was about to refactor some code Crap4j pointed me to. When I realized most of that code was some kind of switch-case or if-else-cascade, I remembered Daniel´s post and decided to obey those four rules.
This post is supposed to give some inspiration on how to get rid of code like:

switch (value) {
  case SOME_CONSTANT:
    //do something
    break;
  case SOME_OTHER_CONSTANT:
    //do something else
    break;
  ...
  default:
    //do something totally different
    break;
}

or an equivalent if-else-cascade.

In a first step, let’s assume the constants used above are some kind of enum you created. For example:

public enum Status {
  ACTIVE,
  INACTIVE,
  UNKNOWN;
}

the switch-case would then most probably look like:

switch (getState()) {
  case INACTIVE:
    //do something
    break;
  case ACTIVE:
    //do something else
    break;
  case UNKNOWN:
    //do something totally different
    break;
}

In this case you don’t need the switch-case at all. Instead, you can tell the enum to do all the work:

public enum Status {
  INACTIVE {
    public void doSomething() {
      //do something
    }
  },
  ACTIVE {
    public void doSomething() {
      //do something else
    }
  },
  UNKNOWN {
    public void doSomething() {
      //do something totally different
    }
  };

  public abstract void doSomething();
}

The switch-case then shrinks to:

getState().doSomething();

But what if the constants are defined by some third-party code? Let’s adapt the example above to match this scenario:

public static final int INACTIVE = 0;
public static final int ACTIVE = 1;
public static final int UNKNOWN = 2;

Which would result in a switch-case very similar to the one above and again, you don’t need it. All you need to do is:

Status.values()[getState()].doSomething();

Regarding this case there is a small stumbling block, which you have to pay attention to. Enum.values() returns an Array containing the elements in the order they are defined, so make sure that order accords to the one of the constants. Furthermore ensure that you do not run into an ArrayOutOfBoundsException. Hint: Time to add a test.

There is yet another case that may occur. Let’s pretend you encounter some constants that aren’t as nicely ordered as the ones above:

public static final int INACTIVE = 4;
public static final int ACTIVE = 7;
public static final int UNKNOWN = 12;

To cover this case, you need to alter the enum to look something like:

public enum Status {
  INACTIVE(4),
  ACTIVE(7),
  UNKNOWN(12);

  private int state;

  public static Status getStatusFor(int desired) {
    for (Status status : values()) {
      if (desired == status.state) {
        return status;
      }
    }
    //perform error handling here
    //e.g. throw an exception or return UNKNOWN
  }
}

Even though this introduces an if (uh oh, didn’t obey rule #4), it still looks much more appealing to me than a switch-case or if-else-cascade would. Hint: Time to add another test.

How do you feel about this technique? Got good or bad experiences using it?

Code Camp Experiences II

A review of our first company code camp using Code Retreats like Corey Haines would do. Short summary: It was a lot of fun and we learned a lot. Go try it out yourself!

Last friday, we held a Code Camp instead of an Open Source Love Day (OSLD). We reserved a whole day for the company to pratice together and share our abilities on the coding level. While this usually already happens every now and then with pair programming sessions, this time we all worked on the same assignment and could compare our experiences. And this comparability worked great for us. This article tries to summarize our setup and the outcome of the Code Camp

Setup of the Code Camp

We tried to imitate a typical Code Retreat day in the manner of Corey Haines. If you haven’t heard about Code Retreats, Corey or the software craftsmanship idea, you could read about it in the links. The presentation of Corey at the QCon conference about software craftsmanship is also a valuable watch.

There are some resources on the internet about how to run a Code Retreat event from the organizational and facilitator’s point of view. This material gave us a good understanding of the whole event, even though our setup was different, as we had no explicit facilitator and fixed workplaces, already prepared for pair programming usage. We didn’t invite external programmers to the event, so every participant was part of our development team. We had to end the event by 16 o’clock due to schedule conflicts and started at 9 o’clock, so our retreat count would be lower than 6 or even 7.

Basically, we tried to program Conway’s Game of Life within 45 minutes in pairs of two developers repeatedly. After the 45 minutes have passed (supervised by an alarm clock), we deleted the code and gathered for an iteration review of 15 minutes. Then, we started over again. This agenda should repeat throughout the day. No other activity or goal was planned, but we anticipated a longer retrospective meeting at the end of the day.

Execution of the Code Camp

The team gathered at 9 o’clock and performed setup tasks on the computers (like preparing a clean workspace). At 09:15, we held an introduction meeting for the Code Camp. I explained the basics and motives of Code Retreats and presented the rules for Conway’s Game of Life. The team heard most of the information for the first time, so nobody was particularly more experienced with the task or the conduct.

The first iteration started at 10 o’clock and had everybody baffled by the end of the iteration. The first retrospective meeting was interesting, as fundamental approaches to the problem were discussed with very little words needed for effective communication. Everybody wanted to move into the second iteration, which started at 11 o’clock.

At the end of the second iteration, two of the four teams nearly reached their anticipated goals. In the retrospective, the results were incredibly more advanced compared to the first iteration. This effect was similar to my first code camp: The second iteration is the breakthrough in the problem domain. Afterwards, the solutions are refined, but without the massive boost in efficiency compared to the other iterations except the first one.

We went to lunch early this day and returned with great ideas for the next round. After a short coffee break with video games, we started at around 13:45 for the third iteration.

The third iteration resulted in the first playable versions of the game. The solutions grew more beautiful and the teams began to experiment with their approaches, as the content-related task was mentally covered. This was the most productive iteration in terms of resulting software. But as usual, the code was deleted without a trace directly after the iteration. The iteration review meeting brought up a radically different approach on the problem as previously anticipated. This inspired every team for the fourth iteration.

In the fourth iteration, every team tried to implement the new approach. And every team failed to gain substantial ground, just like in the first iteration. The iteration review meeting was interesting, but we skipped another iteration in favor of the full retrospective of the Code Retreat.

Effects of the Code Camp

The Code Retreat iterations had great impact on our team. We discussed our impressions informally and then turned back to the formal retrospective questions as suggested by Alex Bolboaca:

  • How did you feel?
  • What have you learned?
  • What will you apply starting Monday?

The first question got answered by a “mood graph”, rising steadily from iteration one to three, with a yawning abyss at iteration four. This was another strong indicator that the iterations sort of restarted with iteration four.

The second question (“What have you learned?”) was answered more variably, but it stuck out that many keyboard shortcuts and little helpful IDE tricks were learnt throughout the day. We tracked the origin and propagation of two shortcuts and came to the result that one developer knew them beforehands, transferred the knowledge to the partner in the first iteration and both spread it further in the second iteration. By the end of the third iteration, everybody had learned the new shortcut. It was impressive to see this kind of knowledge transfer in such a clear manner.

The third question revolved around the coolest new shortcuts and tricks.

But we learned a lot more than just a few shortcuts. Most of all, we had a comparable coding experience with every other developer on our team. This isn’t about competition, it’s about personality. And we’ve found that the team works great in every combination. Some subtle fears of “being behind with knowledge” got diminished, too.

Future of the Code Camp

Everybody wants to do it again. So we’ll do it again. We decided to perform one Code Camp every three months. This isn’t too often to wear off, but hopefully often enough to keep our practice level high. We also decided to run dedicated Code Camps with external developers soon. The first event will happen in December 2010.

Shrink your dependency list with POCO

POCO is a nice set of C++ libraries which provides elegant solutions for day-to-day tasks.

When you write C++ applications of any sort you are very likely to need support libraries in addition to what comes with C++ (which is not much, btw). Of course, this holds true for any other language. But with Java and its rich JDK for example this need is not so imminent.

Starting at the very beginning, let’s see how fast the need for support arises.

int main(int argc, char** argv)
{
// parsing command line arguments
...

How to parse those command line arguments in a simple and easy way? How about a little help output when the program is called with -h or –help? Ok, we got boost::program_options for this.

Going further in your program you may want to have some sort of logging capability. Unfortunately, as of boost version 1.45 there is nothing to be found there. So you add a nice logging library.

And so on.

But wait! You don’t want to depend on too many 3rd party libraries because, among other things, they add deployment complexity.

Not even Qt, as one of the major players in the C++ framework world, provides solutions to both previous examples. As of version 4.7, no logging and not much support with command line arguments. And you end-up having to use QString, one of many non-std::string classes in C++ frameworks, which can get annoying at times (of course there are reasons why those exist).

I could go on with the list of smaller or larger concerns for which you either roll your own implementation or include yet another library in your project.

Instead I would like to point you to POCO, a nice set of C++ libraries which provide easy solutions for many basic and/or advanced day-to-day tasks. From their website:

Modern, powerful open source C++ class libraries and frameworks for building network- and internet-based applications that run on desktop, server and embedded systems

Besides very basic stuff like logging, date/time handling, threads, memory management, UTF-8, etc. they also provide lots of higher level classes for things like SMTP, POP3, SQL database access and HTTP. They even have a so called C++ Server Page Compiler which is basically something like JSP or Active Server Pages.

And they have no own string class! Yay! Instead they provide lots of functions classes and streams to do string manuipulation on good old std::string.

One thing I like most about POCO, though, is its clean, well-documented and apparently very high quality code. Although it is not overly functional or template-heavy, like you see it in in boost very often, it still provides elegant solutions.

Check it out and shrink your dependency list.

Diving into Hibernate’s Query Cache behaviour

Hibernate is a very sophisticated OR-Mapper and as such has some overhead for certain usage patterns or raw queries. Through proper usage of caches (hibernates featured a L1, L2 cache and a query cache) you can get both performance and convenience if everything fits together. When trying to get more of our persistence layer we performed some tests with the query cache to be able to decide if it is worth using for us. We were puzzled by the behaviour in our test case: Despite everything configured properly we never had any cache hits into our query cache using the following query-sequence:

  1. Transaction start
  2. Execute query
  3. Update a table touched by query
  4. Execute query
  5. Execute query
  6. Transaction end

We would expect that step 5 would be a cache hit but in our case it was not. So we dived into the source of the used hibernate release (the 3.3.1 bundled with grails 1.3.5) and browsed the hibernate issue tracker. We found the issue and correlated it to the issues HHH-3339 and HHH-5210. Since the fix was simpler than upgrading grails to a new hibernate release we fixed the issue and replaced the jar in our environment. So far, so good, but in our test step 5 still refused to produce a cache hit. Using the debugger strangely enough provided us a cache hit when analyzing the state of the cache and everything. After some more brooding and some println()'s and sleep()‘s we found the reason for the observed behaviour in the UpdateTimestampsCache (yes, yet another cache!):

	public synchronized void preinvalidate(Serializable[] spaces) throws CacheException {
		//TODO: to handle concurrent writes correctly, this should return a Lock to the client
		Long ts = new Long( region.nextTimestamp() + region.getTimeout() );
		for ( int i=0; i
			if ( log.isDebugEnabled() ) {
				log.debug( "Pre-invalidating space [" + spaces[i] + "]" );
			}
			//put() has nowait semantics, is this really appropriate?
			//note that it needs to be async replication, never local or sync
			region.put( spaces[i], ts );
		}
		//TODO: return new Lock(ts);
	}

The innocently looking statement region.nextTimestamp() + region.getTimeout() essentially means that the query cache for a certain “region” (e.g. a table in simple cases) is “invalid” (read: disabled) for some “timeout” period or until the end of the transaction. This period defaults to 60 seconds (yes, one minute!) and renders the query cache useless within a transaction. For many use cases this may not be a problem but our write heavy application really suffers because it works on very few different tables and thus query caching has no effect. We are still discussing ways to leverage hibernates caches to improve the performance of our app.

Open Source Love Day October 2010

Our Open Source Love Day for October 2010 brought love for the cmake hudson plugin. Other issues were addressed but not finished. If you like to type fast and accurate, we suggest you check out typeracer.

On Friday two weeks ago, we held our Open Source Love Day for October 2010. This day was special in several ways. We strayed very far from the usual schedule for this day, there were several internal tasks that couldn’t be delayed and we introduced a “fun practice” event. But we eventually produced something valuable this day.

The Open Source Love Day

We introduced a monthly Open Source Love Day (OSLD) to show our appreciation to the Open Source software ecosystem and to donate back. We heavily rely on Open Source software for our projects. We would be honored if you find our contributions useful. Check out our first OSLD blog posting for details on the event itself.

The distractions

  • A regular project needed an urgent cost estimation by the whole team. This was the last opportunity because of an upcoming parental leave to have the team together for a long time.
  • Another regular project needed an urgent problem solved. This turned out to be so obscure that one of our developers had to be on-site. You can read about it in this blog entry now.
  • We received several shipments of office furniture and computer parts. They had to be checked and placed in.
  • We had a fun practice event. We discovered the online “game” typeracer and practiced our raw typing skill against each other for some time. Pro tip for beginners: don’t look at the highscores!

On this OSLD, we accomplished the following tasks:

  • A new version 1.8 of the cmakebuilder hudson plugin implements several feature requests. You can now choose to NOT clean your workspace before building and set different paths for the cmake installation for every job or node (hudson slaves). The latter option can be applied using an environment variable.

On this OSLD, we also tried the following tasks:

  • We keep an eye on Scala and its associated web framework Lift as a promising technology. One issue with Lift that bugs us is the use of “sun bastard format” properties for internationalization. We tried to teach Lift to accept UTF-8 encoded property files. After a lot of “downloading the internet” (you can always tell which project uses maven by their initial setup delay), we quickly implemented our own ResourceBundle.Control. But the Lift framework itself could not be built: “Error occurred during initialization of VM: Could not reserve enough space for object heap”. We ran out of time and will investigate in this issue on the next OSLD.
  • Grails is another web framework we use in projects. There are some bugs that really annoy us, and the OSLD is the perfect time to fix them. One of these bugs is GRAILS-6475, which we tried to reproduce with the latest code base. After writing a test case that would go green unexpectedly, we tried to provoke the error by setting up a sample project. The bug didn’t show up there, too. We left a comment in the issuetracker and ceased development.

What were our lessons learnt today?

  • You can’t tear off massive amounts of time from the OSLD and expect it to still be working. An OSLD doesn’t scale down apparently.
  • Most issues that can’t be done fail with the project’s build. The build process of a foreign project is the most crucial phase in your decision on commitment. If it fails, your participation in the project is at risk. We’ve seen many brittle, undocumented and incredible complex build processes now. And we can state one thing: It doesn’t stop with throwing maven at a project, you still have to “think the build”.

Retrospective of the OSLD

This OSLD was special in the amount of non-OSLD work done. The remaining efforts weren’t as successful as we wished. This has been an ongoing issue with our OSLD for the last months now and we are looking forward to adapt our workstyle to yield better results in the future. The distraction by typeracer was fun, though.