Basic Image Processing Tasks with OpenCV

2D detectors and scientific CCD cameras produce many megabytes of image data. The open source library OpenCV is highly recommended as your workhorse for all kinds of image processing tasks.

For one of our customers in the scientific domain we integrate a lot of hardware into their existing measurement and control network. A good part of these devices are 2D detectors and scientific CCD cameras, which come with all sorts of interfaces like Ethernet, FireWire and frame grabber cards. Our task is then to write the glue software that makes the camera available and controllable for the scientists.

One standard requirement for us is to do some basic image processing and analytics. Typically, this entails flipping the image horizontally and/or vertically, rotating the image by some multiple of 90 degrees, and calculating statistics like the standard deviation.

The starting point is always some image data in memory that has been acquired from the camera. Most of the time the image data consists of either gray values (8 or 16 bit) or RGB(A).

As we generally try not to fall victim to the NIH syndrome, we use open source image processing libraries. The first one we tried was CImg, which is a header-only (!) C++ library for image processing. The header-only part is very cool and handy, since you just have to #include <CImg.h> and you are done. No further dependencies. The immediate downside, of course, is long compile times. We are talking about more than 40,000 lines of C++ template code!

The bigger issue we had with CImg was that for multi-channel images the memory layout is like this: R1R2R3R4…..G1G2G3G4….B1B2B3B4. And since the images from the camera usually come interleaved like R1G1B1R2G2B2… we always had to do tricks to use CImg correctly on these images. These tricks eventually killed us in terms of performance, since some of these 2D detectors produce many megabytes of image data that have to be processed in real time.
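
To illustrate the kind of trick involved (a minimal sketch, not our actual glue code): the interleaved camera buffer has to be copied into CImg’s planar layout first, which costs an extra pass over every frame.

#include <cstdint>
#include <cstddef>
#include <vector>

// De-interleave an R1G1B1R2G2B2... buffer into the planar
// R...G...B... layout that CImg expects.
std::vector<uint8_t> toPlanarRGB(const uint8_t* interleaved,
                                 std::size_t pixelCount)
{
  std::vector<uint8_t> planar(pixelCount * 3);
  for (std::size_t i = 0; i < pixelCount; ++i) {
    planar[i]                  = interleaved[3 * i];     // R plane
    planar[pixelCount + i]     = interleaved[3 * i + 1]; // G plane
    planar[2 * pixelCount + i] = interleaved[3 * i + 2]; // B plane
  }
  return planar;
}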

So, OpenCV. Its headline alone was very promising:

OpenCV (Open Source Computer Vision) is a library of programming functions for real time computer vision.

Especially the words “real time” look good in there. But let’s see.

Image data in OpenCV is represented by instances of class cv::Mat, which is, of course, short for Matrix. From the documentation:

The class Mat represents an n-dimensional dense numerical single-channel or multi-channel array. It can be used to store real or complex-valued vectors and matrices, grayscale or color images, voxel volumes, vector fields, point clouds, tensors, histograms.

Our standard requirements stated above can then be implemented like this (gray scale, 8 bit image):

#include <opencv2/opencv.hpp> // cv::Mat, cv::flip, cv::warpAffine, cv::imwrite
#include <cstdint>

void processGrayScale8bitImage(uint16_t width, uint16_t height,
                               double rotationAngle,
                               uint8_t* pixelData)
{
  // create cv::Mat instance
  // pixel data is not copied!
  cv::Mat img(height, width, CV_8UC1, pixelData);

  // flip vertically
  // third parameter of cv::flip is the so-called flip-code
  // flip-code == 0 means vertical flipping
  cv::Mat verticallyFlippedImg(height, width, CV_8UC1);
  cv::flip(img, verticallyFlippedImg, 0);

  // flip horizontally
  // flip-code > 0 means horizontal flipping
  cv::Mat horizontallyFlippedImg(height, width, CV_8UC1);
  cv::flip(img, horizontallyFlippedImg, 1);

  // rotation (a bit trickier)
  // 1. calculate center point
  cv::Point2f center(img.cols/2.0F, img.rows/2.0F);
  // 2. create rotation matrix
  cv::Mat rotationMatrix =
    cv::getRotationMatrix2D(center, rotationAngle, 1.0);
  // 3. create cv::Mat that will hold the rotated image.
  // For some rotationAngles width and height are switched
  cv::Mat rotatedImg;
  if (static_cast<int>(rotationAngle / 90.0) % 2 != 0) {
    // switch width and height for rotations like 90, 270 degrees
    rotatedImg =
      cv::Mat(cv::Size(img.size().height, img.size().width),
              img.type());
  } else {
    rotatedImg =
      cv::Mat(cv::Size(img.size().width, img.size().height),
              img.type());
  }
  // 4. actual rotation
  cv::warpAffine(img, rotatedImg,
                 rotationMatrix, rotatedImg.size());

  // save into TIFF file
  cv::imwrite("myimage.tiff", gray);
}

The cool thing is that almost the same code can be used for our other image types, too. The only difference is the image type for the cv::Mat constructor:


8-bit gray scale: CV_8UC1
16-bit gray scale: CV_16UC1
RGB: CV_8UC3
RGBA: CV_8UC4
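
For the RGB case, for example, a minimal sketch could look like this (the function name is made up for illustration; as before, the cv::Mat only wraps the existing interleaved buffer and no pixel data is copied):

void processRGBImage(uint16_t width, uint16_t height, uint8_t* pixelData)
{
  // three interleaved 8-bit channels per pixel (R1G1B1R2G2B2...)
  cv::Mat img(height, width, CV_8UC3, pixelData);

  // the operations shown above work unchanged on multi-channel images
  cv::Mat flipped;
  cv::flip(img, flipped, 0);
}

One thing to keep in mind (it does not matter for flipping or statistics): OpenCV’s color-aware functions conventionally assume BGR channel order, so a cv::cvtColor call may be needed before applying them to RGB data.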

Additionally, the whole thing is blazingly fast! All our performance problems are gone. Yay!

Getting basic statistical values is also a breeze:

void calculateStatistics(const cv::Mat& img)
{
  // minimum, maximum, sum
  double min = 0.0;
  double max = 0.0;
  cv::minMaxLoc(img, &min, &max);
  double sum = cv::sum(img)[0];

  // mean and standard deviation
  cv::Scalar cvMean;
  cv::Scalar cvStddev;
  cv::meanStdDev(img, cvMean, cvStddev);
}
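
For multi-channel images the same calls work per channel: cv::sum and cv::meanStdDev return a cv::Scalar holding up to four values, one per channel, which is why the single-channel code above picks element [0]. A small sketch (the function name is ours, for illustration only):

void calculateRGBStatistics(const cv::Mat& rgbImg) // expects a CV_8UC3 image
{
  cv::Scalar mean;
  cv::Scalar stddev;
  cv::meanStdDev(rgbImg, mean, stddev);
  // mean[0], mean[1], mean[2] now hold the per-channel mean values,
  // stddev[0..2] the per-channel standard deviations
}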

All in all, our OpenCV experience has been very positive so far. They even support CMake. Highly recommended!

CMakeBuilder Version 1.9

Introducing CMakeBuilder plugin version 1.9.

Today, I want to announce version 1.9 of the CMakeBuilder plugin for Jenkins (formerly known as Hudson). Judging from user feedback, there are no major features missing – at least for the moment.

So for this version, I implemented only one visible enhancement: it is now possible to use environment variables in every configuration setting. Even settings like “Preload Script”, “Make Command” or “Install Command” can now be configured with the help of environment variables.

The major invisible change was the migration to the Jenkins development infrastructure, using this very helpful guide. Moving the whole thing to git will be next.

Check it out!

SSL with POCO

A short introduction to using SSL support in POCO C++ libraries.

Admittedly, the topic of this post is very specific, but I hope it will still be of some value to some people. The task for today is to set up an SSL server and client with POCO framework classes. I will leave out the whole issue of certificate management and just assume that the right files are at hand.

The SSL related part of the POCO libraries essentially wraps the OpenSSL library in a nice object-oriented interface. If you know OpenSSL, you can instantly relate to classes like Poco::Net::Context or the …Handler classes (just replace “handler” with “callback”).

“SSL” stands for Secure Sockets Layer, so the first class to discover is Poco::Net::SecureServerSocket. As you would expect, this class is derived from Poco::Net::ServerSocket, extending it only with SSL related functionality. And sure enough, some constructors of Poco::Net::SecureServerSocket take a Context pointer as argument.

But why only some constructors? Since there is no setContext method, there must be some other mechanism in place by which SecureServerSockets get their SSL context.

Introducing Poco::Net::SSLManager. From the API docs:

SSLManager is a singleton for holding the default server/client Context and handling callbacks for certificate verification errors and private key passphrases.

Proper initialization of SSLManager is critical.

Aha! So all the constructors of SecureServerSocket that do not take a Context pointer simply get it from the SSLManager singleton.

But how to initialize SSLManager?

1. The POCO Way:

If you developed your application with POCO from the ground up, there probably already exists a subclass of Poco::Util::Application, and all configuration is handled by the built-in configuration classes.

With this in place, all you have to do is add the proper SSL configuration elements:

openSSL.server.privateKeyFile = /path/to/key/file
openSSL.server.certificateFile = /path/to/certificate/file
openSSL.server.verificationMode = none
openSSL.server.verificationDepth = 9
openSSL.server.loadDefaultCAFile = false
openSSL.server.cypherList = ALL:!ADH:!LOW:!EXP:!MD5:@STRENGTH
openSSL.server.privateKeyPassphraseHandler.name = KeyFileHandler
openSSL.server.privateKeyPassphraseHandler.options.password = securePassword
openSSL.server.invalidCertificateHandler = AcceptCertificateHandler

2. Manually:

Depending on which side you are on – client or server – you have to call SSLManager::initializeClient or SSLManager::initializeServer. Both methods take three arguments:

  1. PrivateKeyPassphraseHandler pointer
  2. InvalidCertificateHandler pointer
  3. Context pointer

This is where it becomes a little bit tricky: if you try to instantiate a Context with a private key file in order to provide it as an argument to the initialize… method, a PrivateKeyPassphraseHandler might be needed. This handler is fetched from the SSLManager singleton – which you are just about to initialize!

This circular dependency between Context and SSLManager can be overcome by first calling SSLManager::initializeServer with only a PrivateKeyPassphraseHandler, an InvalidCertificateHandler and a null Context pointer. Then instantiate the Context and call SSLManager::initializeServer again.
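
A minimal sketch of the manual route (assuming a reasonably recent POCO release; the file paths, the port and the choice of handlers are placeholders, not recommendations):

#include "Poco/SharedPtr.h"
#include "Poco/Net/SSLManager.h"
#include "Poco/Net/KeyConsoleHandler.h"
#include "Poco/Net/AcceptCertificateHandler.h"
#include "Poco/Net/Context.h"
#include "Poco/Net/SecureServerSocket.h"

using namespace Poco::Net;

void setupServerSSL()
{
  // handlers for private key passphrases and invalid certificates (server side)
  Poco::SharedPtr<PrivateKeyPassphraseHandler> keyHandler =
    new KeyConsoleHandler(true);
  Poco::SharedPtr<InvalidCertificateHandler> certHandler =
    new AcceptCertificateHandler(true);

  // the Context bundles key file, certificate file and verification settings.
  // If your key is passphrase-protected, use the two-step workaround described
  // above: initialize the SSLManager with a null Context first, then create
  // the Context and initialize again.
  Context::Ptr context = new Context(
    Context::SERVER_USE,
    "/path/to/key/file",
    "/path/to/certificate/file",
    "",                   // CA location (none here)
    Context::VERIFY_NONE,
    9,                    // verification depth
    false);               // do not load the default CA file

  SSLManager::instance().initializeServer(keyHandler, certHandler, context);

  // from now on, the Secure... classes pick up the default context automatically
  SecureServerSocket serverSocket(9443);
}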

Now that the SSLManager is initialized, we can use the Secure… prefixed classes just as we would use their non-SSL counterparts. As with SecureServerSocket, the other Secure… classes are derived from their corresponding non-secure base classes.

Conclusion: once you get past the initialization of the SSLManager singleton, using the SSL POCO classes is very easy and straightforward. Check it out!

Shrink your dependency list with POCO

POCO is a nice set of C++ libraries which provides elegant solutions for day-to-day tasks.

When you write C++ applications of any sort, you are very likely to need support libraries in addition to what comes with C++ (which is not much, btw). Of course, this holds true for any other language as well, but with Java and its rich JDK, for example, the need is not as pressing.

Starting at the very beginning, let’s see how fast the need for support arises.

int main(int argc, char** argv)
{
  // parsing command line arguments
  ...

How do you parse those command line arguments in a simple and easy way? How about a little help output when the program is called with -h or --help? OK, we have boost::program_options for this.

Going further in your program, you may want some sort of logging capability. Unfortunately, as of boost version 1.45 there is nothing to be found there, so you add a nice logging library.

And so on.

But wait! You don’t want to depend on too many 3rd party libraries because, among other things, they add deployment complexity.

Not even Qt, one of the major players in the C++ framework world, provides solutions for the two previous examples. As of version 4.7, there is no logging and not much support for command line arguments. And you end up having to use QString, one of the many non-std::string string classes in C++ frameworks, which can get annoying at times (of course there are reasons why those exist).

I could go on with the list of smaller or larger concerns for which you either roll your own implementation or include yet another library in your project.

Instead, I would like to point you to POCO, a nice set of C++ libraries that provides easy solutions for many basic and/or advanced day-to-day tasks. From their website:

Modern, powerful open source C++ class libraries and frameworks for building network- and internet-based applications that run on desktop, server and embedded systems

Besides very basic stuff like logging, date/time handling, threads, memory management and UTF-8, they also provide lots of higher level classes for things like SMTP, POP3, SQL database access and HTTP. They even have a so-called C++ Server Page Compiler, which is basically something like JSP or Active Server Pages.

And they have no string class of their own! Yay! Instead they provide lots of functions, classes and streams for string manipulation on good old std::string.
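
A small, illustrative sketch (just a few of the free functions from Poco/String.h and the StringTokenizer; the input string is obviously made up):

#include <cstddef>
#include <iostream>
#include <string>
#include "Poco/String.h"
#include "Poco/StringTokenizer.h"

int main()
{
  std::string input("  error, warning, info  ");

  std::string trimmed = Poco::trim(input);    // "error, warning, info"
  std::string upper   = Poco::toUpper(trimmed);

  // split on commas, trimming whitespace and skipping empty tokens
  Poco::StringTokenizer tokens(trimmed, ",",
      Poco::StringTokenizer::TOK_TRIM | Poco::StringTokenizer::TOK_IGNORE_EMPTY);
  for (std::size_t i = 0; i < tokens.count(); ++i)
    std::cout << tokens[i] << std::endl;

  std::cout << upper << std::endl;
  return 0;
}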

One thing I like most about POCO, though, is its clean, well-documented and apparently very high quality code. Although it is not overly functional or template-heavy, as you often see in boost, it still provides elegant solutions.

Check it out and shrink your dependency list.

Combine cobertura with the awesomeness of crap4j

Want the awesomeness of crap4j without running your tests twice in your build? Just combine it with your cobertura data using crapertura.

You may have heard of crap4j when it was still actively developed. Crap4j is a software metric that points you to “crappy” methods in your project by combining cyclomatic complexity numbers with test coverage data. The rationale is that overly complex code can only be tamed by rigorous testing, or it will quickly degenerate into an unmaintainable mess – the feared “rotten code” or “crappy code”, as Alberto Savoia and Bob Evans, the creators of crap4j, would put it. The crap4j metric soon became our most important number for every project. It’s highly significant, yet easy to grasp, and mandates a healthy coding style.

Some enhancements to crap4j

Crap4j got even better when we developed our own custom enhancements to it, like the CrapMap or the crap4j hudson plugin. We have a tool that formats the crap4j data like cobertura’s report, too.

A minor imperfection

The only thing that always bugged me when using crap4j inside our continuous integration build cycle was that at least half of the required data had already been gathered: Cobertura calculates the code coverage of our tests right before crap4j does the same again. Wouldn’t it be great if the result of the first analysis could be re-used for the crap metric to save effort and time?

Different types of coverage

Soon I learnt that crap4j uses “path coverage” and combines it with the cyclomatic complexity of a method (the published formula is CRAP(m) = comp(m)² × (1 − cov(m))³ + comp(m)). This is perfectly reasonable, given that the complexity determines the number of different paths through the method. Cobertura only determines “line coverage” and “branch coverage”. Strictly speaking, you can’t use the cobertura data for crap4j because they represent different approaches to measuring coverage. That’s still true and probably will be for a long time. But the allure of the shortcut was too strong for me to resist, so I just tried it out one day to see the real difference.

A different metric

So here it is, our new metric, heavily inspired by crap4j. I just took the line and branch coverage for every method and multiplied them. If you happen to have perfect coverage (1.0 on both numbers), it stays perfect. If you only have 75% coverage on both numbers, it will result in a “crapertura coverage” of 56.25%. Then I fed this new coverage data into crap4j and compared the result with the original data. Well, it works on my project.

Presenting crapertura

Encouraged by this result, I wrote a complete ant task that acts similarly to the original crap4j ant task. You can nearly use it as a drop-in replacement, given that the cobertura XML report file is already present. Here is an example ant call:


<crapertura
    coberturaReportFile="/path/to/cobertura/coverage.xml"
    targetDirectory="/where/to/place/the/crap4j/report"
    classesDirectory="/your/unarchived/project/class/files"
/>

It will output the usual crap4j report files to the given target directory. Please note that even if it looks like crap4j data, it’s a different metric and should be treated as such. Therefore, online comparison of numbers is disabled.

The whole project is published on github. Feel free to browse the code and compile it for yourself. If you want a binary release, you might grab the latest jar from our download server.

The complete usage guide can be found on the github page or inside the project. If you have questions or issues, please use the comment section here.

Conclusion

Whether crapertura gives you nearly the same numbers as crap4j did is really up to your project. Our test project contained over 20k methods, but very little crap. The difference between crap4j and crapertura was negligible; both metrics basically identified the same methods as being crappy. Your mileage may vary, though. If that’s the case, let us know. If your experience is like ours, you’ve just saved some time in your build cycle without sacrificing quality.

Improved Version of CMake Builder for Hudson

Introducing version 1.5 of cmake builder plugin for Hudson.

Today I just want to give a small round-up of the improvements made to the cmake builder plugin since my last blog post. Back then, version 1.2 was released to support master/slave configurations. As of yesterday, we are at version 1.5, which contains the following improvements and bug fixes:

  • Bug: The drop-down box for selecting the build type didn’t remember its value. This was fixed with a patch by Atte Timonen.
  • Improvement: Also included in Atte’s patch was the propagation of environment variables to the cmake command, which now allows parameterized builds. A big thanks to Atte!
  • Improvement: The install command is only executed when both the install directory and the install command are given. Before, the build was either broken or $WORKSPACE was automatically used as the install directory. Thanks to Dat Chu for his feedback.
  • Improvement: The one-line ‘Other CMake Arguments’ field can fill up pretty quickly, so it was changed to a multi-line text area.

Thanks again for the feedback, and have fun with the new version!

CMake Builder Plugin in Master/Slave Setups

Making the CMake Builder plugin for Hudson behave in master/slave settings.

The first versions of the cmake builder plugin were developed more or less only driven by our own needs. As people began to use it, an issue came up that we hadn’t considered yet: distributed builds, a.k.a. master/slave mode. So on our first OSLD in 2010 I looked into the plugin and began to rectify the situation.

My test setup consisted of a Hudson master on a Windows XP box which was connected via SSH to a slave node in an Ubuntu virtual machine. The first errors were easy to find: the plugin tried to find all configured paths on the Windows host and not on the Ubuntu slave.

Experience from our previous Crap4J plugin development and a quick read here put me on the right track: it’s not a good idea to use plain java.io.File if you want your plugin to be master/slave capable – use hudson.FilePath instead.

So after replacing all java.io.File occurrences with hudson.FilePath the situation was much better. The plugin handled all paths correctly but still produced errors when calling cmake. I quickly discovered that java.lang.Process and java.lang.ProcessBuilder were used to call “cmake -version”. Again, not a good idea – hudson.Launcher is your friend here.

After replacing Process with Launcher, only one strange error was left: the following launcher call, using a nice fluent interface, wouldn’t execute on the remote machine but insisted on executing locally.

launcher.launch().cmds(cmakeCall).envs(environmentVars)
   .stdout(listener).pwd(workDir).join();

When I changed it to the seemingly equivalent statement

launcher.launch(cmakeCall, environmentVars,
    listener.getLogger(), workDir).join();

it worked like a charm.

After all those changes, I proudly present the newest version of the CMake Builder Plugin, which is now ready to be used in distributed environments.

Only one little unpleasantness still exists, though: when configuring the make and install commands, the plugin tries to find the executables on the PATH of the host machine. For now, you can just ignore the error message. I’ll try to look into it soon. Apart from that, have fun with the new version.

Database Versioning with Liquibase

In my experience, software developers and database people do not fit together too well. The database guys like to think of their database as a solid piece and dislike changing the schema. In an ideal world, the schema would be fixed for all time.

Software developers, on the other hand, tend to think of everything as subject to change. This is even more true for agile teams embracing refactoring. Liquibase is a tool that makes database refactorings feasible and revertible. For the cost of only one additional jar file you get a very flexible tool for migrating from one schema version to another.

Using Liquibase

  • You formulate the changes in XML, plain SQL or even as custom Java migration classes. If you are careful and sometimes provide additional information, your changes can be made rollbackable, so that switching between schema revisions becomes a breeze (see the sketch after this list).
  • To apply the changes you simply run liquibase.jar as a standalone Java application. You can specify tags to update or roll back to, or the number of changesets to apply. This allows putting the database into an arbitrary state within changeset granularity.
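
A minimal, hypothetical changeset might look like this (table, column and author names are invented for illustration, and the exact schema attributes depend on your Liquibase version):

<databaseChangeLog
    xmlns="http://www.liquibase.org/xml/ns/dbchangelog"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">

  <changeSet id="1" author="alice">
    <createTable tableName="person">
      <column name="id" type="int">
        <constraints primaryKey="true" nullable="false"/>
      </column>
      <column name="name" type="varchar(255)"/>
    </createTable>
    <!-- an explicit rollback block makes the changeset revertible -->
    <rollback>
      <dropTable tableName="person"/>
    </rollback>
  </changeSet>

</databaseChangeLog>

It is then applied with something along these lines (connection parameters are placeholders):

java -jar liquibase.jar --driver=org.postgresql.Driver --url=jdbc:postgresql://localhost/mydb --username=me --password=secret --changeLogFile=changelog.xml update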

Additional benefits

  • An important benefit of Liquibase is that you can easily put all your changesets under version control, so that they are managed exactly like the rest of the application.
  • Liquibase stores the changelog directly in the database, in a table called databasechangelog. This enables the developer and the application to check the schema revision of the database and thus find inconsistent states much more easily.

Conclusion

All of the above is especially useful when multiple installations or development/test databases with different versions of the software, and therefore of the database, have to be used at the same time. Tracking the changes to the database in the repository and having a small cross-platform tool to apply them is priceless in many situations.

Open Source Love Day December 2009

Our Open Source Love Day for December 2009 brought love for EGit and several hudson plugins. We got slightly frustrated with KDevelop4.

On Tuesday, we had our last regular working day of 2009. We celebrated this circumstance by having our fourth Open Source Love Day (OSLD). The day was successful; you can review the list of today’s achievements below.

We introduced a monthly Open Source Love Day to show our appreciation for the Open Source software ecosystem and to donate something back. We heavily rely on Open Source software for our projects. We would be honored if you find our contributions useful. Check out our first OSLD blog posting for details on the event itself.

Participate in our OSLD by using the features we’ve built today:

  • Our campfire plugin for hudson was updated to version 1.1. The new version contains the improvements Mark Woods suggested (global configuration and login recovery). Thank you, Mark!
  • The campfire plugin also switched the communication model from webpage scraping to the brand new campfire API. This should improve the stability of the plugin.
  • Some of the EGit (git plugin for eclipse) patches we sent in at the last OSLD needed some rework and polishing. You can review the details in EGit’s code review system gerrit: change 121 and change 122.
  • Our cmake hudson plugin was updated to version 1.1. The new version checks the environment (installed cmake version, etc.) before delegating the call and provides better error messages.
  • We started working on a feature of KDevelop4 that was present in KDevelop3 and is now missing: “Compile file”. The progress was slowed down by some problems. See below for details.
  • Hudson got a new major version of the IRC plugin from Christoph Kutzinski. The plugin was in a rather desolate state before, and we had used a private fork with specific additions to control our infrastructure. The plugin was on our list of OSLD patients when Christoph merged it with the hudson instant-messaging plugin and introduced a multitude of cool new features. We beta-tested the new version and it was great. The only drawback was the completely altered message syntax, which broke our infrastructure. So in order to scratch our own itch, we programmed a little API to parse hudson IRC plugin messages of the new 2.X version stream. Our code is published on github; have a look if you are interested and drop us a line if you find it useful.

What were our lessons learnt today?

  • If maven decides to work properly, everything is really cool.
  • Just because you use JGit/EGit on top of Eclipse, all three being platform independent, doesn’t mean you are safe from slash vs. backslash issues. EGit’s initial user experience is better on unixoid platforms than on Windows systems. Patch #141 helped us get past the showstopper of unrecognized local repositories.
  • We acquired an additional share of eclipse plugin development knowledge when polishing our EGit features.
  • Working with git and gerrit is challenging on first encounter. We are constantly learning in this area.
  • Bugzilla fails to present open issues in a manner that lets you quickly pick an issue of interest. If you really want to use it for your open source project, consider providing a scraped website that only lists the “low hanging fruits” for newbie developers.
  • KDevelop4 has outdated documentation; the projects kdevplatform and kdevelop were moved inside the repository.
  • If you encounter a rather erratic error stating that “KDE4Workspace not found”, try excluding the debuggers/gdb subproject from your build.
  • Most of us used the waiting delays of one project (“oh, maven is downloading the internet again”) to switch over to a secondary task. So this event trains our multitasking abilities right along.

In summary, this OSLD was a fun way to end a work year of heavy duty. We will continue to celebrate OSLDs in 2010, as they are a fun way to peek into foreign projects, learn a lot in a short time and contribute to the community.

Open Source Love Day November 2009

Our Open Source Love Day for November 2009 brought love for EGit, the Eclipse git plugin, and some frustration over lacking documentation.

Today, we celebrated our third Open Source Love Day (OSLD). It was slightly hampered by a company-wide illness outbreak, but we tried to make the best of the situation.

Open Source Love Days are our way to show our appreciation and care for the Open Source software ecosystem. Our work wouldn’t be as much fun, or even possible at all, without professional Open Source software. You can read more about our motivation and the specifics in our first OSLD blog posting.

You can participate in our OSLD by using the features we’ve built today:

  • We think git is an enrichment to the SCM market, as Eclipse is to the IDE market. Improving the quality of Eclipse’s git plugin is the next logical step. The ability to diff the content of two revisions in EGit was committed today. As a bonus, the name of the committer now shows up correctly, too. See the screenshots to get the idea.

When developing EGit, we were already using it to pull the sources. Unfortunately, the repository URL had changed big time since our checkout, without us noticing. This got us into trouble when trying to follow the contributor guide. The command line version of git isn’t that communicative yet. But after all, this is a great way to learn about the real-world problems of using git. The EGit contributor guide itself is a fantastic way for a project to show initial appreciation for volunteer efforts. Thanks for caring, guys! If you are interested in reviewing our changes yourself, fetch the patch.

  • Another part of today’s work was on the KDevelop project. We tried to fix some outstanding little features or bugs, whatever is on the list for KDevelop 4, but we spent our day fixing our development machines instead. The Ubuntu Linux operating system (8.10) was way too old to get useful results, and KDE needs to be up to date to develop KDevelop. Aside from our own sluggishness in keeping our virtual machines on the bleeding edge, the checkout experience of KDevelop was rather smooth. What bothers us a bit is the ominous entanglement between KDevelop and KDE: it seems you can’t have one without the other and need to master both to make a stand.
  • As a third part, we wanted to contribute to the TANGO project (not the useful icon collection, but the useful control system). They migrated their main repository from CVS to SVN lately, but the migration still seems unfinished. At least, the migration effort lacks public documentation for the occasional contributor. That’s a real showstopper, because you never get beyond the very first step: setting up a working project. We won’t give up and will email the project leads about this, but it didn’t fit into this OSLD.

What were our lessons learnt today?

  • Just having a possibility to view or download the source code doesn’t make an Open Source project. The key to success is the ability of complete strangers to hop in and perform useful work. Having terse but accurate documentation helps a lot. The EGit contributor guide is a good example of a single document that makes the difference. If you own an open source project and want to attract occasional contributors (like us), write such a document and watch us (and others) drop you a patch. That said, we have come to believe that the person who writes technical documentation for the developers fills one of the most important roles on a project. Perhaps we will join some projects in the future to fill that role.

To sum it up, this OSLD was limited from the beginning by developer availability. With the lacking documentation, we nearly ground to a halt. We look forward to our next OSLD in December.