A big benefit of Convention over Configuration

Convention over Configuration helps a lot getting a fast start on an existing project.

Recently I joined a long running Grails project and had to complete some issues in a short time. After a quick introduction I was ready to dive in and instantly could chunk out code. Convention over Configuration helped a lot getting a fast start:

  • Which classes are persisted? Look into the domain folder.
  • What controller handles xyz? It is named after xyz.
  • What is the page that is served for URL u? Look into the corresponding folder and find the view named accordingly.

IMHO this is a benefit that many conventions are determined by the framework. If you know the framework, you know the layout of the project. The opposite holds also true: if the project departs from these “standards” you have to look closely into the configuration.
So think twice before you do something your way. Sometimes you have no choice like when you are using Oracle and have to cut all table and column names to 30 characters. But normally you should keep the defaults.

Bogus Error Messages with Qt .ui Files

Name your Qt Forms correctly and you will save lots of debugging time.

Bogus errors together with their messages can have a large number of reasons – full hard drives being one of the classics. When it comes to programming and especially C++, the possibilities for cryptic, meaningless and misleading error message are infinite.

A nice one bit us at one of our customers the other day. The message was something like

QLayout can only have instances of QWidget as parent

and it appeared as standard error output during program start-up. Needless to say that the whole thing crashed with a segmentation fault after that. The only change that was made was a header file that was added to the Qt files list in the CMakeLists.txt file.  The Qt class in this header file was just in its beginnings and had not yet any QLayouts, or QWidgets in it. Even the  C++ standard measure of cleaning and recompiling everything didn’t help.

So how is it possible that an additional Qt header file that has not references to QLayout and QWidget can cause such an error message?

As all of you experienced C/C++ developers know, for the compiler, a code file is not only the stuff that it contains directly but also what is #included! The offending header file included a generated ui description file which you get when you design your windows – or Forms in Qt terminology – with the Qt designer and use the Compile-Time-Form-Processing-approach to incorporate the form into the code base.

But how can that effect anything?

The Qt designer saves the forms into .ui files. From that, the so-called User Interface Compiler (uic) generates a header file containing a C++ class together with inlined code that creates the form. Form components like line edits, or push buttons are generated as instance attributes. The name of the class is generated from the name of the form. You can even use namespaces.  By naming it e.g. myproject::BestFormEverDesigned the generated class is named BestFormEverDesigned   is put into namespace myproject.

So far, so nice, handy and easy to use.

When you create a new form in qt designer, the default name is Form. Maybe you can guess already where this leads to…

Two forms for which the respective developers forgot to set a proper name, existed in the same sub project and had been compiled and linked into the same shared library. The compiler has no chance to detect this, because it sees only one

class Form
{

at a time. The linker happily links all of this together since it thinks that all Forms are created equal. And then at run-time … Boom!

I will have to look into a little Jenkins helper which breaks the build when a Form form is checked in…

Your own dsl: a primer on operators

When writing your own domain specific language (dsl), a full fledged parser generator like antlr can be very helpful with the nitty gritty. You may come to a point where you want to use (infix) operators in your language. But beware! A naive solution might look like:

expr: number '+' expr | number '-' expr | number '*' expr | number '/' expr | …;

If you want to support mathematical like operators this solution misses two important traits: operator precedence and associativity.
Precendence can be easily achieved:

expr: term '+' expr |  term '-' expr | …;
term: number '*' term | number '/' term | …;

The operators with the lowest precedence come first, then the next and so on. Unfortunately this has one side effect: the operators are now right associative.
Which means an expression like 5 – 4 + 3 would evaluate to -2 and not 4. Because of right associativity it is the same as 5 – (4 + 3). So another refinement does the trick:

expr: term (('+'  |  '-' | …) term)*;
term: number (('*' | '/' | …) number)* | …;

How to accidentally kill your CI build time

At one of our customers I do C++ consulting in a mid-sized project which uses cmake as build system. A clean build on our Jenkins CI server takes about 40 minutes (including unit tests) which is way too long to be considered “fast feedback” in an agile kind of way.

Because of that, we do clean builds only 2 times a day – some time during the night and during lunch break. The rest of the day the CI server only does a “svn update” and a normal “make”, which takes about 3-10 minutes depending on what files have been changed.

With C++ there are lots of ways to unnecessarily lengthen your build time. The most important factor is, of course, #include dependencies. One has to be very (very) disciplined in adding #include directives in header files. Otherwise, the whole world suddenly gets rebuild when some small header file somewhere in a little corner of the code has been changed.

And I have to say, for the most part, this project is in pretty good shape with regard to #include dependencies.

So what the hell has suddenly increased our build time from 3-10 minutes to 20-25 minutes? was what I was thinking some time last week while waiting for the CI server to spit out new latest and greatest rpm packages. For some reason, our normal, rest-of-the-day build started to compile what felt like everything in our main package even on the slightest code change in a remote .cpp file.

What happened?

In order to have the build time available (e.g. to show in an “about” box), we use a preprocessor symbol like REVISION_DATE which gets filled in a CMakeLists.txt file. The whole thing looks like this:

...
EXEC_PROGRAM(date ARGS '+%F_%T' OUTPUT_VARIABLE REVISION_DATE)
...
ADD_DEFINITIONS(-DREVISION_DATE=\"${REVISION_DATE}\")
...

Since the beginning of the time these lines of CMake code lived in a small sub-sub-..-directory with little to no incomming dependencies. Then, at some point, it became necessary to have the REVISION_DATE symbol at some other place, too, which led to a move of the above code into the CMakeLists.txt file of the main package.

The value of command date +%F_%T changes every second which leads to a changed REVISION_DATE on every build – which is what we initially intended. What changes, too, of course, is the value of the ADD_DEFINITIONS directive. And as CMake is very strict with the slightest change in this value, every make target below that line gets rebuild – which in our case was everything in the main package.

So there! Build time killing creatures are lurking everywhere in our C/C++ projects. Always be aware of them!

Looping in C++

What is “the best” way to loop over collections in C++?

One recurring discussion point in one of our customers C++ project team is the following:

What is “the best” way to loop over collections?

In a typical scenario there is a standard container like std::list, or some equivalent collection, and the task is to do something with every element in the collection. The straight forward way would be like this:

std::list<std::string> mylist;
for (std::list<std::string>::iterator iter = mylist.begin(); iter != mylist.end(); iter++)
{
   ...
}

This code is correct and readable. But my guess is that most of you instantly see at least two possible improvements:

  1. the call to mylist.end() occurs in every loop an can be expensive e.g. in case of long std::lists
  2. iter++ creates one unnecessary intermediate object on the stack

So this

for (std::list<std::string>::iterator iter = mylist.begin(), end = mylist.end(); iter != end; ++iter)
{
   ...
}

would be much better but can already be seen as a little less readable.

Using BOOST_FOREACH can save you much of this still tedious code but has one nasty pitfall when it comes to std::maps.

In some places of the code base std::for_each is used together with a function, or function object.  The downside of this is that the function/function object code is not located where the loop occurs. However, this can be made “readable enough” when the function, or function object does only one thing and has a telling name.

Looping is sometimes done to create other collections of objects for each element. What to do there? Define the new collection use a for-loop of BOOST_FOREACH like above, or use std::transform with the same downside as std::for_each?

The other day one team member suggested to use boost::lambda expressions in loops. The initial usage examples where very promising but let me tell you – readability can drop dramatically very fast if you don’t be careful. It is very easy to get carried away with boost’s lambdas. I happened that we found ourselves having spent the last hour to carve out a super crisp lambda expression that takes anybody else another hour to read.

So the initial question remains undecided and will most likely stay like that. As for everything else in programming, there doesn’t seem to be a silver bullet for this task.

How do you go about looping in C++? Do you have some kind of coding style in place? Do you use std::for_each, BOOST_FOREACH, or some other means?

Looking forward to some feedback.

Clean code is not enough

Not only well-crafted software, but useable and delightful software.

I keep hearing stuff like:
“we as software developers are craftsmen and should honor our craft and write clean code”

Using the metaphor of a craftsman we should also realize that we are building software for people (to use) not for its own sake.
Imagine a chair which is perfectly crafted and beautiful to look at but you can’t sit on it?
It might be art but nobody can use it for its original purpose.

Most if not all of the software we write is for people to use, to be empowered and yes, to be delighted.
But to what use (besides art) is a software which is cleanly built but unusable?
We as software developers have shied away for too long from learning to craft useable interfaces.
I think we should not neglect that we develop software for others to use.
A program is not an island, it only excels when it interacts with users or other programs.

Not only well-crafted software, but useable and delightful software.

CMakeBuilder Version 1.9

Introducing CMakeBuilder plugin version 1.9.

Today, I want to announce version 1.9 of the CMakeBuilder plugin for Jenkins (formerly known as Hudson). Concluding from the user feedback, there are no major missing features – at least for the moment.

So for this version, I implemented only one visible enhancement: It is now possible to use environment variables in every configuration setting. Even settings like “Preload Script” “Make Command” or “Install Command” can now be configured with the support of environment variables.

The major invisible change I did was the migration to the Jenkins development infrastructure using this very helpful guide. Moving the whole thing to git will be next.

Check it out!

Grails: The good, the bad, the ugly

(Opinion!) After 3 years of Grails development it is time to take a step back and look how well we went.

After 3 years of Grails development it is time to take a step back and look how well we went.
(Info: we made several Grails apps ranging from small (<15 domain classes) to medium sized (50-70 domain classes) using front ends like Flex/Flash and AJAX)

The good parts

Always start with praise. So I tell you what in my opinion was and is good about developing in Groovy and Grails.

Groovy is Java with sugar

The Groovy syntax and the type system are so close to Java, so that when you come from a Java background you feel right at home.

Standard web stack

If you are accustomed to standard technologies like Spring and Hibernate you see Grails as a vacation.

Sensible defaults aka Convention over configuration

Many of the configuration options are filled with sensible defaults.

Fast start

You get from 0 to 100 in almost no time.

The bad things

Things which are not easily avoided.

Bugs, bugs, bugs

Grails has many, many bugs, unfortunately even in such fundamental things such as data binding and validation. A comment from a previous blog post: “To me, developing with Grails always felt like walking on eggs.”

Regression

Some bugs sneak back in again or are even reopened. Note that this is not the same as bugs, bugs, bugs because fixed bugs should be secured by a test.

Leaky abstractions

You have to know the underlying technologies especially Hibernate and Spring to get a foot on the ground. The GORM layer inherits all the complexity from Hibernate.

Slow integration tests

The ramp up time is 45 s on a decent build/development machine and then the first test hasn’t even started.

Uses the Java way of solving problems

Got a problem? There’s a framework for that!

Abandoned or prototype like plugins

Take a look at the list of plugins like Autobase, Flex.

Problems with incremental compiling

Don’t know where the real cause is buried: but using IntelliJ for developing Grails projects results in comments like:
Not working? Have you cleaned, invalidated your caches, rebuilt your project, deleted the .grails directory?

The ugly things

Things which are easily avoided or just a minor issue.

Groovys use of == and equals

Inherited from Java and made even worse: compare two numbers or a String and a GString

Groovys definition for the boolean truth

0, [], “”, null, false are all false

Groovys use of the NullObject and the plus operator

Puzzler: what is null + null ?

Uses unsupported/discontinued technologies

Hibernates SchemaExport comes to mind.

Mix of technology and intention

hasMany, hasOne, belongsTo have not only an intention revealing function but also determine how cascading works and the schema is generated.

Summary, opinionated

Grails has deficits and is bug ridden. But this will be better in the future (hopefully).
When you compare Grails with standard web stacks in the Java world you can gain a lot from it.
So if you want to know if you should use Grails in your next project ask yourself:

  • do you have or want to use Spring and Hibernate?
  • can you live without static typing? (remember: with freedom comes responsibility)
  • are you ready to work around or even fix an issue or bug?
  • is Java your home?

If you can answer all those questions with Yes, then Grails is for you. But beware: no silver bullets!

Podcasts

Podcasts are a very good means to shorten your commute, to keep you entertained during otherwise boring house-keeping activities, or, if you’re into sports, during your training sessions. Here is a list of some of my favourite shows.

This Developer’s Life

Rob Conery and Scott Hanselman interview developers and other IT professionals who share their stories. Very interesting, very well edited and flavoured with some nice pieces of music.

TechZing

Basically, TechZing are two guys, Jason Roberts and Justin Vincent, who discuss different topics concerning their lives as freelance web developers and startup bootstrappers. They enjoy themselves very much just talking to each other which is very entertaining already. The occasional interview and panel shows are then the icing on the cake.

It’s impossible to give a clear range of  topics since they consist of technical stuff like ‘how to store images in web applications’, SEO, NoSQL, JavaScript and iPhone development, but also non IT stuff like Pioneer One, geological challenges, and the Luck-Surface-Area. Edutainment at its best! Highly recommended!

Software Engineering Radio

This is purely an interview show which addresses all sorts of topics of interest for professional software developers: languages, platforms, technologies, methodologies, etc. Very informative, high profile guests and very competent hosts. Unfortunately, the output rate has gone down a lot in the last year.

Software ArchitekTOUR Podcast

This german (with little bits of swabian) speaking podcast is mostly concerned with topics around software architecture (as the name already suggests). DSLs, NoSQL databases and REST have been some of the latest topics.

FLOSS Weekly

Randal Schwartz (mostly) and other hosts are talking about Free Libre Open Source Software projects, ranging from whole OSes like CentOS to smaller niche projects like Ledger. Great show if you want to know what’s going on in the Open Source world.

Security Now

Steve Gibson and Leo Laporte talk about everything related to IT security. This will keep you informed about the latest browser vulnerabilities, Adobe Flash updates and Windows patches. But you will also learn e.g. how SSL works, the details of Stuxnet and everything about BitCoins. Don’t miss the all-time favourite episode 248: The Portable Dog Killer.

What are your favourite shows?

The Grails performance switch: flush.mode=commit

Some default configuration options of Grails are not optimal for all projects.

— Disclaimer —
This optimization requires more manual work and is error prone but isn’t this with most (big) performance improvements?
For it to really work you have to structure your code accordingly and flush explicitly.

Recently in our performance measurements of a medium sized Grails project we noticed a strange behavior: every time we executed the same query the time it took increased. It started with 40ms and every time it took 1 ms more. The query was simple like Child.findAllByParent(parent)
The first thought: indexes! We looked at the database (a postgresql db) and we had indexes on the parent column.
Next: maybe the session cache got too large. But session.flush() and session.clear() did not solve that problem.
Another post suggested using a HQL query. Changing to

Child.executeQuery("select new Child(c.name, c.parent) from Child c where parent=:parent", [parent: parent])

had no effect.
Finally after countless more attempts we tried:

session.setFlushMode(FlushMode.COMMIT)

And not even the query executed in constant time it was also 10x faster?!
Hmmm…why?
The default flush mode in Grails is set to AUTO
Which means that before every query made the session is flushed. Every query regardless of the classes effected. The problem is known for hibernate but after 4! years it is still unresolved.
So my question here is: why did Grails chose AUTO as default?