Our second Open Source Love Day (OSLD)

A retrospective report of our second Open Source Love Day (OSLD). We present the results of our work on hudson and git and the lessons learnt.


Today we celebrated our second Open Source Love Day (OSLD). When we say “celebrated”, we actually mean that all of us worked hard and concentrated for hours, just to have a short meeting with candy at the end of the day.

The Open Source Love Day is our way to show our appreciation to the Open Source software development ecosystem. We heavily rely on Open Source products for our customer projects, so it’s only fair to give something back. You can read more about our motivation and the specifics in our first OSLD blog posting.

For this day, we adjusted the rules a bit. While Object Calisthenics are very powerful in formulating rather academic software development values in some easy-to-remember rules, they just don’t fit well with existing projects. We still kept the rules in mind, but didn’t follow them strictly. We also learnt our share from last time’s experience of jumping right into the middle of arbitrary projects without a real need to do so. Today, we scratched more of our own itches.

You can participate in our OSLD by using the features we’ve built today:

  • Hudson gets a brand new plugin. Currently, it’s in alpha status and needs some more nurturing, but is planned to be published within the next few weeks. The proof of concept was successful today. You will read more about it on this blog soon.
  • Another of our hudson plugins, namely the cmake builder plugin, got some feature love, incorporating suggestions from plugin users. We especially thank Ole B. for his feedback. The new features are checked in and will be available with the next plugin version 0.6, scheduled to be published in a few days. You’ll read the details about the new features here.
  • We’ve produced a feature implementation for hudson, adding the ability to use environment variables for the job’s workspace path. This feature touches core hudson functionality, so we just proposed a patch and leave it up to the core hudson team to decide on its inclusion. For more details, head over to the hudson issue tracker, entry #3997.
  • And we didn’t forget about git. As we are multi-IDE users (today’s development took place using NetBeans, IDEA and Eclipse), the EGit Eclipse plugin for git will soon have the ability to diff the content of two revisions. An undocumented method argument cost us so much time that we could not finish the feature today. After email communication with the project owner, the feature works on our machine, but needs some polishing before it can be committed in the near future.

As you can see, the hudson continuous integration server received a great share of today’s love. It’s a great tool with a great community that really deserves our contributions.

What were our lessons learnt today?

  • While implementing the variable expansion feature, the author got distracted by a similar concept and followed this red herring. Namely, instead of a hudson.util.VariableResolver, we needed to use the hudson.EnvVars class. The EnvVars are pre-filled with all global variables like JAVA_HOME, while the VariableResolver is not. This could have been avoided by looking at the actual code instead of just type names. Once you think you’ve found your type, you read code the wrong way just to sustain your assumption.
  • Implementing advanced plugin features, whether for hudson or Eclipse, is a matter of skill with the “monkey see, monkey do” development style. Documentation is mostly non-existent or outdated.
  • When handling HTML and HTTP in Java, some survival tricks are crucial. Stay tuned for a whole blog post on that topic.
  • We still don’t feel comfortable within the JGit source code, as we still lack advanced git feature and terminology knowledge and the project lacks documentation. Our part of the problem will decline over time, as it’s a question of tool/mindset/slang exposure.

To sum it up, this OSLD worked out much better than our first one. We had more fun and yielded better results, mostly because we adjusted our goals to better suit our working style.

What are your experiences with open source development? Drop a comment!

Object Calisthenics On Existing Projects?

A few days ago we discussed Object Calisthenics, which were introduced by Jeff Bay in an article for the ThoughtWorks Anthology book. In case you have no idea what I’m talking about, here are the 9 rules again in short form (or you can study them in detail in the book):

1. One level of indentation per method
2. No else keyword
3. Wrap all primitives and strings
4. Use only one dot per line
5. Don’t abbreviate names but keep them short
6. Keep all entities small
7. No more than two instance variables per class
8. Use first-class collections
9. Don’t use any getters/setters or properties

Following the rules supposedly leads to more object-oriented code with a special emphasis on encapsulation. In his article, Jeff Bay suggests starting a new 1000-line project and following the rules strictly without thinking twice. But hey, more object-oriented code can’t be bad for existing projects, either, can it?

Not only at first glance, many of the rules seem pretty hard to follow. For example, check your projects for compatibility with rule 7. How many of your classes have more than two instance variables? That’s what I thought. And sure, some primitives and collections deserve to be wrapped into an extra class (rules 3 and 8), but do you really wrap all of them? Well, neither do we.

Other rules lead directly to more readable code. If you value good code quality like we do, rules 1, 2, 5 and 6 are more or less already in the back of your head during your daily programming work.

Especially rule 1 is what you automatically aim for when you want your crap load to remain low.

What really got my attention was rule 9: “Don’t use any getters/setters or properties”. This is the “most object-oriented” rule because it targets the heart of what an object should be: a combination of data and the behavior that uses the data.

But doing a little mental code browsing through our projects, it was easy to see that this rule is not easily retrofitted into an existing code base. The fact that our code is generally well covered with automated tests and considered awesome by a number of software metrics tools does not change that, either. Which is, of course, not surprising since committing to rule 9 is a downright big architectural decision.

So despite the fact that it is difficult to virtually impossible to use the rules in our existing projects right away, Object Calisthenics were certainly very valuable as motivation to constantly improve ourselves and our code. A good example is rule 2 (“No else”), which gets even more attention from now on. And there are definitely one or two primitives and collections that get their own class during the next refactoring.
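To make rules 9 and 2 a bit more tangible, here is a minimal sketch in Java. The Account class is purely hypothetical and not taken from any of our projects; it also ignores rule 3 (the amount is a plain long) to keep the example short. Instead of exposing a getter for the balance, the object is told what to do, and a guard clause with an early return takes the place of the else branch.

```java
// Hypothetical example class, invented for this illustration only.
final class Account {
    private long balanceInCents;

    Account(long initialBalanceInCents) {
        this.balanceInCents = initialBalanceInCents;
    }

    // Rule 9: there is no getBalance(). Callers tell the account what to do
    // and the decision stays next to the data it depends on.
    // Rule 2: the guard clause returns early, so no else is needed.
    boolean withdraw(long amountInCents) {
        if (amountInCents <= 0 || amountInCents > balanceInCents) {
            return false;
        }
        balanceInCents -= amountInCents;
        return true;
    }

    // A question the object answers itself instead of handing out its state.
    boolean covers(long amountInCents) {
        return amountInCents <= balanceInCents;
    }
}
```

Retrofitting this style means moving every piece of logic that currently calls a getter into the class that owns the data, which is why rule 9 is such a far-reaching decision for an existing code base.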

About breaking class contracts – fear clone()

Recently I had some discussions about copying of objects in Java with some fellow developers. They were overriding clone(), which I never felt necessary. Shortly after, I stumbled over a Checkstyle warning in our own code regarding clone(), where overriding it is absolutely discouraged. Triggered by these two events I decided to dig a bit deeper into the issue.

The bottom line is that Object.clone() has a defined contract which is very easy to break. This has to do with its interaction with the Cloneable interface, which does not define a clone() method, and with the nature of Object’s clone implementation, which is native. Joshua Bloch names some problems and pitfalls with overriding clone in his excellent book Effective Java (Item 11):

  • “If you override the clone method in a nonfinal class, you should return an object obtained by invoking super.clone()”. A problem here is that this is never enforced.
  • “In practice, a class that implements Cloneable is expected to provide a properly functioning public clone method”. Again this is enforced nowhere.
  • “In effect, the clone method functions as another constructor; you must ensure that it does no harm to the original object and that it properly establishes invariants on the clone.” This means paying extreme attention to the issue of shallow and deep copies. Also be sure not to forget possible side effects your constructors may have, like registering the object as a listener.
  • “The clone architecture is incompatible with normal use of final fields referring to mutable objects”. You are sacrificing freedom in your class design because of a flaw in the clone() concept.

He also provides better alternatives like copy constructors or copy factories if you really need object copying. I urge you to use one of the alternatives, because breaking class contracts is evil and your classes may not work as expected. The clone() contract in particular is easy to break. If you absolutely must implement a clone() method because you are subclassing an unchangeable cloneable class, be sure to follow the rules. As a side note, also be aware of the contract that hashCode() and equals() define.
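As an illustration of the two alternatives, here is a minimal sketch of a copy constructor and a copy factory. The Playlist class and its methods are made up for this example. Note that the list field can stay final, which is exactly what the clone() approach makes difficult.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class, invented for this illustration only.
final class Playlist {
    private final String name;
    private final List<String> tracks; // final field referring to a mutable object

    Playlist(String name) {
        this.name = name;
        this.tracks = new ArrayList<>();
    }

    // Copy constructor: the copy gets its own list, so later changes
    // to one playlist do not leak into the other.
    Playlist(Playlist original) {
        this.name = original.name;
        this.tracks = new ArrayList<>(original.tracks);
    }

    // Copy factory: a named static method that delegates to the copy constructor.
    static Playlist copyOf(Playlist original) {
        return new Playlist(original);
    }

    void add(String track) {
        tracks.add(track);
    }

    int size() {
        return tracks.size();
    }
}
```

Copying the list is sufficient here because the String elements are immutable. With mutable elements, each element would need to be copied as well to get a deep copy.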

Honey, I shrunk the build box

Meet the world’s smallest hudson server, operating with even less power than your energy saving light bulb.

We are currently posting an ongoing series on how to make your (hudson) build box faster. This article talks about making it smaller.

Making a build box as small as possible isn’t the most familiar requirement today and wasn’t for us. But when I privately bought a fit-PC2, we couldn’t resist trying it out as a hudson server.

The fit-PC2

This is a computer that really fits everywhere. In your car, on the back of your monitor or just, as in our case, on a beer mat. It’s a fully equipped PC with the specification of a standard netbook (Atom 1.6GHz CPU, 1GB RAM, 160GB HDD) and the dimensions of a 5-port Ethernet switch. The most astonishing fact about it is that it uses standard size 2.5″ notebook harddisks. For more information about the computer, look around the CompuLab website, they do not exaggerate.

Operating the fit-PC2

The fit-PC2 is a normal computer in every aspect. We run ours with Ubuntu Linux and official Java packages from Sun. As the case is fanless, it accumulates some heat, but never over 60° Celsius (140° Fahrenheit). We measured an average temperature of 45°C on the case surface while building a large project. The Gnome desktop feels snappy, application load delays are sufficiently small and customizing the software outfit is as easy as it can get with Ubuntu.

Setting up hudson

Installing the hudson continuous integration server on a Debian based Linux system is a matter of three commands. See Kohsuke Kawaguchi’s blog entry on that topic for details. After the automatic installation procedure, hudson already runs on port 8080 of the machine. Setting up the project’s job and initiating the first build was a matter of a few minutes. Hudson reacts swiftly to website clicks.

The world’s smallest hudson server

This is the smallest hudson instance we’ve heard of up to now. It runs in a case measuring 11.5 x 10.0 x 2.6 centimeters. The power consumption is around 8 Watt when building (including the self-usage of the measuring device itself), which would be even lower once we replace the mechanical harddisk with a solid state disk (SSD).

From the performance specifications given above, you should not expect a speed wonder. The fit-PC2 finished the project’s build within 09:50 minutes, which is dangerously near the ten-minutes mark for acceptable continuous feedback. So this box will not go into regular duty, but return home to me (remember, I bought it privately).

Conclusion

The whole purpose of this experiment was to get used to a new era of microcomputers. They are palm-sized and nearly battery operated, but fully equipped with standard components and powerful enough to perform regular tasks. The fit-PC2 is a strong instance of these devices.

Show off your hudson server

Well, to be honest, another purpose of this experiment was to show off our hudson skills, operating hudson instances ranging from heterogeneous slave farms to this single 300 cubic centimetres box. We would like to hear about your hudson instance. You may add a comment and/or share a link with your story. Maybe the hudson wiki is the ultimate place to gather all the stories.


P.S. This blog entry’s title is an adaptation of my childhood’s favorite movie.

Our first Open Source Love Day (OSLD)

Last week we had our first OSLD and started with the usual hopes and fears:

  • how much time will be consumed setting the project up?
  • will we find issues that are small enough to be finished in one day?
  • the hope to learn something new: a new technology, language or tool
  • the hope to improve our skills in reading a different code style and a new codebase

As an additional challenge we tried to do object calisthenics (which we will cover in a different blog post). In short, they are a list of coding guidelines which try to bend your mind to become more flexible when coding/designing software.

Start

After an initial meeting we decided to work on EGit/JGit. This project was on familiar ground (Java) and offered enough new tools (Git) and technologies (plugin development under Eclipse). Setup was very easy and fast (thanx to the EGit/JGit team!) and we started to look at the low hanging fruits linked from the EGit wiki, which are basically issues categorized as easy. This is a very good way for project newbies to start (in theory…). But the reality was that most of them were already patched and others did not have enough information to make sense to us (which could also be due to our lack of knowledge of the Git wording and its internal concepts and functions). At the end of the day we had worked on issues which were definitely too big for us (requiring changes in the infrastructure of JGit) and reported some issues as non-reproducible.

Finish

So we learned a few things from the first day:

  • project setup can be very fast and easy
  • low hanging fruits are a very good idea
  • avoid infrastructure changes
  • a basic familiarity with the concepts involved is key to get along
  • don’t do too much on one day, instead: focus!
  • scratching your own itch would benefit the understanding of the issue and your motivation

So next month we will give OSLD another chance and hope to learn even more.

Speed up your buildbox, Part I: Introduction & Harddisk

This is the first part of a series on how to boost your build box without much effort. This episode talks about the effects of different harddisks.

We actively use Hudson as our continuous integration server software. It has a nice little feature called “build history trend” that shows the duration of all archived builds. One of our major projects started out small and fast with a build duration of 01:20 minutes. One and a half years later, it reached for the 04:00 minute hurdle. It wasn’t a surprise to us, as the build has more than four times the work now and the hardware stayed the same.

But a question emerged: How can we speed up our build?

Applying optimization: The basic maths

We did a quick review of our ant build scripts to ensure there’s nothing fundamentally wrong with them and then decided which road to follow first: Optimizing the build scripts or boosting the hardware? There is only one pragmatic answer for us: boost the hardware as long as it stays reasonable in price. Every optimization in the build script would need its time (which isn’t cheap) and possibly increase the script complexity (which is very expensive later on).

Optimizing the hardware

So we went on the journey to make a fast buildbox even faster. We started out with a dual core processor (2.6 GHz), a decent-but-standard harddisk and 4 GB of memory. We replaced every part on its own to see the effect. The journey includes:

Our goal is to cut our build time down by 50 percent, to a little less than 02:00 minutes. We don’t want to spend more than 500 EUR for new hardware. So now, after this introduction:

Part I: Replacing the harddisk

Our buildbox starts with a more or less normal harddisk (0.5 TB), certified for continuous usage. We could have bought just another normal harddisk of a newer generation, but that doesn’t cut it in our experience (we didn’t verify specifically, though).

Calling the carnivores

If you need to upgrade your harddisk, you can buy yourself a VelociRaptor drive and be pretty much assured that you’ll notice the difference. We had pleasant experiences with this kind of fast-spinning drive before, but this time, we wanted to go a step further and try a fast Solid State Disk (SSD). As you only need to relocate the working directories (called workspaces in hudson terminology) of your projects to the new disk, the capacity isn’t important as long as it’s greater than your project sizes. You can just plug your new disk into the buildbox and format it with a high performance file system. As our buildbox runs on Linux, relocating the workspace is just a matter of setting a symbolic link. You do not even need to tell hudson about it. If you happen to run on Windows, check out the “use custom workspace” setting on your job’s configuration page.

An investment of about 200 EUR and 15 minutes of installation later, we had the result: The build average before was 03:30 minutes and now 03:10 minutes. That’s not a big leap forward, as others found out, too. It’s not that the SSD was bad, it performed exceptionally well in the benchmarks, but the harddisk wasn’t the bottleneck. To further prove our assumption, we installed the fastest harddrive you can get: the RAM disk.

Only pretend to use the disk

Linux (like other unixoid systems) has the great feature of an emulated harddisk right in your memory. On Debian/Ubuntu systems, this emulated drive is mounted at /dev/shm and has a capacity of half your total physical memory. It grows dynamically, so you don’t have to worry about its initial size. But you have to check if your workspace fits into it. Our buildbox had 4 GB of RAM and 2 GB were enough to contain the hudson workspace. We configured hudson to build there (you can use symbolic links or the “custom workspace” setting as shown in the picture) and got the result: The build average went down to 02:50 minutes.
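The check whether a workspace fits into the RAM disk can be sketched in a few lines of Java. This is only an illustration under our own assumptions: the class WorkspaceFit and its method names are invented, and the sketch simply compares the summed file sizes of a directory with the usable space the target file system reports.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

// Hypothetical helper, invented for this illustration only.
final class WorkspaceFit {

    // Sums the sizes of all regular files below the given directory.
    static long directorySize(Path directory) throws IOException {
        try (Stream<Path> paths = Files.walk(directory)) {
            return paths.filter(Files::isRegularFile)
                        .mapToLong(WorkspaceFit::sizeOf)
                        .sum();
        }
    }

    private static long sizeOf(Path file) {
        try {
            return Files.size(file);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // True if the workspace content is no larger than the space currently
    // usable on the file system that holds the target path.
    static boolean fitsInto(Path workspace, Path target) throws IOException {
        long usableBytes = Files.getFileStore(target).getUsableSpace();
        return directorySize(workspace) <= usableBytes;
    }
}
```

A call like WorkspaceFit.fitsInto(workspacePath, Paths.get("/dev/shm")) would answer the question for the RAM disk, where workspacePath stands for wherever your hudson job keeps its workspace. Keep in mind that a build usually creates additional files, so some headroom is advisable.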

custom_workspace

Review on the results

That’s as far as we could speed up our buildbox by just replacing the harddisk. Down from 03:30 minutes to 02:50 minutes, a reduction of 40 seconds or roughly 19 percent. In fact, we even cheated as the buildbox doesn’t use a harddisk anymore for building. With Linux, it’s incredibly easy to utilize a RAM disk as long as you have enough RAM to lend. For Windows systems, there are several software products that can do the same. If you don’t want to lend your RAM, you can look into HyperDrives, but for a price!

So we conclude that the fastest harddisk is an emulated one and even then, its effect on the build time is limited.

Stay tuned for the next episode of our journey to a faster buildbox, when we apply a faster CPU.

Smell if it’s well

Ever wondered what a code smell would taste like? We chose it to be like vanilla. Our latest extreme feedback device scents our office air depending on code quality.

We at the Softwareschneiderei are constantly searching for ways to gather feedback from our projects. We get feedback from our customers and their users, but we also get feedback directly from the code, be it through test results or code analysis. A great way to make your code speak for itself is to provide it some Extreme Feedback Devices (XFD).

Introducing the Smell-O-Mat

One thing we always wanted to have was “code smells” that really smell for themselves. When we ran across an ultrasonic humidifier that can produce room-wide smells by dispensing essential oils, we found the right device for this feedback. We bought two humidifiers and labeled them “good” and “evil”. The hardest part was to find a smell everybody relates to “evil”, but won’t distract you too much from your work. Whenever our code analysis finds a new real code smell, the “evil” humidifier is turned on for some minutes. If an existing code smell is fixed, we get the “good” smell.

The effects

We do not produce code smells all too often. But once in a while, it happens. And this incident can now be perceived throughout the day just by breathing. On the other hand, fixing old smells is a source of refreshing air. Whenever the office atmosphere needs replenishment, all you have to do is to fix some code smells in our large code base (they do get rare!). Of course, most junior developers just open a window for that.

We chose grapefruit as our “good” smell, so our work area tastes mostly lemony now instead of just “developer’s thoughts”, a fragrance that has yet to be bottled.

The technical solution

Technically, the integration of the two humidifiers with our reporting infrastructure was very easy. Every XFD is controlled by an IRC bot that understands certain commands suitable for the device and hangs around at our central IRC server. As a humidifier only understands “on” and “off”, it could be controlled just like the ONOZ! lamp. We connected the humidifiers to a remote controlled power supply, switched them on and let the bot control the supply.

Our reporting infrastructure forwards its results to an aggregation software that interprets the numbers and produces IRC commands for the device bots. All of this is done with a combination of website scraping (Hudson as our continuous integration server has a wonderful XML API) and IRC messaging.

The history of XFD so far

Over the last years, we gathered XFDs for almost every human sense. We have visual effects, audible feedback using speech synthesis and even bought a USB rocket launcher for forced feedback needs. With the Smell-O-Mat, we can now deal with smelling, too.
The last human sense we have to address is tasting. Plans for the “coffee salter” were impeded by our sense of humanity. We keep searching.



A Small XML Builder in Ruby

From a C++ point of view, i.e. the statically typed world with no “dynamic” features that deserve the name, I guess you would all agree that languages like Groovy or Ruby are truly something completely different. Having strong C++ roots myself, my first Grails project gave me lots of eye openers on some nice “dynamic” possibilities. One of the pretty cool things I encountered there was the MarkupBuilder. With it you can just write XML as if it were normal Groovy code. Simple and just downright awesome.

The other day in yet another C++ project I was again faced with the task to generate some XML from a text file. And, sure enough, my thoughts wandered to the good days in the Grails project where I could just instantiate the MarkupBuilder… But wait! I remembered that a colleague had already done some scripting stuff with Ruby, so the language was already kind of introduced into the project. And despite the fact that it was a new language for him, he did some heavy lifting with it in just no time (that sure does not come as a big surprise to all you Ruby folks out there).

So if Ruby is such a cool language there must be something like a markup builder in it, right? Yes there is, well, sort of. Unfortunately, it’s not part of the language package and you first have to install a thing called gems to even install the XML builder package. Being in a project with tight guidelines when it comes to external dependencies, and counting in the fact that we had no patience to first learn what Ruby gems even are, my colleague and I decided to hack our own small XML builder (and of course, just for the fun of it). I mean hey, it’s Ruby, everything is supposed to be easy in Ruby.

Damn right it is! Here is what we came up with in what was maybe an hour or so:

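class XmlGen
   def initialize
      @xmlString = ""
      @indentStack = Array.new
   end

   # Every unknown method call becomes an XML tag named after the method.
   # An optional hash supplies the attributes, a block supplies child tags.
   # Note: attribute values are written as-is, without XML escaping.
   def method_missing(tagId, attr = {})
      # Each attribute carries its own leading space, so a tag without
      # attributes is written as <Tag> and not as <Tag >.
      argList = attr.map { |key, value|
         " #{key}=\"#{value}\""
      }.reverse.join

      @xmlString << @indentStack.join('')
      @xmlString << "<" << tagId.to_s << argList
      if block_given?
         @xmlString << ">\n"
         @indentStack.push "\t"
         yield
         @indentStack.pop
         @xmlString << @indentStack.join('') << "</" << tagId.to_s << ">\n"
      else
         @xmlString << "/>\n"
      end
      self
   end

   def to_s
      @xmlString
   end
end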

And here is how you can use it:

xml = XmlGen.new
xml.FirstXmlTag {
   xml.SubTagOne( {'attribute1' => 'value1'} ) {
      someCollection.each { |item|
         xml.CollectionTag( {'itemId' => item.id} )
      }
   }
}

It’s not perfect, it’s not optimized in any way and it may not even be the Ruby way. But hey, it served our needs perfectly, it was a pretty cool Ruby experience, and it sure is not the last piece of Ruby code in this project.

Always be aware of the charset encoding hell

Most developers have already struggled with textual data from some third party system and got garbage special characters and the like because of wrong character encodings. Some days ago we encountered an obscure problem: it was possible to log in to one of our apps from the computer running the password database, but not from other machines using the same db. After diving into the problem we found out that the SHA-1 hashes generated by our app were slightly different. Looking at the code revealed that the platform encoding was used and that led to different results.

The apps were running on Windows XP and Windows 2k3 Server respectively, and you would expect that it would not make much of a difference, but in fact it did!

Lesson:

Always specify the encoding explicitly, when exchanging character data with any other system. Here are some examples:

  • String.getBytes("utf-8"), new PrintWriter(file, "ascii") in Java
  • HTML-Forms with attribute accept-charset="ISO-8859-1"
  • In XML headers <?xml version="1.0" encoding="ISO-8859-15"?>
  • In your Database and/or JDBC driver
  • In your file format documentation
  • In LaTeX documents
  • everywhere where you can provide that info easily (e.g. as a comment in a config file)
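To show why the encoding matters for something like our login problem, here is a minimal Java sketch. It is not the application’s actual code: the class name PasswordHashing and the sample text are made up, and UTF-8 and ISO-8859-1 only serve as stand-ins, since the encodings the two machines really defaulted to are not stated here. The point is that the hash is computed over bytes, and the bytes depend on the charset.

```java
import java.nio.charset.Charset;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, invented for this illustration only.
final class PasswordHashing {

    // Hashes the text with SHA-1 using an explicitly given charset
    // and returns the digest as a lowercase hex string.
    static String sha1Hex(String text, Charset charset) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(text.getBytes(charset));
            StringBuilder hex = new StringBuilder();
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 has to be available on every Java platform implementation.
            throw new IllegalStateException(e);
        }
    }
}
```

Calling sha1Hex("Grüße", StandardCharsets.UTF_8) and sha1Hex("Grüße", StandardCharsets.ISO_8859_1) yields two different hex strings, because “ü” and “ß” are encoded as different byte sequences. A plain text.getBytes() without an argument picks whatever the platform default is, which is how two machines can disagree about the same password.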

Problems with character encodings seem to appear every once in a while, either as an end user when your umlauts get garbled, or as a programmer who has to deal with third party input like web forms or text files.

The text file rant

After stumbling over an encoding problem *again* I thought a bit about the whole issue and some of my thoughts manifested in this rant about text files. I do not want to blame our computer science predecessors for inventing and using restricted charsets like ASCII or iso8859. Nobody had foreseen the rapid development of computers and their worldwide adoption and use in everyday life, and thus the need for an extensible charset (think of the addition of new symbols like the €), let alone performance and memory considerations. The problem I see with text files is that there is no standard way to describe the used encoding. Most text files just leave it to the user to guess what the encoding might be, whereas almost all binary file formats feature some kind of defined header with metadata about the content, e.g. bit depth and compression method in image files. For text files you usually have to use heuristic tools which work more or less well depending on the input.

A standardized header for text files right from the start would have helped to indicate the encoding and possibly language or encoding version information of the text and many problems we have today would not exist. The encoding attribute in the XML header or the byte order mark in UTF-8 are workarounds for the fundamental problem of a missing text file header.

Give open source some love back!

Like many others, our work is enabled by open source software. We make heavy use of the many great open source projects out there. Since they help us do our business and do it in a productive way, we want to give some love, aehmm, work back. So we decided to dedicate one day per month to open source contributions. These can be bug fixes, new features, even documentation or bug reports. I believe that every contribution helps an open source project and many projects need help.
The whole development team will work on projects they like. One day per month does not sound like much, but I think even starting small helps. And maybe you can suggest a similar day in your company, too?
Besides the obvious boost in developer motivation (and therefore productivity) there are several things your company will benefit from:

  • help in your own projects: fixing bugs in the open source projects you use is like fixing bugs in your own project
  • image for your company: being active in open source gives a better image regarding potential future employees and also shows responsibility in the field they work in
  • PR for your company and an edge over your competition: writing about your contributions and your insights in your company blog, remember: out teach your competition

So get your company to spend just one day per month or so for open source. It may not be much but every little bit helps!