About PrintStream and Exceptions

Several of our projects deal with sensor hardware of various types, often connected via the good old™ serial port. That is fine most of the time because most protocols are simple and RXTX provides a nice cross-platform library for most of your serial port needs. But many new computers do not feature the old RS232 serial ports anymore, or other constraints prevent the use of a plain RS232 serial port. This is where serial converters like the Advantech ADAM 4570 (serial-to-ethernet) or USB-to-serial converters come into play. Usually this works fine.

Now one of our customers had a test system using an unreliable converter with sensor hardware. The hardware problems uncovered a robustness issue in our software: the JVM crashed when the virtual serial port of the converter disappeared and our app tried to write to it. Despite the faulty hardware our software had to be robust because it manages many more devices than just that one sensor over serial. Looking at the problem we discovered that the crash occurred somewhere in the native part of RXTX. So we decided to scratch our own itch (and the customer's) and set out to fix the issue in RXTX at an Open Source Love Day (OSLD). We fixed the problem and submitted the patch to the bugtracker of the RXTX project. Our sample program now worked flawlessly and threw an IOException when the serial port failed in some way.

Happy to have fixed the problem we incorporated the patched RXTX in our production software, but it still crashed and no IOException appeared anywhere in the logs. After another bughunting session we spotted the subtle difference between the sample and the production program: the use of OutputStream instead of PrintStream. PrintStream silently swallows all exceptions, which proved fatal in our use case with the unreliable stream carrier. So the final fix was essentially replacing our PrintStream code

RXTXPort port = new RXTXPort("COM6");
// autoflush enabled; note that PrintStream never throws on write failures
PrintStream p = new PrintStream(port.getOutputStream(), true, "ISO-8859-1");
p.print("command");

with using OutputStream directly:

RXTXPort port = new RXTXPort("COM6");
OutputStream o = port.getOutputStream();
// write() propagates the IOException when the port fails
o.write("command".getBytes("ISO-8859-1"));
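
If you cannot avoid PrintStream, the only way to notice a write failure is to poll its checkError() method, which returns true once an IOException has occurred internally. A minimal sketch along the lines of the code above:

PrintStream p = new PrintStream(port.getOutputStream(), true, "ISO-8859-1");
p.print("command");
if (p.checkError()) {
    // react to the swallowed IOException, e.g. close the port and reconnect
}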

Conclusion

Be careful when using PrintStream with unreliable stream carriers: it swallows exceptions! That may shadow problems you want to know about. Often PrintStream's behaviour will not be a problem, but in certain cases like the one depicted above it causes a lot of headaches.

Database Versioning with Liquibase

In my experience software developers and database people do not fit too well together. The database guys like to think of their database as a solid piece and dislike changing the schema. In an ideal world the schema is fixed for all time.

Software developers, on the other hand, tend to treat everything as subject to change. This is even more true for agile teams embracing refactoring. Liquibase is a tool that makes database refactorings feasible and revertible. For the cost of only one additional jar file you get a very flexible tool for migrating from one schema version to another.

Using Liquibase

  • You formulate the changes in XML, plain SQL or even as custom Java migration classes. If you are careful and sometimes provide additional information, your changes can be rolled back, so that switching between schema revisions becomes a breeze (see the sketch after this list).
  • To apply the changes you simply run the liquibase.jar as a standalone Java application. You can specify tags to update or roll back to, or the number of changesets to apply. This allows putting the database in an arbitrary state within changeset granularity.
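
As a rough sketch, here is what such a changeset could look like (table, column and author names are made up):

<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog">
  <changeSet id="1" author="jdoe">
    <createTable tableName="sensor">
      <column name="id" type="int"/>
      <column name="name" type="varchar(255)"/>
    </createTable>
    <!-- createTable can be rolled back automatically; other changes
         need an explicit rollback block like this one: -->
    <rollback>
      <dropTable tableName="sensor"/>
    </rollback>
  </changeSet>
</databaseChangeLog>

Applying or reverting changes is then a matter of a command line call, e.g. (connection parameters abbreviated):

java -jar liquibase.jar --changeLogFile=changelog.xml --url=... --username=... --password=... update
java -jar liquibase.jar --changeLogFile=changelog.xml --url=... --username=... --password=... rollbackCount 1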

Additional benefits

  • An important benefit of Liquibase is that you can easily put all your changesets under version control so that they are managed exactly the same as the rest of the application.
  • Liquibase stores the changelog directly in the database in a table called databasechangelog. This enables the developer and the application to check the schema revision of the database and thus find inconsistent states much more easily.

Conclusion
All of the above is especially useful when multiple installations or development/test databases with different versions of the software, and therefore of the database, have to be used at the same time. Tracking the changes to the database in the repository and having a small cross-platform tool to apply them is priceless in many situations.

Poor man’s TimeMachine

Some weeks ago I wrote about an easy and cheap backup solution for Windows users. But what about Mac and Linux users? The Mac guys have a similar solution right at hand: TimeMachine. It is quite easy to back up the most important stuff regularly onto an external drive while working. The configuration effort and hardware investment are minimal.

Now what if I happen to use Linux as an operating system? I looked for solutions similar to the Seagate Replica or TimeMachine, expecting less comfort. My first try was rsnapshot because a friend of mine recommended it. While it works nicely and has quite a few features, it requires manual editing of text configuration files. Nothing a casual user would like, and even I was not quite satisfied. A little more research on the web brought me to Back In Time.

Back In Time was exactly what I wanted: simple installation from the Ubuntu package repository, a GNOME GUI (a KDE version is available too) to configure and maintain everything, and unobtrusive background operation. You can even configure it to run with root privileges to back up files the logged-in user cannot access. So you can keep system configuration files etc. backed up, too.

One hint for Ubuntu users: You may need to install the “menu” package to be able to use the root version.

Conclusion

With these backup solutions available for all major operating systems one can achieve basic data security at virtually no cost. There is no compelling reason to risk many hours of work on a drive failure or an accidental delete without undelete possibilities (think rm -rf *). Of course one can improve that backup strategy further, but for me this is a baseline nobody should miss.

FindBugs-driven bughunting in legacy projects

I have been working on a >100k lines legacy project for a while now. We have to juggle customer requests, bug fixes and refactorings, so it is hard to improve the quality and employ new techniques or tools while keeping the software running and the clients happy. Initially there were no unit tests and most of the code had a gigantic cyclomatic complexity. Over the course of time we managed to put the system under continuous integration, wrote quite a few unit tests and analyzed code “hotspots” and our progress with crap4j.

Normally we get bug reports from our user base or have to test manually to find bugs, so many bugs may lurk in parts of the application which are seldom used or only misbehave in hard-to-reproduce circumstances. A few weeks ago I tried a new approach to bughunting in legacy projects using FindBugs. Many of you surely know this useful tool, so I just want to describe my experiences using it on that project. First, a short list of what I encountered and how I dealt with it.

Interesting bugs found in the project

  • There was a calculation using an integer division but returning a double (see the sketch after this list). So the actual computation result was wrong, yet the error would have been hard to catch because people rarely recalculate results of a computer. When writing the test associated with the bugfix I even found a StackOverflowError!
  • There were quite a few null dereferences found, often in constructs like

    if (s == null && s.length() == 0)

    instead of

    if (s == null || s.length() == 0)

    which could be simplified or rewritten anyway. Sometimes there were possible null dereferences on some paths despite several null checks in the code.

  • Many performance bugs which may or may not have an effect on the overall performance of the system, like new String(), new Integer(12), string concatenation in loops, inefficient usage of java.util.Map.keySet() where java.util.Map.entrySet() would do, etc.
  • Some dead stores of local variables and statements without effect which could be thrown away or corrected to do the intended thing.
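
As announced above, here is a minimal sketch of the integer division bug (method and variable names are made up; FindBugs flags this kind of code as an “int division result cast to double” problem):

double brokenAverage(int sum, int count) {
    return sum / count;          // integer division truncates before widening to double
}

double fixedAverage(int sum, int count) {
    return (double) sum / count; // cast first, then divide
}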

Things you may want to ignore

There are of course some bugs that you may ignore for now because you know that a certain pattern is common in the team and misuse, and thus errors, are extremely unlikely. I, for example, opted to ignore some dozens of “may expose internal representation” findings regarding arrays in interfaces or accessible via getters, because it is a common pattern in the team not to tamper with existing arrays as they are treated as immutable by the team members. It would have taken too much time to fix all those without much of a benefit.

You may opt to ignore the performance bugs too, but they are usually easy to fix.

Tips

  • If you have many findings, fix the easy ones first to be able to see the important ones more easily.
  • Ignore certain bug categories for now and fix them later, when you stumble upon them.
  • Concentrate on the ones that lead to wrong behaviour and crashes of your application.
  • Try to reproduce the problem with a unit test and then fix the code whenever feasible! Tests are great for exposing the bug and fixing it without unwanted regressions!
  • Many bugs appear in places which need refactoring anyway so here is your chance to catch several flies at once.

Conclusion

With FindBugs you can find common programming errors sprinkled across the whole application in places where you probably would not have looked for years. It can help you understand some common patterns of your team members and help you all improve your code quality. Sometimes it even finds hard-to-spot errors like the integer computation or null dereferences on certain paths. This is even more true in entangled legacy projects without proper test coverage.

SSD and (One)-touch Backup solution

As explained a while ago we (developers) get an annual creativity budget. This time I decided to improve my notebook working experience and reliability by introducing two new items:

  1. A fast SSD replacing the conventional, relatively slow 2.5″ hard disk
  2. A one-touch backup solution which in fact is a no-touch solution

The SSD is an X25-M from Intel with 160 GB and the backup solution is a Seagate Replica with 500 GB of disk space. Although there are recurring problems with the firmware and toolbox software, the Intel SSD seemed to be the best choice price/performance/reliability-wise. To be on the safe side data-wise we paired it with the backup solution. Let me first explain the migration, which went really smoothly and was the first stress test for the backup system. The steps were the following:

  1. Back up the existing system with the Replica, which does not require any user interaction after the client backup software is automatically installed
  2. Replace the original hard disk with the SSD
  3. Reboot the system with the recovery CD of the Replica solution and restore the backed up system
  4. Reboot the recovered system from the SSD

The whole process went really smoothly and only took some hours of data copying. There were no hiccups whatsoever. After booting from the SSD my system was exactly like before, so the Replica already proved that it really works, even in the worst case of a complete drive loss.

The performance of the whole system is noticeably better, especially at system and application startup, as you would expect.

Conclusion

The backup solution is so damn easy to use that I would recommend it to all people running Windows and caring about the data on their system. To keep your backup up to date just plug the external hard drive into a free USB port and continue working. You don't have to do any configuration or deal with the other hassles which often end any effort of deploying a working backup solution. This is even more true for private users who do not have the knowledge to fiddle with system details. So go for a “one touch backup” if you do not have a working solution in use already!

A modern SSD can really improve your working experience, especially on notebooks where hard disk performance is far worse than in a workstation environment. So older hardware can get new life and make your life easier and more productive.

About breaking class contracts – fear clone()

Recently I had some discussions with fellow developers about copying objects in Java. They were overriding clone(), which I never felt necessary. Shortly after, I stumbled over a Checkstyle warning in our own code regarding clone(), where overriding it is absolutely discouraged. Triggered by these two events I decided to dig a bit deeper into the issue.

The bottom line is that Object.clone() has a defined contract which is very easy to break. This has to do with its interaction with the Cloneable interface, which does not define a clone() method, and the nature of Object's clone implementation, which is native. Joshua Bloch names some problems and pitfalls of overriding clone in his excellent book Effective Java (Item 11):

  • “If you override the clone method in a nonfinal class, you should return an object obtained by invoking super.clone()”. A problem here is that this is never enforced.
  • “In practice, a class that implements Cloneable is expected to provide a properly functioning public clone method”. Again this is enforced nowhere.
  • “In effect, the clone method functions as another constructor; you must ensure that it does no harm to the original object and that it properly establishes invariants on the clone.” This means paying extreme attention to the issue of shallow and deep copies. Also be sure not to forget possible side effects your constructors may have, like registering the object as a listener.
  • “The clone architecture is incompatible with normal use of final fields referring to mutable objects”. You are sacrificing freedom in your class design because of a flaw in the clone() concept.

He also provides better alternatives like copy constructors or copy factories if you really need object copying. I urge you to use one of the alternatives, because breaking class contracts is evil and your classes may not work as expected. If you absolutely must implement a clone() method because you are subclassing an unchangeable cloneable class, be sure to follow the rules. As a side note, also be aware of the contract that hashCode() and equals() define.
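
As a minimal sketch of the copy constructor alternative (the Sensor class and its fields are made up for illustration):

import java.util.ArrayList;
import java.util.List;

public final class Sensor {
    private final String name;
    private final List<Double> readings;

    public Sensor(String name, List<Double> readings) {
        this.name = name;
        this.readings = new ArrayList<Double>(readings); // defensive copy
    }

    // The copy constructor: explicit, bound by no clone() contract
    public Sensor(Sensor original) {
        this(original.name, original.readings);
    }
}

Unlike clone(), this approach works fine with final fields and cannot break any inherited contract.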

Always be aware of the charset encoding hell

Most developers have already struggled with textual data from some third party system, getting garbage special characters and the like because of wrong character encodings. Some days ago we encountered an obscure problem: it was possible to log into one of our apps from the machine running the password database, but not from other machines using the same database. After diving into the problem we found out that the SHA-1 hashes generated by our app were slightly different. Looking at the code revealed that the platform encoding was used, and that led to different results.

The apps were running on Windows XP and Windows 2k3 Server respectively, and you would not expect that to make much of a difference, but in fact it did!
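
To see the effect for yourself, here is a small hypothetical demonstration (not taken from our app) that hashes a string with umlauts once via the platform default encoding and once with an explicit one:

import java.security.MessageDigest;

public class EncodingDemo {
    public static void main(String[] args) throws Exception {
        String password = "pässwörd"; // umlauts are where encodings diverge
        MessageDigest sha1 = MessageDigest.getInstance("SHA-1");

        byte[] platformBytes = password.getBytes();        // depends on OS/locale settings
        byte[] explicitBytes = password.getBytes("UTF-8"); // identical on every machine

        System.out.println(toHex(sha1.digest(platformBytes)));
        System.out.println(toHex(sha1.digest(explicitBytes)));
    }

    private static String toHex(byte[] bytes) {
        StringBuilder result = new StringBuilder();
        for (byte b : bytes) {
            result.append(String.format("%02x", b));
        }
        return result.toString();
    }
}

The first digest differs between a machine defaulting to cp1252 and one defaulting to UTF-8; the second one is stable everywhere.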

Lesson:

Always specify the encoding explicitly when exchanging character data with any other system. Here are some examples:

  • String.getBytes("UTF-8"), new PrintWriter(file, "ASCII") in Java
  • HTML-Forms with attribute accept-charset="ISO-8859-1"
  • In XML headers <?xml version="1.0" encoding="ISO-8859-15"?>
  • In your Database and/or JDBC driver
  • In your file format documentation
  • In LaTeX documents
  • Everywhere you can provide that info easily (e.g. as a comment in a config file)

Problems with character encodings seem to appear every once in a while, either as an end user, when your umlauts get garbled, or as a programmer who has to deal with third party input like web forms or text files.

The text file rant

After stumbling over an encoding problem *again* I thought a bit about the whole issue, and some of my thoughts manifested in this rant about text files. I do not want to blame our computer science predecessors for inventing and using restricted charsets like ASCII or ISO 8859. Nobody foresaw the rapid development of computers and their worldwide adoption and use in everyday life, and thus the need for an extensible charset (think of the addition of new symbols like the €), let alone performance and memory considerations. The problem I see with text files is that there is no standard way to describe the used encoding. Most text files just leave it to the user to guess what the encoding might be, whereas almost all binary file formats feature some kind of defined header with metadata about the content, e.g. bit depth and compression method in image files. For text files you usually have to use heuristic tools which work more or less well depending on the input.

A standardized header for text files right from the start would have helped to indicate the encoding, and possibly language or encoding version information, and many problems we have today would not exist. The encoding attribute in the XML header or the byte order mark in UTF-8 are workarounds for the fundamental problem of a missing text file header.

Branching support missing in JIRA

JIRA is a really great tool when it comes to issue tracking. It is powerful, extensible, usable and widespread. We and many of our clients have used it for years and are quite satisfied, coming from other tools like Bugzilla.

One thing we find missing, though, is the concept of branches. In JIRA you have projects and can define a roadmap consisting of versions that follow each other. Sooner or later, many software products require a maintenance line which only gets bug and security fixes while feature development happens on a separate branch.

Tracking issues for a software product with different branches is a bit more demanding because you have to identify the branches an issue affects and possibly fix it separately on each branch. One and the same issue could have different resolutions per branch, e.g. invalid if the broken functionality does not exist on that branch anymore. Nevertheless, all branches have to be checked, likely in different time frames.

Unfortunately JIRA does not support the notion of branches. You have to emulate the behaviour using different schemes like:

  • One issue per version that represents a branch
  • Multiple fix versions for an issue
  • Subtasks for an issue, or issue links, to check and fix the issue for different versions

To me all those are workarounds which lack polished usability and add overhead to your issue management. Real branching support could help you check on which branch an issue is done, where it still has to be resolved and so on, without adding more and more (perhaps loosely connected) issues to your system or forgetting to fix the issue on some branch (there is no question/warning when resolving an issue with multiple fix versions).

[Update:]

I created a new feature request for JIRA where you can vote or track progress on this issue.

Don’t trust micro versions

Normally you would think that upgrading a third party dependency where only its micro version (after the second dot, like the x in 2.3.x) changes should make your software work (even) better and not break it. Sadly, it can easily happen anyway. Some time ago we stumbled over a subtle change in the JNDI implementation of the Jetty web server and servlet container: in version 6.1.11 you specified (or at least could specify) JNDI resources in jetty-env.xml with names like jdbc/myDatabase. After the update to 6.1.12 the specified resource could not be found anymore. Digging through code changelogs and the like provided a solution that finally worked with 6.1.12: java:comp/env/jdbc/myDatabase. The bad thing is that the latter does not work with 6.1.11, so our configuration became micro-version-dependent on Jetty.

It seems that a new feature around JETTY-725 in the update from 6.1.11 to 6.1.12 broke our software.
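
For reference, a sketch of a jetty-env.xml along those lines (the datasource class and connection details are made up; the resource name is the point here):

<Configure class="org.mortbay.jetty.webapp.WebAppContext">
  <New class="org.mortbay.jetty.plus.naming.Resource">
    <!-- resolved fine by Jetty 6.1.11: -->
    <Arg>jdbc/myDatabase</Arg>
    <!-- from 6.1.12 on, the full JNDI path was needed instead:
    <Arg>java:comp/env/jdbc/myDatabase</Arg>
    -->
    <Arg>
      <New class="org.apache.commons.dbcp.BasicDataSource">
        <Set name="driverClassName">org.postgresql.Driver</Set>
        <Set name="url">jdbc:postgresql://localhost/mydb</Set>
      </New>
    </Arg>
  </New>
</Configure>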

Conclusion

Always make sure that your dependencies are fixed for your software releases, and test your software every time you upgrade a dependency. Do not trust some automatic dependency update system or the version numbers of a project. In the end they are just numbers; they should indicate the impact of the changes, but you can never be sure the changes do not break something for you.

How to find the HTML Entity you look for

As a web developer, have you ever wondered how a special character has to be encoded as an HTML entity? There is a nice little tool available online, EntityLook, that will answer your call for help. What makes the tool really rock is its simplicity and great incremental search. Typing in the letter ‘c’ will present you entities for “cent”, “copyright”, the Greek “sigma” and mathematical entities like “superset”, because the basic shape of the resulting special character is also considered. Upon entering a ‘b’ you will get the German ß as one of the results. This kind of search is almost a “do what I mean” feature and very helpful if you do not know exact substrings or the meaning of your special character.

There is a Firefox extension, and as a special goodie for our beloved Mac users there is even a dashboard widget available that works without an internet connection and is a bit more convenient to use than the web application.