Lazy initialization/evaluation can help with performance and memory consumption

One way to improve performance when working with many objects or large data structures is lazy initialization or evaluation. The concept is to defer expensive operations to the latest possible moment, ideally never performing them at all. I want to show some examples of how to use lazy techniques in Java and give you pointers to other languages where laziness is even easier to use and part of the core language.

One use case was a large JTable showing hundreds of domain objects consisting of metadata and measured values. Initially our domain objects held both types of data in memory even though only part of the metadata was displayed in the table. Building the table took several seconds and we were limited to showing a few hundred entries at once. After some analysis we changed our implementation to look roughly like this:

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DomainObject {
  private final DataParser parser;
  private final Map<String, String> header = new HashMap<>();
  private final List<Data> data = new ArrayList<>();

  public DomainObject(DataParser aParser) {
    parser = aParser;
  }

  public String getHeaderField(String name) {
    // Here we lazily parse and fill the header map
    if (header.isEmpty()) {
      header.putAll(parser.header());
    }
    return header.get(name);
  }

  public Iterable<Data> getMeasurementValues() {
    // again lazy-load and parse the data
    if (data.isEmpty()) {
      data.addAll(parser.measurements());
    }
    return data;
  }
}

This significantly improved both the time to display the entries and the number of entries we could handle. All the data loading and parsing was only done when someone wanted the details of a measurement and double-clicked an entry.

A situation where you get lazy evaluation in Java out of the box is the short-circuit evaluation of boolean expressions in conditional statements:

// lazy and fast: the expensive operation only executes when the cheap condition holds
if (aCondition() && expensiveOperation()) { ... }

// slow order: the expensive operation always executes first (evaluation is still short-circuited, but that does not help here)
if (expensiveOperation() && aCondition()) { ... }

Persistence frameworks like Hibernate often default to lazy loading because database access and data transmission are quite costly in general.
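To illustrate, here is a minimal sketch of such a mapping with JPA annotations. The Measurement and DataPoint entities are made up for this example; with @OneToMany, lazy fetching is even the default, the fetch attribute is only spelled out to make the point visible:

// Measurement.java
import java.util.List;
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.GeneratedValue;
import javax.persistence.Id;
import javax.persistence.OneToMany;

@Entity
public class Measurement {
  @Id
  @GeneratedValue
  private Long id;

  // The data points are not loaded together with the Measurement;
  // the framework fetches them on the first access to the collection.
  @OneToMany(fetch = FetchType.LAZY)
  private List<DataPoint> dataPoints;

  public List<DataPoint> getDataPoints() {
    return dataPoints;
  }
}

// DataPoint.java (a second entity, sketched only for completeness)
@Entity
public class DataPoint {
  @Id
  @GeneratedValue
  private Long id;

  private double value;
}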

Most functional languages are built around lazy evaluation, and their concept of functions as first-class citizens and isolated/minimized side effects supports laziness very well. Scala, as a hybrid OO/functional language, introduces the lazy keyword to simplify typical Java-style lazy initialization code like the above to something like this:

class DomainObject(parser: DataParser) {
  // evaluated on first access
  private lazy val header = { parser.header() }

  def getHeaderField(name : String) : String = {
    header.get(name).getOrElse("")
  }

  // evaluated on first access
  lazy val measurementValues : Iterable[Data] = {
    parser.measurements()
  }
}

Conclusion
Lazy evaluation is nothing new or revolutionary, but it is a very useful tool when dealing with large datasets or slow resources. There are many situations where you can use it to improve performance or user experience.
The downsides are a bit of implementation cost where language support is poor (as in Java), and some cases where precomputed/prefetched values would make the application feel more responsive when the user wants to see the details.
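If the Java boilerplate shown above starts to repeat itself, it can be factored into a small reusable wrapper. A minimal sketch using Java 8's Supplier interface (the Lazy class and its names are ours, not a standard API; synchronized is the simplest way to make it safe for concurrent access):

import java.util.function.Supplier;

public class Lazy<T> implements Supplier<T> {
  private final Supplier<T> expensiveComputation;
  private T value;
  private boolean evaluated = false;

  public Lazy(Supplier<T> expensiveComputation) {
    this.expensiveComputation = expensiveComputation;
  }

  @Override
  public synchronized T get() {
    // compute and cache the value on the first access only
    if (!evaluated) {
      value = expensiveComputation.get();
      evaluated = true;
    }
    return value;
  }
}

The DomainObject from the beginning could then declare its header as new Lazy<>(() -> parser.header()) and simply call get() inside getHeaderField().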

Aspects done right: Concerns

The idea of encapsulating cross-cutting concerns struck a chord with me from the beginning, but the implementation, namely aspects, lacked clarity in my opinion. With aspects you cannot see (without sophisticated IDE support) which class has which aspects and which aspects are woven into the class when looking at its source. Here concerns (also called mixins or traits) come to the rescue. I know that aspects were invented to hide away the details of which code is included where, but I find it confusing and hard to trace without tool support.

Take a look at an example in Ruby:

require 'active_support/concern'

module Versionable
  extend ActiveSupport::Concern

  included do
    attr_accessor :version
  end
end

class Document
  include Versionable
end

Now Document has a field version, and is_a?(Versionable) returns true. For clients it looks like the field version is defined in Document itself, so for them it is the same as:

class Document
  attr_accessor :version
end

Furthermore, you can easily reuse the Versionable concern in other classes. This sounds like a great implementation of the separation of concerns principle, so why isn't everyone using it (besides it being a standard part of the upcoming Rails 4)? Well, some people are concerned with concerns (excuse the pun). As with every powerful feature, you can shoot yourself in the foot with it. Let's take a look at each problem.

  • Diamond problem aka multiple inheritance: Ruby has no real multiple inheritance. Even when you include more than one module, the modules act like superclasses in the method resolution order. Every include inserts a new “superclass” directly above the including class, so the last include takes precedence (see the snippet after this list).

  • Dependencies between concerns: Concerns can depend on each other, i.e. one concern needs another concern to work. ActiveSupport::Concern handles these dependencies automatically.

  • Unforeseeable results: One last big problem with concerns is side effects from combining two of them. Take, for example, two concerns which each add a method with the same name. Including both of them renders one concern unusable. This cannot be solved technically, but I think it points to an underlying, more important cause: poor naming, or two concerns that are not separated cleanly enough. As always, tests can help to isolate and spot the problem. Concerns should be tested both in isolation and in integration.
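A short snippet (made up for this post, with hypothetical module names) demonstrates both the method resolution order and the name-collision problem:

module Printable
  def render
    'printable'
  end
end

module Displayable
  def render
    'displayable'
  end
end

class Report
  include Printable
  include Displayable # inserted closest to Report, so it wins the method lookup
end

Report.ancestors    # => [Report, Displayable, Printable, Object, ...]
Report.new.render   # => "displayable" (Printable#render is shadowed)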

Summary of the Schneide Dev Brunch at 2013-01-06

If you couldn’t attend the Schneide Dev Brunch in January 2013, here are the main topics we discussed neatly summarized.

Yesterday, we held another Schneide Dev Brunch. The Dev Brunch is a regular brunch on a Sunday, only that all attendees want to talk about software development and various other topics. If you bring a software-related topic along with your food, everyone has something to share. The brunch had fewer participants this time, but didn't lack topics. Let's have a look at the main topics we discussed:

Sharing code between projects

The first topic emerged from our initial general chatter: what is a reasonable and practicable approach to sharing code between software entities (different projects, product editions, versions, etc.)? We discussed at least three different solutions that we know from practice:

  • Main branch with customer forks: This was the easiest approach to explain. A product has a main branch where all the new features are committed to. Every time a customer wants his version, a new branch is created from the most current version on the main branch. The customer may require some changes and a lot of bug fixes, but all of that is done on the customer's branch. Sometimes a critical bug fix is merged back into the main branch, but no change from the main branch is ever transferred to the customer's branch. Basically, the customer version of the code is “frozen” in terms of features and updates. This works well in its context because the main branch already contains the software every customer wants, and no customer wants to update to a version with more features – that would become yet another branch.
  • Big blob of conditionals: This approach needs a bit more explanation. Once there was a software product ready to be sold. Every customer had some change requests and special requirements. All these changes and special cases were added to the original code base, using customer IDs and a whole lot of if-else statements to separate each customer's changes. All customers always get the same code, but their unique customer ID only passes the guard clauses that are required for them. All the changes of all the other customers are deactivated at runtime. With this approach, the union of all features is always represented in the source code.
  • Project-as-a-universe: This approach defines projects as little universes without intersection. Every project stands on its own and only shares code with other projects by means of copy and paste. When a new project is started, a subset of classes from another project is chosen as a starting point and transformed to fit the requirements. There is no “master universe” or main branch for the shared classes. The same class may evolve differently (and conflictingly) in different projects. This approach probably isn't suited for a software product, but is applied to individual projects with different requirements.

We are aware of (and discussed) even more approaches, but not with the profound knowledge that comes from several years of first-hand experience. The term OSGi was often used as a reference in the discussion. We were able to exhibit the motivation, advantages and shortcomings of each approach. It was very interesting to see that even slightly different prerequisites may lead to fundamentally different solutions.

Book (p)review: Practical API Design

In the book "Practical API Design", Jaroslav Tulach, the founder of the NetBeans Platform and initial "architect" of its API, talks about his lessons learnt from evolving a substantial API for over ten years. The book begins with a theory on the values of and motivations for good API design. We get a primer on why APIs are needed and essential for modern software development, and we learn what the essential characteristics of a good API are. The most important message here is that a good API isn't necessarily "beautiful". This caused a bit of discussion among us, so the topic strayed somewhat from the review itself. Well, that's what the Dev Brunch is for – we aren't a lecture session. One interesting discussion trail led us to aesthetics in music theory.
But to give a summary of the first chapters of the book: good stuff! Jaroslav Tulach makes some statements worthy of discussion, but he definitely knows what he's talking about. Some insights were eye-openers or at least thought-provokers for our reader. If the rest of the book holds up to the quality of the first chapters, you shouldn't hesitate to add it to your reading queue.

Effective electronic archive

One of our participants has developed a habit of archiving most things electronically. He has already blogged about his experiences:

Both blog entries hold quite a lot of useful information. We discussed some possibilities to implement different archival strategies. Evernote was mentioned often in the discussion, Diigo was named as the better Delicious, Remember The Milk as a task/notification service, and Google Gmail as an example of relying solely on tags. Tags were a big topic in our discussion, too. It was mentioned that Confluence has the ability to add multiple tags to an article. Thunderbird came up as well, especially for its combination of tags and virtual folders. And a noteworthy podcast by Scott Hanselman on the topic of “Getting Things Done” was pointed out, too.

Schneide Events 2013

We performed a short survey about different special events and workshops that may happen in 2013 in the Softwareschneiderei. If you are already registered on our Dev Brunch list, you'll receive the invitations for all events shortly. Here is a short primer on what we're planning:

  • Communication Through Test workshop
  • Refactoring Golf
  • API Design Fest
  • Google Gruyere Day
  • Introduction to Dwarf Fortress

Some of these events are more related to software engineering than others, but all of them try to be fun first, lessons later. Participate if you are interested!

Learning programming languages

The last main topic of the brunch was a short, rather disappointed review of the book “Seven Languages in Seven Weeks” by Bruce Tate. The best part of the book, according to our reviewer, were the interview sections with the language designers. And because he got interested in this kind of approach to a programming language, he dug up some similar content:

The Computerworld interviews are directly accessible and contain some pearls of wisdom and humour (and some slight inaccuracies). Highly recommended reading if you want to know not only about the language, but also about the context (and mindset) in which it was created.

Epilogue

As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The mix of attendees makes for a unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei. And as always, we are open for guests and future regulars. Just drop us a notice and we'll invite you over next time.