Be(a)ware of Laziness

Let’s assume we have a simple JavaScript “class” called Module. Each instance of the class has a name, a start() method and a stop() method to manage its lifecycle:

function Module(name) {
    this.name = name;
    console.log("Creating " + this.name);
}
Module.prototype.start = function() {
    console.log("Starting " + this.name);
};
Module.prototype.stop = function() {
    console.log("Stopping " + this.name);
};

We want to create a couple of instances with the names “a”, “b” and “c”. At the beginning of the program we want to start each module, and at the end of the program we want to stop each module. For the creation of the instances we use a map() function call on the names array:

var names = ["a", "b", "c"];
var modules = names.map(function(name) {
    return new Module(name);
});
modules.forEach(function(module) {
    module.start();
});
// do something
modules.forEach(function(module) {
    module.stop();
});

The output is as intended:

Creating a
Creating b
Creating c
Starting a
Starting b
Starting c
Stopping a
Stopping b
Stopping c

Now we want to port this code to C#. The definition of the class is straightforward:

class Module
{
    private readonly String name;

    public Module(string name)
    {
        this.name = name;
        Console.WriteLine("Creating " + name);
    }

    public void Start()
    {
        Console.WriteLine("Starting " + name);
    }

    public void Stop()
    {
        Console.WriteLine("Stopping " + name);
    }
}

The map() function is called Select() in .NET:

var names = new List<string>{"a", "b", "c"};
var modules = names.Select(
                 name => new Module(name));

foreach (var module in modules)
{
    module.Start();
}

foreach (var module in modules)
{
    module.Stop();
}

But when we run this program, we get a completely different output:

Creating a
Starting a
Creating b
Starting b
Creating c
Starting c
Creating a
Stopping a
Creating b
Stopping b
Creating c
Stopping c

Each module is created twice, and the creation calls are interleaved with the Start() and Stop() calls.

What has happened?

The answer is that .NET’s Select() method does lazy evaluation. It does not return a new list with the mapped elements. It returns an IEnumerable instead, which evaluates each mapping operation only when needed. This is a very useful concept. It allows for the chaining of multiple operations without creating an intermediate list each time. It also allows for operations on infinite sequences.

But in our case it’s not what we want. The stopped instances are not the same as the started instances.

How can we fix it?

By appending a .ToList() call after the .Select() call:

var modules = names.Select(
        name => new Module(name)).ToList();

Now the IEnumerable gets evaluated and collected into a list before the assignment to the modules variable.

So be aware of whether your programming language or framework uses lazy or eager evaluation for functional collection operations to avoid running into subtle bugs. Other examples of tools based on the concept of lazy evaluation are the Java stream API and the Haskell programming language. Some languages support both, for example Ruby since version 2.0:

range.collect { |x| x*x }
range.lazy.collect { |x| x*x }
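The Java stream API behaves the same way: intermediate operations like map() are lazy and only run when a terminal operation pulls elements. A minimal sketch, assuming a Java version of the Module class from the example above:

import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// nothing is printed here: map() is only an intermediate, lazy operation
Stream<Module> lazyModules = Arrays.asList("a", "b", "c").stream()
        .map(name -> new Module(name));

// only now "Creating a", "Creating b", "Creating c" appear, exactly once;
// collect() plays the same role here as ToList() does in the C# fix above
List<Module> modules = lazyModules.collect(Collectors.toList());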

A small example of domain analysis

One thing I’ve learned a lot about in recent years is domain analysis and domain modeling. Every once in a while, an isolated piece of code or a separable concept shows me just how much I had missed in all the years before. A few weeks ago, I came across such an example and want to share the experience and insight. It’s a story about domain exploration with a heightened degree of difficulty – another programmer had analyzed it before and written code that I should replace. But first, let’s talk about the domain.

The domain

The project consisted of a machine control software that receives commands and alters the state of a complex electronic circuitry accordingly. The circuitry consists of several digital-to-analog converters (DAC), among other parts. We will concentrate on the DACs in this story. In case you don’t know what a DAC is, let me explain. Imagine a little integrated circuit (IC), one of the black bug-like electronic parts on a circuit board. On one side, you provide it a digital number in binary representation, and on the other side you get an analog voltage that represents your number. Let’s say you drive an 8-bit DAC and give it a digital zero: the output will be zero volts. If you give the same DAC the number 255, it will output the maximum possible voltage. This voltage is given by the “reference voltage” pin and is usually tied to 5 V in traditional TTL logic circuits. If you drive a 12-bit DAC, the zero will still yield 0 V, while the 255 will now only yield about 0.3 V, because the maximum digital number is now 4095. So the resolution of a DAC, given in bits, is a big deal for the driver.
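Put as a small formula (my own summary, not taken from the original code): the output voltage is the digital value divided by the maximum value for the given resolution, multiplied by the reference voltage. A tiny sketch:

// hypothetical helper, just to illustrate the arithmetic described above
static double outputVoltage(int value, int resolutionBits, double referenceVoltage) {
    int maxValue = (1 << resolutionBits) - 1;   // 255 for 8 bits, 4095 for 12 bits
    return (double) value / maxValue * referenceVoltage;
}
// outputVoltage(255, 8, 5.0)  yields 5.0 V
// outputVoltage(255, 12, 5.0) yields roughly 0.31 V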

How exactly you have to provide that digital number, and which additional signals need to be set or cleared to actually get the analog voltage, is up to the specific type of DAC. So this is the part of the behaviour that should be encapsulated inside a DAC class. The rest of the software should only be able to change the digital number using a method on a particular DAC object. That’s our modeling task.

The original implementation

My job was not to develop the machine control software from scratch, but to re-engineer it from existing sources. The code is written in plain C by an electronics technician, and it really shows. For our DAC driver, there was a function that took one argument – an integer value that would be written to the DAC. If the client code was lazy enough not to check the bounds of the DAC, you would see all kinds of overflow effects. It worked, but only if the client code knew about the resolution of the DAC and checked the bounds. One task the machine control software needed to do was to translate the command parameters, given in millivolts, to the correct integer number to feed into the DAC and receive the desired millivolts at the analog output pin. This calculation, albeit not very complicated, was duplicated all over the place.


writeDAC(int value);

My original translation

One primary aspect when doing re-engineering work is not to assume too much and not to change too many places at once. So my first translation was a method on the DAC objects requiring the exact integer value that should be written. The method would internally check for the valid value range, because the object knows about the DAC resolution, while the client code should subsequently lose this knowledge. The original code translated nicely to this new structure and worked correctly, but I wasn’t happy with it. To provide the correct integer value, the client code still needs to know about the DAC resolution and perform the calculation from millivolts to DAC value. Even if you centralize the calculation, there are still calls to it from everywhere.


dac.write(int value);

My first revelation

When I had finally translated all existing code, I knew that every single call to the DAC got its parameter in millivolts, but needed to set the DAC integer. Now I knew that the client code never cared about DAC integers at all; it cared about millivolts. If you find such a revelation, act on it – even just to see where it might lead you. I acted and replaced the integer parameter of the write method on the DAC object with a voltage parameter. I created the Voltage domain type and had it expose factory methods so it could easily be created from the millivolts that were represented by integers in the commands that the machine control software received. Now the client code only needed to create a Voltage object and pass it to the DAC to have that voltage show up at the analog output pin. The whole calculation and checking part happened inside the DAC object, where it belongs.


dac.write(Voltage required);

This version of the code was easy to read, easy to reason about and worked like a charm. It went into production and could be the end of the story.
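A minimal sketch of what such a Voltage domain type could look like – the names and details are my assumption, not the original code:

// hypothetical sketch of the Voltage domain type described above
final class Voltage {
    private final int millivolts;

    private Voltage(int millivolts) {
        this.millivolts = millivolts;
    }

    // factory method matching the integer millivolt values from the received commands
    static Voltage fromMillivolts(int millivolts) {
        return new Voltage(millivolts);
    }

    int millivolts() {
        return millivolts;
    }
}

Client code then reads like dac.write(Voltage.fromMillivolts(8000)); and no longer knows anything about DAC integers.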

The second insight

But the customer had other plans. He replaced parts of the original circuitry and upgraded most of the DACs along the way. Now there was only one type of DAC, but with additional amplifier functionality for some output pins (a typical DAC has several output pins that can be controlled by a pin address that is provided alongside the digital number). The code needed to drive the DACs, which were bound to a 5 V reference voltage, but some channels would be amplified to double the voltage, providing a voltage range from 0 V to 10 V. If you want to set one of those channels to 5 V output voltage, you need to write half the maximum number to it. If the DAC has 12-bit resolution, you need to write 2047 (or 2048, depending on your rounding strategy) to it. Writing 4095 would yield 10 V on those channels.

Because the amplification isn’t part of the DAC itself, the DAC code shouldn’t know about it. This knowledge should be placed in a wrapper layer around the DAC objects, taking the voltage parameters from the client code and changing them according to the amplification of the channel. The client code would want to write 10 V, pass it to the wrapper layer that knows about the amplification and reduces it to 5 V, passing this to the DAC object that transforms it to the maximum reference voltage (5 V), which subsequently gets amplified to 10 V. This sounded so weird that I decided to review my domain analysis.

It dawned on me that the DAC domain never really cared about millivolts or voltages. Sure, the output will be a specific voltage, but it will be relative to your input in relation to the maximum value. The output voltage stands in the same ratio to the reference voltage as the input value does to the maximum value. It’s all about ratios. The DAC should always demand a percentage from the client code, not a voltage. This way, you can actually give it the ratio of anything and it will express this ratio as a voltage relative to the reference voltage. The DAC is defined by its core characteristics, and the wrapper layer performs the translation from required voltage to percentage. In case of amplification, it is accounted for in this translation – the DAC never needs to know.


dac.write(Percentage required);

Expressiveness of the new concept

Now we can really describe in code what actually happens: A command arrives, requiring us to set a DAC channel to 8 volt. We create the voltage object for 8 volt and pass it on to the DAC wrapper layer. The layer knows about the 2x amplification and the reference voltage. It calculates that 8 volt will be 80% of the maximum DAC value (80% of 5 V being 4 V before and 8 V after amplification) and passes this information to the DAC object. The DAC object, being the only one to know its resolution, sets 0.8 * maximum_DAC_value to the required register and everything works.
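A condensed sketch of this chain of responsibilities, reusing the hypothetical Voltage type from above – class names, resolution and amplification values are assumptions for illustration only:

// hypothetical sketch: the circuit board knows reference voltage and amplification,
// the DAC only knows its resolution
final class Percentage {
    final double ratio;                                 // 0.0 .. 1.0
    Percentage(double ratio) { this.ratio = ratio; }
}

class Dac {
    private final int resolutionBits = 12;

    void write(Percentage required) {
        int maxValue = (1 << resolutionBits) - 1;
        int register = (int) Math.round(required.ratio * maxValue);
        // write 'register' to the hardware ...
    }
}

class CircuitBoard {
    private final Dac dac = new Dac();
    private final double referenceVoltage = 5.0;        // volts
    private final double amplification = 2.0;           // for the amplified channels

    void write(Voltage required) {
        double ratio = (required.millivolts() / 1000.0)
                / (referenceVoltage * amplification);
        dac.write(new Percentage(ratio));               // 8 V becomes 0.8
    }
}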

The new concept of percentages decouples the voltage information from the DAC resolution information and keeps both pieces of information where they belong. In fact, the DAC chip never really knows about the reference voltage, either – it’s the circuit around it that knows.

Conclusion

While it is easy to see why the first version with voltages as parameters has its charms, it doesn’t model the reality accurately and therefore falls short when flexibility is required. The first version ties DAC resolution and reference voltage together when in fact the DAC chip only knows the resolution. You can operate the chip with any reference voltage within a valid range. By decoupling those pieces of information and moving the knowledge about reference voltages outside the DAC object, I modeled the reality more accurately, and every requirement finds its natural place. This “natural place finding” is what makes a good model useful for reasoning. In our case, the natural place for the reference voltage was outside the DAC, in the wrapper layer. Finding a real name for the wrapper layer was easy: I called it “circuit board”.

Domain analysis is all about having the right abstractions for your model. Your model is suitable for your task when everything fits and falls into place nearly automatically, when names don’t need to be invented but kind of obtrude themselves from the real domain. The right model (for the given task) feels good and transports a lot of domain knowledge. And domain knowledge is the most treasurable knowledge for any developer.

Object slicing – breaking polymorphic objects in C++

C++ has a pitfall called “object slicing” that is alien to most programmers using other object-oriented languages like Java. Object slicing (fruit ninja-style) occurs in various scenarios when copying an instance of a derived class to a variable with the type of (one of) its base class(es), e.g.:

#include <iostream>

// we use structs for brevity
struct Base
{
  Base() {}
  virtual void doSomething()
  {
    std::cout << "All your Base are belong to us!\n";
  }
};

struct Derived : public Base
{
  Derived() : Base() {}
  virtual void doSomething() override
  {
    std::cout << "I am derived!\n";
  }
};

static void performTask(Base b)
{
  b.doSomething();
}

int main()
{
  Derived derived;
  // here all evidence that derived was used to initialise base is lost
  performTask(derived); // will print "All your Base are belong to us!"
}

Many explanations of object slicing deal with the fact that only the base-class part of a derived object’s fields will be copied on assignment or when a polymorphic object is passed to a function by value. Usually this is ok, because most of the time only the static type of the Base class is used further on. Of course you can construct scenarios where this becomes a problem.

I ran into the problem with virtual functions that are sliced off of polymorphic objects, too. That can be hard to track down if you are not aware of the issue. Sadly, I do not know of any compilers that issue warnings or errors when passing/copying polymorphic objects by value.

The fix is easy in most cases: Use naked pointers, smart pointers or references to pass your polymorphic objects around. But it can be really hard to track the issue down. So try to define conventions and coding styles that minimise the risk of sliced objects. Do not avoid using and passing values around just out of fear! Values provide many benefits in correctness and readability and even may improve performance when used with concrete classes.

Edit: Removed excess parameters in construction of derived. Thx @LorToso for the comment and the hint at ReSharper C++!

Software development is code organization

The biggest problem in developing and maintaining software is understanding code. Software developers should get good training in crafting code which can be understood. To make sense of the mess that is code, we need to organize it.

In 2000 Edsger Dijkstra wrote about our problems organizing and designing software:

I would therefore like to posit that computing’s central challenge, “How not to make a mess of it”, has not been met. On the contrary, most of our systems are much more complicated than can be considered healthy, and are too messy and chaotic to be used in comfort and confidence.

Our code bases get so big and complicated today that we cannot comprehend them all at once. Back in the days of UNIX, technical constraints led to smaller code. But the computer is not the limiting factor anymore. We are. Our mind cannot comprehend what we create. Brian Kernighan wrote:

Everyone knows that debugging is twice as hard as writing a program in the first place. So if you’re as clever as you can be when you write it, how will you ever debug it?

Writing code that we (or other developers) can understand is crucial. But why do we fail?

Divide and lose

Usually the first argument when tackling code is to decouple it. Make it clean. Use clean code principles like DRY, SOLID, KISS, YAGNI and what other acronyms you know. These really help to decouple. But they are missing the point. They are the how not the why.
Take a look at your codebase and tell me: where are the classes which constitute a subdomain or a specific feature? In which project or part do they live?
Normally you cannot. We only know how to divide code by technical aspects. But features and changes often come from the domain, not from the technology.
How can we understand our creations when we cannot understand their structure? Their architecture? How can we understand something we do not see?
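To make the contrast concrete, here is what the two ways of cutting the same hypothetical codebase look like – the package names are invented for illustration:

// divided by technical aspects – the "invoicing" feature is scattered across layers
com.example.shop.controllers.InvoiceController
com.example.shop.services.InvoiceService
com.example.shop.repositories.InvoiceRepository

// divided by subdomain/feature – everything about invoicing lives in one place
com.example.shop.invoicing.InvoiceController
com.example.shop.invoicing.InvoiceService
com.example.shop.invoicing.InvoiceRepository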

But it does work

The next argument is not much better. Our code might work now. But what if a bug is found or a new feature is about to be implemented? Do you understand the code and its structure? Even weeks, months or years later? Working code is good, but you can only reliably change code that you understand.

KISS

Write simple code. Write simple and small methods. Write cohesive classes. The dream of components. But the whole is more than the sum of its parts. You can write simple classes but the communication and threading issues between them can be very complex. Even if the interfaces are sound and simple. Understanding a simple class can be easy in isolation. But understanding a system of simple classes can be difficult and complex. Things are complex. Domains are complex. We cannot ignore that.

Code as an interface

When writing code we have to take the reader and the domain into account. Treat code as an interface. An interface to the system and the domain. It is an opinionated view of the world. The computer does not care about the code we use. Just like the printer that prints our favorite book. But the reader does.
This isn’t just nice thinking: understanding code is key to successfully crafting and changing software.

Assumptions – how to find, track and eliminate them

Assumptions can kill a project. Like a house built on sand we don’t know when and where it will collapse.
The problem with assumptions is that they masquerade as truths. We believe them. They are the project’s reality. Just like the Matrix.
Assumptions are shortcuts. Guesses at reality. We cannot fully grasp reality, so we assume. But we can find evidence for our decisions. For this we need to uncover the assumptions, assess their risk and gather evidence. But how do we know what we assume?

Find assumptions

Watch your language

‘I think’, ‘In my opinion’, ‘should be’, ‘roughly’, ‘circa’ are all clues for assumptions. Decisions need to be based on evidence. When we use vague language or personal opinions to describe our project, we need to pause. Beneath this lurk insecurity and assumptions.
Metaphors are another red flag. Metaphors might be great for presenting, for painting a picture in our heads or for describing a vision. But in decision making they are too abstract and meaningless. We may use them to describe our strategy, but when we need to design and implement, we need borders that constrain our decisions. Metaphors usually cover only some aspects of the project and vice versa. There’s a mismatch. We need concrete language without ambiguity.

Be dumb

We know so much that we think others have the same experience, education, view point, familiarity, proficiency and imprinting. We know so little that we think the other way is also true. We transfer. We assume. Dare to ask dumb questions. Adopt a beginner’s mind. Challenge traditions and common beliefs.
We take age old decisions for granted. They were made by people smarter than us, so they must be right. Don’t do this. Question them. Even the obvious ones.
In the book ‘Hidden in Plain Sight’, Jan Chipchase enters a typical cafe where people sit and talk, drink coffee and type on their laptops. The question he poses: should the coffee shop owner sell diapers? So that everybody can continue what they do without the need to go to the bathroom. This question challenges our cultural and imprinted beliefs. And this is good.

Be curious

Ask: why? We need to get to the root of the problem. Dig deeper. Often, under layers of reasoning and thoughtful decisions, lies an assumption. A chain is only as strong as its weakest link. If we started with an assumption, the reasoning building on it is also assumed. Children often ask why and don’t stop even when we think it is all said and logical. So when we find the root, we need to continue to ask: is this really the root? Why is it the way it is?
Another question we need to ask repeatedly is: what if? What if our target audience changes? What if we pursue the opposite of our project’s goals? What if the technology changes?

Change perspectives

We see what we want to see. Seeing is an active process. We can stretch our thinking only so far. To stretch it even further we need to change roles. For just a few hours, do the work our users do. Feel their pains. Their highs and lows.
Or adopt the role of the browser. Good interfaces are conversations. Play out a dialog with your user. Be the browser.
Only by embracing the constraints of other perspectives can we force ourselves to stretch. In this way we find the things we assume because of our view of the world.

Track them

After we have collected the assumptions, we need to track them in order to later prove or disprove them. For this, a simple spreadsheet or table is sufficient. This learning plan consists of 5 columns (taken from Leah Buley’s The UX Team of One):

  • the assumption: what we believe is true
  • the certainty: a 3 or 5 point scale showing how sure we are that we are right
  • notes: additional notes of why we think the assumption is right or wrong
  • the evidence: results which we collected to support this assumption
  • the research: things we can do to collect further evidence

Eliminate them

Now that we know what we assume and with which certainty we think we are right, we can start to collect further information to support or disprove our claims. In short: We research. Research can take many different forms. But all forms are there to gain further insights. Some basic forms we use to bring light into the darkness of uncertainty are:

  • Stakeholder interviews
  • (Contextual) user interviews
  • Heuristic evaluation
  • Prototyping
  • Market research

Other methods we don’t use (yet) include:

  • A/B tests (paired with analytics)
  • User tests

The point behind all these methods is to build a chain of reasoning. Everything in our software needs a reason to exist. The users and the stakeholders are the primary sources of insight, but our experience, human psychology and common patterns or conventions also help us to decide which way to go.
Not only the method of collecting is important, but also how the results are documented. We should present the essential information in a way that makes it easy to get a glimpse of it just by looking at the respective documents. On the other hand, we should keep this pragmatic and not go overboard. Our goal is to gain insight, not to build a proof of the system.

VB.NET for Java Developers – Updated Cheat Sheet

The BASIC programming language (originally invented at Dartmouth College in 1964) and Microsoft share a long history together. Microsoft basically started their business with the licensing of their BASIC interpreter (Altair BASIC), initially developed by Paul Allen and Bill Gates. Various dialects of Microsoft’s BASIC implementation were installed in the ROMs of many home computers like the Apple II (Applesoft BASIC) or the Commodore 64 (Commodore BASIC) during the 1970s and 1980s. A whole generation of programmers discovered their interest in computer programming through BASIC before moving on to greater knowledge.

BASIC was also shipped with Microsoft’s successful disk operating system (MS-DOS) for the IBM PC and compatibles. Early versions were BASICA and GW-BASIC. Original BASIC code was based on line numbers and typically lots of GOTO statements, resulting in what was often referred to as “spaghetti code”. Starting with MS-DOS 5.0, GW-BASIC was replaced by QBasic (a stripped-down version of Microsoft QuickBasic). It was backward compatible with GW-BASIC and introduced structured programming. Line numbers and GOTOs were no longer necessary.

When Windows became popular Microsoft introduced Visual Basic, which included a form designer for easy creation of GUI applications. They even released one version of Visual Basic for DOS, which allowed the creation of GUI-like textual user interfaces.

Visual Basic.NET

The current generation of Microsoft’s BASIC is Visual Basic.NET. It’s the .NET-based successor to Visual Basic 6.0, which is nowadays known as “Visual Basic Classic”.

Feature-wise VB.NET is mostly equivalent to C#, including full support for object-oriented programming, interfaces, generics, lambdas, operator overloading, custom value types, extension methods, LINQ and access to the full functionality of the .NET framework. The differences are mostly at the syntax level. It has almost nothing in common with the original BASIC anymore.

Updated Cheat Sheet for Java developers

A couple of years ago we published a VB.NET cheat sheet for Java developers on this blog. The cheat sheet uses Java as the reference language, because today Java is a lingua franca that is understood by most contemporary programmers. Now we present an updated version of this cheat sheet, which takes into account recent developments like Java 8.

The web is for documents

The web is intended to help a person find and understand relevant information. The primary container of information is the document. Therefore web applications should be centered around a document metaphor, not an app one.

In 1990 Tim Berners-Lee and Robert Cailliau wrote a proposal for what we call the web today:

HyperText is a way to link and access information of various kinds as a web of nodes in which the user can browse at will.

The web is a linked information system. Bret Victor states:

Information software serves the human urge to learn. A person uses information software to construct and manipulate a model that is internal to the mind — a mental representation of information.

The web is built around information. More information than we can handle. What we need to make sense of it all is understanding. The power of technology can be used to transfer and gain understanding. Understanding needs to be a first class citizen. The applications we build must be centered around it.

One way to foster understanding is to interact, to play with information. Technology can simulate a system of information so that we can form hypotheses and ask questions. Bret Victor coined the term “explorable explanations” to describe such systems.

I believe the web is perfectly suited for building explorable explanations.

The web’s container for information is the document. A document combines different forms of media (text, images, video, …) to a whole. Fortunately for us the web does not stop here. With scripting we have the possibility to interact and manipulate the information in order to gain further insight.

Most of the tools we need to create for understanding are already at our hands. What we need is a fundamental change in focus. Right now, (a large part of) the web industry tries to play catch-up with native. Whole frameworks try to mimic native applications as if this were a virtue. Current developments want to abstract the document away as far as possible. This is not what the web was intended for. Why build an application which tries so hard to recreate a native feeling in something other than the native platform itself? Web applications should be built on the strengths of the web. We should not chase a foreign metaphor.
Right now the web seems to be torn. Torn between the print era of passive documents and the shiny new world of native applications. But the web has the capability to do so much more. To concentrate on its purpose, to fill the niche. A massive niche. Understanding is a core endeavor of mankind. To quote Stephen Anderson and Karl Fast, introducing their upcoming book From Information to Understanding:

In all areas of life, we are surrounded by understanding problems.

Doug Engelbart shares a similar vision for the purpose of the personal computer per se:

By “augmenting human intellect” we mean increasing the capability of a man to approach a complex problem situation, to gain comprehension to suit his particular needs, and to derive solutions to problems.

The web is ready. The tools are ready. But are we?

Where to start: foundations

Easy maintenance, not easy production

Imagine that you have to make critical changes in the source code, stressed, over-worked and under time pressure, without any recollection of your thoughts during development, lacking your familiar tools while your customer waits impatiently by your side. That’s when you are glad you prepared for this moment.

It is said that source code gets read a hundred times more often than it gets written. My experience confirms this, which leads to a principle of economic source code modifications: the initial implementation is almost for free, it’s the later modifications that run up a bill. In order to achieve a low TCO (total cost of ownership), it’s sound advice to plan (and develop) for easy maintenance instead of easy production. This principle even has a fancy name: Keep It Simple, Stupid (KISS).

Origin of KISS

According to the English Wikipedia article on the topic, the principle’s name was coined by an aircraft engineer named Kelly Johnson, who also designed the fastest jet plane of all time, the SR-71 “Blackbird”. The aircraft reached speeds of over Mach 3 and had an unmatched defensive armament: enough thrust to evade any confrontation. It would simply fly higher and/or faster than anything launched against it, like interceptor fighters or anti-air missiles. The Blackbird used construction material that was specifically invented for this plane and very expensive, so it definitely was no easy production. Sadly for this blog post, it wasn’t particularly easy to maintain, either. It usually leaked so much fuel that it had to be refueled directly after takeoff. The leaks were pragmatically designed that way and would seal themselves in flight. It’s quite an interesting plane.

Easy maintenance

Johnson once allegedly gave his engineers a bunch of tools and required that the aircraft under design must be repairable with only these tools, by an average mechanic in the field under combat conditions (e.g. stressed, exhausted and with a narrow timeframe). This is a wonderful concept that I regularly apply to my projects: Imagine that you are on-site with your customer, the most important functionality of your project just broke, resulting in a disastrous standstill of the whole facility, and all you have is your source code and the vi editor (or vim, if you’re non-hardcore like me). No internet, no IDE, no extensive documentation. Could you make meaningful changes to your project under these conditions? What needs to be changed to make your life easier in such an extreme situation? What restrictions would these changes impose on your daily work? How much effort/damage/resources would these changes cost? Is it easier to anticipate maintenance or to trust to luck that it won’t be necessary?

Easy production

A while ago, I reviewed the code of a map tool. The user viewed a map and could click on it to mark a certain location. The geo coordinates of this location would be used as an input for further computations. The map was restricted to a fixed area, so the developer wrote the code in the easiest way possible: He chose well-known geographic landmarks on the map and determined their geo coordinates and pixel locations. Those were the reference points in the code that every click would be related to. Using simple mathematics (the rule of three), each point on the map could be calculated. Clever trick and totally working! The code practically wrote itself and the reference points only needed to be determined once.

Until the map was changed. The code contained some obscure constants that described the original landmarks, but the whole concept of arbitrary reference points was alien to the next developer. He was used to the classic concept of two reference points: top left and bottom right, with their respective geo coordinates and pixel locations. What seemed like a quick task turned into a stressful reconstruction of the clever trick during production.
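For comparison, the classic two-reference-point approach boils down to a simple linear interpolation. A sketch under the assumption of an undistorted, north-up map – the class name and all coordinate values are invented for illustration:

// hypothetical sketch: mapping a pixel position to geo coordinates
// using the top-left and bottom-right reference points of the map image
class MapProjection {
    // pixel coordinates of the two reference points
    private final int leftPx = 0, topPx = 0, rightPx = 800, bottomPx = 600;
    // geo coordinates of the same two points
    private final double westLon = 8.0, northLat = 49.1, eastLon = 8.6, southLat = 48.8;

    double longitudeOf(int pixelX) {
        return westLon + (eastLon - westLon) * (pixelX - leftPx) / (double) (rightPx - leftPx);
    }

    double latitudeOf(int pixelY) {
        return northLat + (southLat - northLat) * (pixelY - topPx) / (double) (bottomPx - topPx);
    }
}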

In this example, the production (initial development) was easy, straight-forward and natural, but not easily reproducible during the maintenance phase (subsequent modification). The algorithm used a clever approach, but this cleverness isn’t necessarily available “under combat conditions”.

Go the extra mile

Most machines are designed so that wearing parts can be easily replaced. Think of batteries in electronic gadgets (well, at least before the gadget’s estimated lifetime dropped below the average battery life) or light bulbs in cars (well, at least before LED headlights were considered cool). The thing is, engineers usually know the wear effects their designs have to endure. There is no “usual wear effect” on software, due to the lack of natural forces like gravitation. Everything could change, so it’s better to be prepared for all sorts of change. That’s the theory, but it’s not economically sound to develop software that is prepared for any circumstance. Pragmatic reasoning calls for compromise, like supporting changes that

  • are likely to happen (this needs to be grounded in domain knowledge and should be documented in the code)
  • are not expensive to arrange beforehand (the popular “low-hanging fruits”)
  • are expensive to implement afterwards

The last aspect might be a bit of a surprise, but it’s the exact aspect that came into play in the example above: Recreating the knowledge about the clever trick of landmark choices took more time than implementing the classic interpolation with the two corner points.

So if you see a possible (and not totally unlikely) change that will be expensive to implement once the intimate knowledge of the code you have during the initial development is gone, go the extra mile and prepare for it right now. Leave a comment, implement some kind of extension mechanism or just mind the code seams. In our example, slightly complicating the initial development led to dramatically less clever and more accessible code.

Conclusion

You should consider writing your code for easy maintenance, even if that means additional effort during the initial implementation. Imagine that you yourself have to make the future change, stressed, over-worked and under time pressure, without any recollection of your thoughts today, lacking your familiar tools while your customer waits impatiently by your side. With proper preparation, even this scenario is feasible.


Creating a GPS network service using a Raspberry Pi – Part 2

In the last article we learnt how to install and access a GPS module in a Raspberry Pi. Next, we want to write a network service that extracts the current location data – latitude, longitude and altitude – from the serial port.

Basics

We use Perl to write a CGI script running within Apache 2; both should be installed on the Raspberry Pi. To access the serial port from Perl, we need to include the module Device::SerialPort. In addition, we use the module JSON to generate the HTTP response.

use strict;
use warnings;
use Device::SerialPort;
use JSON;
use CGI::Carp qw(fatalsToBrowser);

Interacting with the serial port

To interact with the serial port in Perl, we instantiate Device::SerialPort and configure it according to our hardware. Then, we can read the data sent by our hardware via $device->read(...), for example as follows:

my $device = Device::SerialPort->new('...') or die "Can't open serial port!";
# configuration
# ...
my ($count, $result) = $device->read(255);

For the Sparqee GPSv1.0 module, the device can be configured as shown below:

our $device = '/dev/ttyAMA0';
our $baudrate = 9600;

sub GetGPSDevice {
    my $gps = Device::SerialPort->new($device) or return (1, "Can't open serial port '$device'!");
    $gps->baudrate($baudrate);
    $gps->parity('none');
    $gps->databits(8);
    $gps->stopbits(1);
    $gps->write_settings or return (1, 'Could not write settings for serial port device!');
    return (0, $gps);
}

Finding the location line

As described in the previous blog post, the GPS module sends a continuous stream of GPS data; an explanation of the individual components can be found there.

$GPGSA,A,3,17,09,28,08,26,07,15,,,,,,2.4,1.4,1.9*3C
$GPRMC,031349.000,A,3355.3471,N,11751.7128,W,0.00,143.39,210314,,,A*76
$GPGGA,031350.000,3355.3471,N,11751.7128,W,1,06,1.7,112.2,M,-33.7,M,,0000*6F
$GPGSA,A,3,17,09,28,08,07,15,,,,,,,2.6,1.7,2.0*3C
$GPGSV,3,1,12,17,67,201,30,09,62,112,28,28,57,022,21,08,55,104,20*7E
$GPGSV,3,2,12,07,25,124,22,15,24,302,30,11,17,052,26,26,49,262,05*73
$GPGSV,3,3,12,30,51,112,31,57,31,122,,01,24,073,,04,05,176,*7E
$GPRMC,031350.000,A,3355.3471,N,11741.7128,W,0.00,143.39,210314,,,A*7E
$GPGGA,031351.000,3355.3471,N,11741.7128,W,1,07,1.4,112.2,M,-33.7,M,,0000*6C

We are only interested in the information about latitude, longitude and altitude, which is part of the line starting with $GPGGA. Assuming that the first parameter contains a correctly configured device, the following subroutine reads the data stream sent by the GPS module, extracts the relevant line and returns it. In detail, it searches for the string $GPGGA in the data stream, buffers all data sent afterwards until the next line starts, and returns the buffer content.

# timeout in seconds
our $timeout = 10;

sub ExtractLocationLine {
    my $gps = $_[0];
    my $count;
    my $result;
    my $buffering = 0;
    my $buffer = '';
    my $limit = time + $timeout;
    while (1) {
        if (time >= $limit) {
           return '';
        }
        ($count, $result) = $gps->read(255);
        if ($count <= 0) {
            next;
        }
        if ($result =~ /^\$GPGGA/) {
            $buffering = 1;
        }
        if ($buffering) {
            my $part = (split /\n/, $result)[0];
            $buffer .= $part;
        }
        if ($buffering and ($result =~ m/\n/g)) {
            return $buffer;
        }
    }
}

Parsing the location line

The $GPGGA-line contains more information than we need. With regular expressions, we can extract the relevant data: $1 is the latitude, $2 is the longitude and $3 is the altitude.

sub ExtractGPSData {
    $_[0] =~ m/\$GPGGA,\d+\.\d+,(\d+\.\d+,[NS]),(\d+\.\d+,[WE]),\d,\d+,\d+\.\d+,(\d+\.\d+,M),.*/;
    return ($1, $2, $3);
}

Putting everything together

Finally, we convert the found data to JSON and print it to the standard output stream in order to write the HTTP response of the CGI script.

sub GetGPSData {
    my ($error, $gps) = GetGPSDevice;
    if ($error) {
        return ToError($gps);
    }
    my $location = ExtractLocationLine($gps);
    if (not $location) {
        return ToError("Timeout: Could not obtain GPS data within $timeout seconds.");
    }
    my ($latitude, $longitude, $altitude) = ExtractGPSData($location);
    if (not ($latitude and $longitude and $altitude)) {
        return ToError("Error extracting GPS data, maybe no lock attained?\n$location");
    }
    return to_json({
        'latitude' => $latitude,
        'longitude' => $longitude,
        'altitude' => $altitude
    });
}

sub ToError {
    return to_json({'error' => $_[0]});
}

binmode(STDOUT, ":utf8");
print "Content-type: application/json; charset=utf-8\n\n".GetGPSData."\n";

Configuration

To execute the Perl script with an HTTP request, we have to place it in the cgi-bin directory; in our case we saved the file at /usr/lib/cgi-bin/gps.pl. Before accessing it, you can ensure that Apache is configured correctly by checking the file /etc/apache2/sites-available/default; it should contain the following section:

ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
<Directory "/usr/lib/cgi-bin">
    AllowOverride None
    Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
    Order allow,deny
    Allow from all
</Directory>

Furthermore, the permissions of the script file have to be adjusted, otherwise the Apache user will not be able to execute it:

sudo chown www-data:www-data /usr/lib/cgi-bin/gps.pl
sudo chmod 0755 /usr/lib/cgi-bin/gps.pl

We also have to add the Apache user to the user group dialout, otherwise it cannot read from the serial port. For this change to take effect, the Raspberry Pi has to be rebooted.

sudo adduser www-data dialout
sudo reboot

Finally, we can check if the script is working by accessing the page <IP address>/cgi-bin/gps.pl. If the Raspberry Pi has no GPS reception, you should see the following output:

{"error":"Error extracting GPS data, maybe no lock attained?\n$GPGGA,121330.326,,,,,0,00,,,M,0.0,M,,0000*53\r"}

When the Raspberry Pi receives GPS data, it should be shown in the browser:

{"longitude":"11741.7128,W","latitude":"3355.3471,N","altitude":"112.2,M"}

Lastly, if you see the following message, you should check whether the Apache user was correctly added to the group dialout.

{"error":"Can't open serial port '/dev/ttyAMA0'!"}

Conclusion

In the last article, we focused on the hardware and its installation. In this part, we learnt how to access the serial port via Perl, wrote a CGI script that extracts and delivers the location information and used the Apache web server to make the data available via network.

Declaration-site and use-site variance explained

A common question posed by programming novices who have their first encounters with parametrized types (“generics” in Java and C#) is: “Why can’t I use a List<Apple> as a List<Fruit>?” (given that Apple is a subclass of Fruit). Their reasoning usually goes like this: “An apple is a fruit, so a basket of apples is a fruit basket, right?”

Here’s another, similar, example:

Milk is a dairy product, but is a bottle of milk a dairy product bottle? Try putting a Cheddar cheese wheel into the milk bottle (without melting or shredding the cheese!). It’s obviously not that simple.

Let’s assume for a moment that it was possible to use a List<Apple> as a List<Fruit>. Then the following code would be legal, given that Orange is a subclass of Fruit as well:

List<Apple> apples = new ArrayList<>();
List<Fruit> fruits = apples;
fruits.add(new Orange());

// what's an orange doing here?!
Apple apple = apples.get(0);

This short code example demonstrates why it doesn’t make sense to treat a List<Apple> as a List<Fruit>. That’s why generic types in Java and C# don’t allow this kind of assignment by default. This behaviour is called invariance.

Variance of generic types

There are, however, other cases of generic types where assignments like this actually could make sense. For example, using an Iterable<Apple> as an Iterable<Fruit> is a reasonable wish. The opposite direction within the inheritance hierarchy of the type parameter is thinkable as well, e.g. using a Comparable<Fruit> as a Comparable<Apple>.

So what’s the difference between these generic types: List<T>, Iterable<T>, Comparable<T>? The difference is the “flow” direction of objects of type T in their interface:

  1. If a generic interface has only methods that return objects of type T, but don’t consume objects of type T, then assignment from a variable of Type<B> to a variable of Type<A> can make sense (given that B is a subclass of A). This is called covariance. Examples are: Iterable<T>, Iterator<T>, Supplier<T>
  2. If a generic interface has only methods that consume objects of type T, but don’t return objects of type T, then assignment from a variable of Type<A> to a variable of Type<B> can make sense. This is called contravariance. Examples are: Comparable<T>, Consumer<T>
  3. If a generic interface has both methods that return and methods that consume objects of type T then it should be invariant. Examples are: List<T>, Set<T>

As mentioned before, neither Java nor C# allow covariance or contravariance for generic types by default. They’re invariant by default. But there are ways and means in both languages to achieve co- and contravariance.

Declaration-site variance

In C# you can use the in and out keywords on a type parameter to indicate variance:

interface IProducer<out T> // Covariant
{
    T produce();
}

interface IConsumer<in T> // Contravariant
{
    void consume(T t);
}

IProducer<B> producerOfB = /*...*/;
IProducer<A> producerOfA = producerOfB;  // now legal
// producerOfB = producerOfA;  // still illegal

IConsumer<A> consumerOfA = /*...*/;
IConsumer<B> consumerOfB = consumerOfA;  // now legal
// consumerOfA = consumerOfB;  // still illegal

This annotation style is called declaration-site variance, because the type parameter is annotated where the generic type is declared.

Use-site variance

In Java you can express co- and contravariance with wildcards like <? extends A> and <? super B>.

Producer<B> producerOfB = /*...*/;
Producer<? extends A> producerOfA = producerOfB; // legal
A a = producerOfA.produce();
// producerOfB = producerOfA; // still illegal

Consumer<A> consumerOfA = /*...*/;
Consumer<? super B> consumerOfB = consumerOfA; // legal
consumerOfB.consume(new B());
// consumerOfA = consumerOfB; // still illegal

This is called use-site variance, because the annotation is not placed where the type is declared, but where the type is used.
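A concrete illustration of the contravariant direction: a comparator written for the supertype can be used to sort a list of the subtype, which is why Java’s Collections.sort declares its parameter as Comparator<? super T>. A small sketch, assuming Apple extends Fruit and Fruit has a getWeight() accessor (an assumption made for this example):

// a comparator for the supertype ...
Comparator<Fruit> byWeight =
        (f1, f2) -> Double.compare(f1.getWeight(), f2.getWeight());

List<Apple> apples = new ArrayList<>();
// ... can sort a list of the subtype, because Collections.sort(List<T>, Comparator<? super T>)
// accepts a Comparator<Fruit> for a List<Apple>
Collections.sort(apples, byWeight);

The same idea exists in .NET, where IComparer<T> is declared with an in type parameter.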

Arrays

The variance behaviour of Java and C# arrays is different from the variance behaviour of their generics. Arrays are covariant, not invariant, even though a T[] has the same problem as a List<T> would have if it were covariant:

Apple[] apples = new Apple[10];
Fruit[] fruits = apples;
fruits[0] = new Orange();
Apple apple = apples[0];

Unfortunately, this code compiles in both languages. However, it throws an exception at runtime (ArrayStoreException or ArrayTypeMismatchException, respectively) in line 3.