A simple yet useful project metric

If you are a project manager, this little metric might help you to quickly categorize your projects and learn about their personality.

In my years of managing software development projects, I’ve come to apply a simple metric to each project to determine its “personality”. The metric consists of only two aspects (or dimensions): success and noise. Each project strives to be successful in its own terms and each project produces a certain amount of “noise” while doing so. Noise, in my definition, is necessary communication above the minimum. A perfectly silent project isn’t really silent, there are just no communicated problems. That doesn’t mean there aren’t any problems! A project team can silently overcome numerous problems on their own and still be successful. The same team can cry for help at each and any hurdle and still fail in the long run. That would mean a lot of noise without effect. I call such a project a “Burning Ox”.

Success vs. Noise

metric

As you can see, there are four types of projects with this metric. The desired type of project is the “White Knight” in the top-left quarter, while the “Burning Ox” in the bottom-right is the exact opposite. Let’s review all four types:

  • White Knight (silently on track): A project that is on track, tackles every upcoming challenge on its own, reports its status but omits the details and turns out to be a success is the dream of every project manager. You can let the team find its own way, document their progression and work on the long-term goals for the team and the product. It’s like sailing in quiet waters on a sunny day. Nothing to worry about and a pleasant experience all around.
  • Drama Queen (loud, but on track): This project is ultimately headed towards success, but every obstacle along the way results in emergency meetings, telephone conferences or e-mail exchanges. The number of challenges alone indicate that the team isn’t up to the task. You are tempted to micro-manage the project, to intervene to solve the problems and ensure success or at least progress. But you are bound to recognize some or even most problems as non-existent. The key sentence to say or think is: “Strange, nobody else ever had this problem and we’ve done it a dozen times before”. If you are a manager for several projects, the Drama Queens in your portfolio will require the majority of your time and attention. You’ll be glad when the project is over and “peaceful” times lie ahead.
  • Backstabber (silent and a failure): This is the biggest fear of every manager. The project seems alright, the team doesn’t report any problems and everything looks good. But when the cards need to be put on the table, you end up with a weak combination. It’s too late to do anything about the situation, the project is a failure. And it failed because you as the manager didn’t dig deeper, because you let them fool you. No! If you look closer, it failed because nobody dug deeper and everybody was in denial. You’ll see the warning signs in retrospective. You will become more paranoid in your next project. You’ll lose faith in the project status reports of your teams. You’ll inquire more and micro-manage the communication. You’ll become a skittish manager because of this unpleasant experience. Backstabber projects have horrendous costs for the social structure of a company.
  • Burning Ox (loud and failing): The name stems from an ancient war tactics when the enemy’s camp was overrun by a horde of oxen with burning torches bound to the horns. The panicked animals wreaked havoc along their way and started fires left and right. A Burning Ox is helpless in the situation, but takes it out on anybody and anything near it, too. This project is bound to fail, the team is in it way over their heads and no amount of support from your side or help from the outside can safe it. Well, experienced firefighters might work wonders, but they are expensive and rare (we know because we are often called in for this job). If you find a Burning Ox in your project portfolio (and you will know it, because a Burning Ox screams on the top of his lungs), prepare yourself for the inevitable: The project will fail, in scope (missing functionality), budget (higher costs) and/or time (delayed delivery). You better start with damage control now or make a call to a firefighter you can trust.

Easy assessment

This project management metric is not meant for deep inspection, but for easy assessment and quick communication. You can convey your desired communication style and the fact that everybody involved with the project is partly responsible for its success or failure. The metric states that too much detail is not helpful and too little detail can be disastrous. It also shows that loudly failing projects are not the fault of the project team alone (the ox cannot help being used as a living torch), but that the prerequisites of the project weren’t met.

Takeaway

If you are not a project manager, what can you learn from this blog post? Ask yourself if you require too much help from your manager, forcing him/her to switch into the micro-management gear, even if you could solve the problem yourself. If you cannot, ask yourself if you think that you can deliver the project in scope, time and budget or if you already smell the fire. If you can smell the fire, is your manager aware? Are you telling him/her in unclouded words about your perceived state of the project? Did you attempt to communicate your perception/feeling at least twice? If not, your manager might be shocked that he/she took care of a Backstabber project. A failing project is not your fault! You would only be to blame for the continued hiding of a known fact.

If you are a project manager, take a piece of paper, draw the metric’s chart and try to pin-point the position of all your projects. Be as honest and exact as possible. Is it really a Burning Ox or “just” a Drama Queen? Are your White Knights really above reproach or is their loyality questionable? What questions could you ask to try to unveil hidden problems, even those that nobody is aware of yet?

These quick, repeated assessments help me to manage my schedule and not forget about the silent projects because the loud projects always ellbow their way into my attention.

Recap of the Schneide Dev Brunch 2016-12-11

If you couldn’t attend the Schneide Dev Brunch at 11th of December 2016, here is a summary of the main topics.

brunch64-borderedLast week at sunday, we held another Schneide Dev Brunch, a regular brunch on the second sunday of every other (even) month, only that all attendees want to talk about software development and various other topics. This brunch was so well-attended that we had to cramp around our conference table and gather all chairs on the floor. As usual, the main theme was that if you bring a software-related topic along with your food, everyone has something to share. Because we were so many, we established a topic list and an agenda for the event. As usual, a lot of topics and chatter were exchanged. This recapitulation tries to highlight the main topics of the brunch, but cannot reiterate everything that was spoken. If you were there, you probably find this list inconclusive:

Finland

We started with a report of one of our attendees who had studied in Finland for the last two years. He visited the Aalto university and shared a lot of cultural details about Finland and the Finnish people with us.

The two most important aspects of the report were sauna and singing. The Finnish love to visit a sauna, in fact, nearly every building has a functioning sauna. Every office building has a company sauna that will get visited often. So it might happen that your first visit of a company starts right in the sauna, naked with the bosses.

And the Finnish love singing so much that they usually start singing during the sauna session. There are open social events organized around singing together.

Alcohol plays a big role in Finland, mostly because the taxes makes it incredibly expensive to obtain a proper buzz. In the southern regions, much alcohol is imported from Russia or Estonia by ferry. There are even special ferry routes designed to be cost-neutral when shopping for alcohol. But alcohol isn’t the only thing that is made expensive with special taxes. Sugar and sugary food/drinks are heavily taxed, too. So it’s actually more expensive to eat unhealthy, which sounds like a good concept to counter some civilizational diseases.

The Finnish students often wear a special boilersuit during official events that identifies their affilition with their field of study and university. They apply patches and stickers to their suit when they have completed certain tasks or chores. It’s actually a lot like a military uniform with rank and campaign insignia. Only that the Finnish student boilersuit may not be cleaned or washed other than jumping into a body of water with you in it. And the Finnish lakes are frozen most of the year, with temperatures of -27 °C being nothing extraordinary.

As you probably have guessed right now, costs for rent and electricity are high. Our attendee enjoyed his time there, but is also glad to have the singing separated from the alcohol for the most part.

Lambdas and Concurrency

The next question revolved around the correlation between lambda expressions and concurrent execution of source code. The Vert.x framework relies heavily on lambdas and provides reactive programming patterns for Java. As such, it is event driven and non blocking. That makes it hard to debug or to reason about the backstory if an effect occurs in production. The traditional tools like stacktraces don’t tell the story anymore.

We took a deep dive into the concepts behind Optionals, Promises and Futures (but forgot to talk about the Expected type in C++). There is a lot of foggy implementation details in the different programming languages around these concepts and it doesn’t help that the Java Optional tries to be more than the C++ Optional, but doesn’t muster up the courage to be a full Monad. Whether deprecating the get()-method will make things better is open for discussion.

To give a short answer to a long discussion: Lambdas facilitate concurrent programming, but don’t require or imply it.

React.js and Tests

It was only a small step from the reactive framework Vert.x to the React.js framework in Javascript. One attendee reported his experiences with using different types of tests with the React framework. He also described the origin of the framework, mentioning the concept of Flux and Redux along the way.

Sorry if I’m being vague, but each written sentence about Javascript frameworks seem to have a halflife time of about six weeks. My take on the Javascript world is to lean back, grab some popcorn and watch the carnival from the terrace, because while we’re stuck with it forever, it is tragically unfortunate. Even presumed simple things like writing a correct parser for JSON end in nightmares.

It should be noted, though, that the vue.js framework entered the “assess” stage of the Thoughtworks Techradar, while AngularJS (or just Angular, as it should be called now) is in the “hold” stage.

Code Analysis

We also talked about source code analysis tools and plugins for the IDE. The gist of it seems to be that the products of JetBrains (especially the IntelliJ IDEA IDE) have all the good things readily included, while there are standalone products or plugins for other IDEs.

Epilogue

As usual, the Dev Brunch contained a lot more chatter and talk than listed here. The number of attendees makes for an unique experience every time. We are looking forward to the next Dev Brunch at the Softwareschneiderei in February 2017. We even have some topics already on the agenda (like a report about first-hand experiences with the programming language Rust). And as always, we are open for guests and future regulars. Just drop us a notice and we’ll invite you over next time.

Let’s talk about C++

It’s almost time for the holidays again. A time to reminisce. A time for family. A time for community.

Us software developers seem like an odd folk. We spend endless hours tinkering with our machines and gadgets. It appears like a lonely profession to outsiders. And it can be. Sometimes we have to get in The Zone to solve our tasks and problems. Other times we need to have sword fights. But sometimes we just have to meet other developers.

I’m not talking about your 10 o’clock daily standup or agile flavor-of-the-month meeting with other departments. Those are great. But sometimes it just has to be us programmers, as tech people.

Let’s talk about cool and tricky algorithms. Let’s talk about the latest and greatest language features that make all code some much cooler. Let’s talk which editor is the greatest. All the technical details.
It’s not necessarily the most important and essential aspect of our craft, no. But it’s kind of like the seasoning to a well cooked meal. It’s flavor and character. It’s fun.

I’m the C++ guy. It’s not the only one of my specialties, but kind of what I got a bit of a reputation for. And I like to talk about it. So far, this was either limited to colleagues and friends or “out there” on IRC, stackoverflow or other online communities. But I want to extend that and be a more active member of the local community.
David Farago had the great idea to create a platform for this in Karlsruhe: The C++ User Group Karlsruhe. He asked me to kindly extend an invitation. The kick-off is next month, right at the start of the new year, on the 11th of January, with one meeting scheduled every month. I think this is a perfect time to do this. C++ is in a great place right now. The language is evolving in a very positive way and the ecosystem is looking better and better.
So if you’re in any way interested meeting other local C++ people, please join us. I’m very much looking forward to meeting you guys!

Displaying numbers in tables

Many software applications have to display series of numbers, for example statistical information, measurement values or financial data. Of course there are many ways to visualize values graphically with charts, but sometimes the user wants to see the actual values as numbers. The typical layout method to display numbers are tables.

Here are some guidelines you should follow when you have to display numbers in a table.

Integer numbers

Right aligned integer numbers
Right-aligned integer numbers

Integer numbers that are shown in a table column should be right-aligned, because the orders of magnitude of a number’s digits increase from right to left. Additionally you should choose a font with fixed-width digits for numbers. This ensures that digits with the same orders of magnitude line up. Thus the numbers can be compared more easily. The font itself doesn’t have to be a fixed-width font in general. Some proportional fonts with variable widths for letters have fixed-widths for digits, called tabular figures.

Non-integer numbers

Aligned with decimal points
Aligned with decimal points

Non-integer numbers with decimal points should be aligned with their decimal points. The reason is the same as above: digits with the same orders of magnitude should line up. This can be a bit more effort to implement in your application than mere right-alignment, because components such as UI widgets or HTML tables usually don’t directly support this form of alignment.

However, you can implement it by using a font with tabular figures and then right-pad the numbers with spaces. Each of these spaces must have the same width as a digit, of course. This is the case with a fixed-width font, but there is also a special Unicode character for this purpose that can be used with proportional fonts and tabular figures: it’s called figure space and has the Unicode code point U+2007.

Modern developer Issue 4: My SQL toolbox

Thinking in SQL

SQL is such a basic and useful language but the underlying thinking is non-intuitive when you come from imperative languages like Java, Ruby and similar.
SQL is centered around sets and operations on them. The straight forward solution might not be the best one.

Limit

Let’s say we need the maximum value in a certain set. Easy:

select max(value) from table

But what if we need the row with the maximum value? Just adding the other columns won’t work since aggregations only work with other aggregations and group bys. Joining with the same table may be straight forward but better is to not do any joins:

select * from (select * from table order by value desc) where rownum<=1

Group by and having

Even duplicate values can be found without joining:

select value from table group by value having count(*) > 1

Grouping is a powerful operation in SQL land:

select max(value), TO_CHAR(time, 'YYYY-MM') from table group by TO_CHAR(time, 'YYYY-MM')

Finding us the maximum value in each month.

Mapping with outer joins

SQL is also good for calculations. Say we have one table with values and one with a mapping like a precalculated log table. Joining both gets the log of each of your values:

select t.value, log.y from table t left outer join log_table log on t.value=log.x

Simple calculations

We can even use a linear interpolation between two values. Say we have only the function values stored for integers but we values between them and these values between them can be interpolated linearly.

select t.value, (t.value-floor(t.value))*f.y + (ceil(t.value)-t.value)*g.y from table t left outer join function_table f on floor(t.value)=f.x left outer join function_table g on ceil(t.value)=g.x

When you need to calculate for large sets of values and insert them into another table it might be better to calculate in SQL and insert in one step without all the conversion and wrapping stuff present in programming languages.

Conditions

Another often overlooked feature is to use a condition:

select case when MOD(t.value, 2) = 0 then 'divisible by 2' else 'not divisible by 2' end from table t

These handful operations are my basic toolbox when working with SQL, almost all queries I need can be formulated with them.

Dates and timestamps

One last reminder: when you work with time always specify the wanted time zone in your query.

It’s only Cores and Caches but I like it

The programming game changed dramatically in the past ten years. We are playing CPU cores and caches now, but without proper visibility.

759px-amd_am5x86_dieMost of our software development economy is based on a simple promise: The computing power (or “performance”) of a common computer will double every two years. This promise accompanied us for 40 years now, a time during which our computers got monitors, acquired harddisks and provided RAM beyond the 640 kB that was enough for nobody. In the more recent years, we don’t operate systems with one CPU, but four, eight or even twelve of them. So it came as a great irritation when ten years ago, Herb Sutter predicted that “The free lunch is over” and even Gordon Moore, the originator of Moore’s Law that forms the basis of our simple promise said that it will only hold true for ten to fiveteen more years. Or, in other words, until today.

Irritation

That’s a bit unsettling, to say the least, and should be motivation enough to have a good look at everything we are doing. Intel, the biggest manufacturer of CPUs for computers, has indicated earlier this year that Moore’s Law cannot be fulfilled any longer. So, the free lunch is really over. And it turns out to have some hidden costs. One cost is a certain complacency, the conviction that things will continue to be as they were and that coding styles chiseled over years and decades hold an inherent value of experience.

Complacency

Don’t get me wrong – there is great value in experience, but not all knowledge of the past is helpful for the future. Sometimes, fundamental things change. Just as the tables will eventually turn for every optimization trick, we need to reevaluate some axioms of our stance towards performance. Let me reiterate some common knowledge:

There are two types of performance inherently baked into your source code: Theoretical and practical performance.

Performance

The first type is theoretical performance, measured in O(n), O(n²) or even O(n!) and mostly influenced by the complexity class of the algorithm you are using. It will translate into runtime behaviour (like in the case of O(n!) your software is already dead, you just don’t know yet), but isn’t concerned with the details of your implementation. Not using an unnecessary high complexity class for a given problem will continue to be a valueable skill that every developer should master.

On the other hand, practical performance is measured in milliseconds (or nanoseconds if you are into micro-benchmarks and can pull off to measure them correctly) and can heavily depend on just a few lines in your source code. Practical performance is the observable runtime behaviour of your software on a given hardware. There are two subtypes of practical performance:

  • Throughput (How many operations are computed by the system in a given unit of time?)
  • Latency (How long does it take one operation to be computed by the system?)

If you run a service, throughput is your main metric for performance. If you use a service, latency is your main concern. Let me explain this by the metapher of a breakfast egg. If you want to eat your breakfast at a hotel buffet and the eggs are empty, your main concern is how fast you will get your freshly boiled egg (latency). But if you run the hotel kitchen, you probably want to cook a lot of eggs at once (throughput), even if that means that one particular egg might boil slightly longer as if you’d boiled each of them individually.

Latency

Those two subtypes are not entirely independent from each other. But the main concern for most performance based work done by developers is latency. It is relatively easy to measure and to reason about. If you work with latency-based performance issues, you should know about the latency numbers every programmer should know, either in visual form or translated to a more human time scale. Lets iterate some of the numbers and their scaled counterpart here:

  • 1 CPU cycle (0.3 ns): 1 second
  • Level 1 cache access (0.9 ns): 3 seconds
  • Branch mispredict (2.5 ns): 8 seconds
  • Level 2 cache access (2.8 ns): 9 seconds
  • Level 3 cache access (12.9 ns): 43 seconds
  • Main memory access (120 ns): 6 minutes
  • Solid-state disk I/O (50-150 μs): 2-6 days
  • Rotational disk I/O (1-10 ms): 1-12 months

We can discuss any number in detail, but the overall message stands out nonetheless: CPUs are lightning fast and caches are the only system components that can somewhat keep up. As soon as your program hits the RAM, your peak performance is lost. This brings us to the main concept of latency optimization:

Your program’s latency is ultimately decided by your ability to decrease cache misses.

You can save CPU cycles by performing clever hacks, but if you are able to always read your data (and code) from the cache, you’ll be 360 times faster than if your program constantly has to read from RAM. Your source code doesn’t have to change at all for this to happen. A good compiler and/or optimizing runtime can work wonders if you adhere to your programming language’s memory model. In reality, you probably have to rearrange your instructions and align your data structures. That’s the performance optimization of today, not the old cycle stinting. The big challenge is that none of these aspects are visible on the source code level of your program. We have to develop our programs kind of blindfolded currently.

Concurrency

One way how we’ve held up Moore’s Law in the last ten years was the introduction of multiprocessor computing into normal computers. If you cram two CPUs onto the die, the number of transistors on it has doubled. A single-threaded program doesn’t run any faster, though. So we need to look at concurrent programming to unlock the full power of our systems. Basically, there are two types of concurrent programming, deliberate and mechanical.

  • Deliberate concurrent programming means that you as the developer actively introduce threads, fibers or similar concepts into your source code to control parallel computation.
  • Mechanical concurrent programming means that your source code can be parallelized by the compiler and/or runtime and/or the hardware (e.g. hyper-threading) without changing the correctness of your program.

In both types of concurrent programming, you need to be aware about the constraints and limitations of correct concurrency. It doesn’t matter if your program is blazingly fast and utilizes all cores if the result is wrong or only occasionally correct. Once again, the memory model of your programming language is a useful set of rules and abstractions to guide you. Most higher-level concurrency models like actors narrow your possibilities even further, with functional programming being one of the strictest (and most powerful ones).

In the field of software development, we are theoretically well-prepared to take on the task of pervasive concurrent programming. But we need to forget about the good old times of single-core confirmability and embrace the chaotic world of raw computing power, the world of cores and caches.

Cores ‘n’ Caches

This is our live now: We rely on Cores ‘n’ Caches to feed us performance, but the lunch isn’t free anymore. We have to adapt and adjust, to rethink our core axioms and let go of those parts of our experience that are now hindrances. Just like Rock ‘n’ Roll changed the rules in the music business, our new motto changes ours.

Let’s rock.

Why I’m not using C++ unnamed namespaces anymore

Well okay, actually I’m still using them, but I thought the absolute would make for a better headline. But I do not use them nearly as much as I used to. Almost exactly a year ago, I even described them as an integral part of my unit design. Nowadays, most units I write do not have an unnamed namespace at all.

What’s so great about unnamed namespaces?

Back when I still used them, my code would usually evolve gradually through a few different “stages of visibility”. The first of these stages was the unnamed-namespace. Later stages would either be a free-function or a private/public member-function.

Lets say I identify a bit of code that I could reuse. I refactor it into a separate function. Since that bit of code is only used in that compile unit, it makes sense to put this function into an unnamed namespace that is only visible in the implementation of that unit.

Okay great, now we have reusability within this one compile unit, and we didn’t even have to recompile any of the units clients. Also, we can just “Hack away” on this code. It’s very local and exists solely to provide for our implementation needs. We can cobble it together without worrying that anyone else might ever have to use it.

This all feels pretty great at first. You are writing smaller functions and classes after all.

Whole class hierarchies are defined this way. Invisible to all but yourself. Protected and sheltered from the ugly world of external clients.

What’s so bad about unnamed namespaces?

However, there are two sides to this coin. Over time, one of two things usually happens:

1. The code is never needed again outside of the unit. Forgotten by all but the compiler, it exists happily in its seclusion.
2. The code is needed elsewhere.

Guess which one happens more often. The code is needed elsewhere. After all, that is usually the reason we refactored it into a function in the first place. Its reusability. When this is the case, one of these scenarios usually happes:

1. People forgot about it, and solve the problem again.
2. People never learned about it, and solve the problem again.
3. People know about it, and copy-and-paste the code to solve their problem.
4. People know about it and make the function more widely available to call it directly.

Except for the last, that’s a pretty grim outlook. The first two cases are usually the result of the bad discoverability. If you haven’t worked with that code extensively, it is pretty certain that you do not even know that is exists.

The third is often a consequence of the fact that this function was not initially written for reuse. This can mean that it cannot be called from the outside because it cannot be accessed. But often, there’s some small dependency to the exact place where it’s defined. People came to this function because they want to solve another problem, not to figure out how to make this function visible to them. Call it lazyness or pragmatism, but they now have a case for just copying it. It happens and shouldn’t be incentivised.

A Bug? In my code?

Now imagine you don’t care much about such noble long term code quality concerns as code duplication. After all, deduplication just increases coupling, right?

But you do care about satisfied customers, possibly because your job depends on it. One of your customers provides you with a crash dump and the stacktrace clearly points to your hidden and protected function. Since you’re a good developer, you decide to reproduce the crash in a unit test.

Only that does not work. The function is not accessible to your test. You first need to refactor the code to actually make it testable. That’s a terrible situation to be in.

What to do instead.

There’s really only two choices. Either make it a public function of your unit immediatly, or move it to another unit.

For functional units, its usually not a problem to just make them public. At least as long as the function does not access any global data.

For class units, there is a decision to make, but it is simple. Will using preserve all class invariants? If so, you can move it or make it a public function. But if not, you absolutely should move it to another unit. Often, this actually helps with deciding for what to create a new class!

Note that private and protected functions suffer many of the same drawbacks as functions in unnamed-namespaces. Sometimes, either of these options is a valid shortcut. But if you can, please, avoid them.

Arbitrary 2D curves with Highcharts

Highcharts is a versatile JavaScript charting library for the web. The library supports all kinds of charts: scatter plots, line charts, area chart, bar charts, pie charts and more.

A data series of type line or spline connects the points in the order of the x-axis when rendered. It is possible to invert the axes by setting the property inverted: true in the chart options.

var chart = $('#chart').highcharts({
  chart: {
    inverted: true
  },
  series: [{
    name: 'Example',
    type: 'line',
    data: [
      {x: 10, y: 50},
      {x: 20, y: 56.5},
      {x: 30, y: 46.5},
      // ...
    ]
  }]
});

line-chart-inverted

Connecting points in an arbitrary order

Connecting the points in an arbitrary order, however, is not supported by default. I couldn’t find a Highcharts plugin which supports this either, so I implemented a solution by modifying the getGraphPath function of the series object:

var series = $('#chart').highcharts().series[0];
var oldGetGraphPath = series.getGraphPath;
Object.getPrototypeOf(series).getGraphPath = function(points, nullsAsZeroes, connectCliffs) {
  points = points || this.points;
  points.sort(function(a, b) {
    return a.sortIndex - b.sortIndex;
  });
  return oldGetGraphPath.call(this, points, nullsAsZeroes, connectCliffs);
};

The new function sorts the points by a new property called sortIndex before the line path of the chart gets rendered. This new property must be assigned to each point object of the series data:

series.setData([
  {x: 10, y: 50, sortIndex: 1},
  {x: 20, y: 56.5, sortIndex: 2},
  {x: 30, y: 46.5, sortIndex: 3},
  // ...
], true);

Now we can render charts with points connected in an arbitrary order like this:

A line chart with points connected in arbitrary order
A line chart with points connected in arbitrary order

Modern developer #3: Framework independent JavaScript architecture

Usually small JavaScript projects start with simple wiring of callbacks onto DOM elements. This works fine when it the project is in its initial state. But in a short time it gets out of hand. Now we have spaghetti wiring and callback hell. Often at this point we try to get help by looking at adopting a framework, hoping to that its coded best practices draw us out of the mud. But now our project is tied to the new framework.
In search of another, framework independent way I stumbled upon scalable architecture by Nicholas Zakas.
It starts by defining modules as independent units. This means:

  • separate JavaScript and DOM elements from the rest of the application
  • Modules must not reference other modules
  • Modules may not register callbacks or even reference DOM elements outside their DOM tree
  • To communicate with the outside world, modules can only call the sandbox

The sandbox is a central hub. We use a pub/sub system:

sandbox.publish({type: 'event_type', data: {}});

sandbox.subscribe('event_type', this.callback.bind(this));

Besides being an event bus, the sandbox is responsible for access control and provides the modules with a consistent interface.
Modules are started and stopped (in case of misbehaving) in the application core. You could also use the core as an anti corruption layer for third party libraries.
This architecture gives a frame for implementation. But implementing it raises other questions:

  • how do the modules update their state?
  • where do we call the backend?

Handling state

A global model would reside or be referenced by the application core. In addition every module has its own model. Updates are always done in application event handlers, not directly in the DOM event handlers.
Let me illustrate. Say we have a module with keeps track of selected entries:

function Module(sandbox) {
  this.sandbox = sandbox;
  this.selectedEntries = [];
}

Traditionally our DOM event handler would update our model:

button.on('click', function(e) {
  this.selectedEntries.push(entry);
});

A better way would be to publish an application event, subscribe the module to this event and handle it in the application event handler:

this.sandbox.subscribe('entry_selected', this.entrySelected.bind(this));

Module.prototype.entrySelected = function(event) {
  this.selectedEntries.push(event.entry);
};

button.on('click', function(e) {
  this.sandbox.publish({type: 'entry_selected', entry: entry});
});

Other modules can now listen on selecting entries. The module itself does not need to know who selected the entry. All the internal communication of selection is visible. This makes it possible to use event sourcing.

Calling the backend

No module should directly call the backend. For this a special module called extension is used. Extensions encapsulate cross cutting concerns and shield communication with other systems.

Summary

This architecture keeps UI parts together with their corresponding code, flattens callbacks and centralizes the communication with the help of application events and encapsulates outside communication. On top of that it is simple and small.

Children behind the wheel

What happens when you put children behind the steering wheel? This blog entry looks at the Dunning-Kruger effect in software development.

A few weeks ago, I read a funny news article about a 11 years old boy who stole a bus and drove the normal route with it. When the police stopped him, he had already picked up some passengers and somehow managed to only inflict minor damages along the way. The whole story (in german) can be read here.

Author: Vitaly Volkov, Волков Виталий Сергеевич (user kneiphof) https://commons.wikimedia.org/wiki/User:KneiphofThis blog entry is not about the unheeding passengers, it’s about the little boy and his mindset. This mindset exists in the business world, too. It’s the mixture of “what could possibly go wrong?” with “I’m totally able to pull this off” and a large dose of “everybody else is surely faking it, too”. In the context of the Dunning-Kruger effect, this mixture is called “unskilled and unaware of it”. It’s a dangerous situation for both the employee and the employer, because neither of them can properly evaluate the actual risk.

The Dunning-Kruger effect

Let’s start with the known theory. The Dunning-Kruger effect describes a cognitive bias in which unskilled individuals assess their ability (in the context of a given skill) much higher than it really is and highly skilled individuals tend to underestimate their (relative) competence. The problem is not that real experts tend to be modest about their expertise. The real problem is that “if you’re incompetent, you can’t know you’re incompetent. The skills you need to produce a right answer are exactly the skills you need to recognize what a right answer is.”

So in short: Being unskilled in a certain area probably means you don’t really know that you are unskilled.

Or, translated to our young bus driver: If you don’t know anything about driving a bus, you certainly think you are as much a decent bus driver as the man behind the wheel. It looks easy enough from the passenger seat.

The effect in practice

Why should this bother us in software development? Our education system ensures that we are exposed to enough development practice so that we can counter the “unaware of it” part of the Dunning-Kruger effect. But it isn’t effective enough, at least that’s what I see from time to time.

Every once in a while, I have the opportunity to evaluate an existing code base. Most of the time, it produces a working, profitable application, so it cannot be said to be a failure. Sometimes however, the code base itself is so convoluted, bloated and riddled with poor implementation choices that it absolutely cannot be developed any further without a high risk of regression bugs and/or absurd amounts of developer time for little changes.

These hopeless source codes have one thing in common: they are developed by one person and one person alone. This person has developed for months or even years, showing progress and reporting no problems and suddenly resigned, often shortly before a major milestone in the project like going live with the first version or announcing the next version. The code base now lies abandonded and needs to be adopted. And while no code base is perfect (or should even try to be), this one reeks of desperation and frustration. Often, the application itself is not very demanding, but the source code makes it appear to be.

A good example of this kind of project is my scrap metal tale (in three parts) from five years ago:

A more recent case is littered with inline comments that celebrate small victories of the developer:

  • 5 lines of convoluted, contradicatory statements
  • 3 lines commented out
  • 1 comment line stating “YES! This is finally working! Super!”
  • Still three obvious bugs, resource leaks or security flaws in these few lines alone

The most outrageous (and notorious) case might be the Brillant Paula Bean from the Daily WTF, but this code base is at least readily comprehensible.

The origins of the effect

I think a lot of the frustration, desperation, anxiety and outright fear that I can sense through the comments and code structure was really felt by the original author. It must have been incredibly hard to stay on course, work hard and come up with solutions in the face of imminent deadlines, ever-changing requirements and the lingering fear that you’re just not up to the task. Except that we’ve just learned that developers in the “unskilled and unaware of it” state won’t feel the fear. That’s the origin of all the bad code: The absolute conviction that “it’s not me, it’s the problem, the domain, the language, the compiler, the weather and everything else”. Programming is just hard. Nobody else could do this better. It’ll work in the end. Those “minor problems” (like never actually speaking to the hardware, hard-coded paths and addresses, etc.) will be fixed at the last moment. There’s nothing wrong with an occassional exception stack trace in the logfile and if it bothers you, I can always make the catch-block empty.

The most obvious problem is that these developers think that this is how everybody else develops software, too. That we all don’t bother with concurrency correctness, resource lifecycle management, data structures, graphical user interface design, fault tolerance or even just basic logging. That all programming is hard and frustrating. That finding out where to insert a sleep statement to quench that pesky exception is the pinnacle of developer ingenuity. That things like automated tests, code metrics, continuous integration or even version control are eccentric fads that will pass by and be forgotten soon, so no need to deal with it. That we all just fake it and dread the moment our software is used in the wild.

A possible remedy

The “unskilled and unaware of it” developer isn’t dumb or hopeless, he’s just unskilled. The real tragedy is that he doesn’t have a mentor that can alleviate the biggest problems in the code and show better approaches. The unskilled lonesome developer cannot train himself. A good explanation for this novice “lock-in” can be found in the Dreyfus Model of Skill Acquisition (I recommend you watch the excellent talk on this topic by Dave Thomas): Novices lack the skills for proper self-assessment and cannot learn from their mistakes (as stated by the Dunning-Kruger effect, too). They also cannot recognize them as “their” mistakes or even “mistakes”. They need outside feedback (and guidance) to advance themselves. A mentor’s role is to give exactly that.

Conclusion

If you find yourself in a position of being a “skilled and fairly aware of it” developer, please be aware that you’ve probably been mentored sometimes in the past. Pass it on! Be the mentor for an aspiring junior developer.

If you suspect that you might fall in the “unskilled” category of developers, don’t despair! Being aware of this is the first and most important step. Now you can act strategically to improve your skill. There is a whole book giving you invaluable advice: Apprenticeship Patterns: Guidance for the Aspiring Software Craftsman by Dave Hoover and Adewale Oshineye. Two prominent advices from the book are “Be the Worst (of your team)” and “Find Mentors“. And my most prominent advice? “Don’t stick it out alone“.

Programming is (or should be) fun after all.