For what the javascript!

The setting

We are developing and maintaining an important web application for one of our clients. Our application scrapes a web page and embeds our own content into that page frame.

One day our client told us of an additional block of elements at the bottom of each page. The block had a heading “Image Credits” and a broken image link strangely labeled “inArray”. We did not change anything on our side and the new blocks were not part of the HTML code of the pages.

Ok, so some new Javascript code must be the source of these strange elements on our pages.

The investigation

I started the investigation using the development tools of the browser (using F12). A search for the string “Image Credits” instantly brought me to the right place: A Javascript function called on document.ready(). The code was basically getting all images with a copyright attribute and put the findings in an array with the text as the key and the image url as the value. Then it would iterate over the array and add the copyright information at the bottom of each page.

But wait! Our array was empty and we had no images with copyright attributes. Still the block would be put out. I verified all this using the debugger in the browser and was a bit puzzled at first, especially by the strange name “inArray” that sounded more like code than some copyright information.

The cause

Then I looked at the iteration and it struck me like lightning: The code used for (name in copyrightArray) to iterate over the elements. Sounds correct, but it is not! Now we have to elaborate a bit, especially for all you folks without a special degree in Javascript coding:

In Javascript there is no distinct notion of associative arrays but you can access all enumerable properties of an object using square brackets (taken from Mozillas Javascript docs):

var string1 = "";
var object1 = {a: 1, b: 2, c: 3};

for (var property1 in object1) {
  string1 = string1 + object1[property1];
}

console.log(string1);
// expected output: "123"

In the case of an array the indices are “just enumerable properties with integer names and are otherwise identical to general object properties“.

So in our case we had an array object with a length of 0 and a property called inArray. Where did that come from? Digging further revealed that one of our third-party libraries added a function to the array prototype like so:

Array.prototype.inArray = function (value) {
  var i;
  for (i = 0; i < this.length; i++) {
    if (this[i] === value) {
      return true;
    }
  }
  return false;
};

The solution

Usually you would iterate over an array using the integer index (old school) or better using the more modern and readable for…of (which also works on other iterable types). In this case that does not work because we do not use integer indices but string properties. So you have to use Object.keys().forEach() or check with hasOwnProperty() if in your for…in loop if the property is inherited or not to avoid getting unwanted properties of prototypes.

The takeaways

Iteration in Javascript is hard! See this lengthy discussion…The different constructs are named quite similar an all have subtle differences in behaviour. In addition, libraries can mess with the objects you think you know. So finally some advice from me:

  • Arrays are only true arrays with positive integer indices/property names!
  • Do not mess with the prototypes of well known objects, like our third-party library did…
  • Use for…of to iterate over true arrays or other iterables
  • Do not use associative arrays if other options are available. If you do, make sure to check if the properties are own properties and enumerable.

 

Transforming C-Style arrays in java

Every now and then some customer asks us to fix or improve some important legacy application other people have written. Usually, such projects are fun and it is rewarding to see the improvements both in code and value for the users.

In one of these projects there is a Java GUI application that uses C-style arrays for some of its central data structures:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;
}

The array-length is a constant upper bound and does not denote the actual elements in the array. Elements are added dynamically to the array and it looks like a typical job for a automatically growing Collection like java.util.ArrayList. Most operations simply iterate over all elements and perform some calculations. But changing such a central part in a performance sensitive application is not only a lot of work but also risky.

We decided to take an incremental approach to improve code readability and maintainability and measured performance with a large, representative dataset between refactorings. There are two easy alternative APIs that improve working with the above data structure.

Imperative API

Smooth migration from the existing imperative “ask”-code (see “Tell, don’t ask”-principle) can be realized by providing an java.util.Iterable to the underlying array.


public int countRedBricks() {
  int redBrickCount = 0;
  for (int i = 0; i < box.brickCount; i++) {
    if (box.bricks[i].isRed()) {
      redBrickCount++;
    }
  }
  return redBrickCount;
}

Code like above is easily transformed to much clearer code like below:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public Iterable<LegoBrick> allBricks() {
    return Arrays.stream(tr, 0, brickCount).collect(Collectors.toList());
  }
}

public int countRedBricks() {
  int redBrickCount = 0;
  for (LegoBrick brick : box.bricks) {
    if (brick.isRed()) {
      redBrickCount++;
    }
  }
  return redBrickCount;
}

Functional API

A nice alternative to the imperative solution above is a functional interface to the array. In Java 8 and newer we can provide that easily and encapsulate the iteration over our array:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public <R> R forAllBricks(Function<Brick, R> operation, R identity, BinaryOperator<R> reducer) {
    return Arrays.stream(bricks, 0, brickCount).map(operation).reduce(identity, reducer);
  }

  public void forAllBricks(Consumer<LegoBrick> operation) {
    Arrays.stream(bricks, 0, brickCount).forEach(operation);
  }
}

public int countRedBricks() {
  return box.forAllBricks(brick -> brick.isRed() ? 1 : 0, 0, (sum, current) -> sum + current);
}

The functional methods can be tailored to your specific needs, of course. I just provided two examples for possible functional interfaces and their implementation.

The function + reducer case is a very general interface and used here for an implementation of our “count the red bricks” use case. Alternatively you could implement this use case with a more specific but easier to use filter + count interface:

public class LegoBox  {
  public LegoBrick[] bricks = new LegoBrick[8000];
  public int brickCount = 0;

  public long countBricks(Predicate<Brick> filter) {
    return Arrays.stream(bricks, 0, brickCount).filter(operation).count();
  }
}

public int countRedBricks() {
  return box.countBricks(brick -> brick.isRed());
}

The consumer case is very simple and found a lot in this specific project because mutation of the array elements is a typical operation and all over the place.

The functional API avoids duplicating the iteration all the time and removes the need to access the array or iterable/collection. It is therefore much more in the spirit of “tell”.

Conclusion

The new interfaces allow for much simpler and maintainable client code and remove a lot of duplicated iterations on the client side. They can be introduced on the way when implementing requested features for the customer.

That way we invested only minimal effort in cleaner, better maintainable and more error-proof code. When someday all accesses to the public array are encapsulated we can use the new found freedom to internalize the array and change it to a better fitting data structure like an ArrayList.

Lessons learned developing hybrid web apps (using Apache Cordova)

In the past year we started exploring a new (at leat for us) terrain: hybrid web apps. We already developed mobile web apps and native apps but this year we took a first step into the combination of both worlds. Here are some lessons learned so far.

Just develop a web app

after all the hybrid app is a (mobile) web app at its core, encapsulating the native interactions helped us testing in a browser and iterating much faster. Also clean architecture supports to defer decisions of the environment to the last possible moment.

Chrome remote debugging is a boon

The tools provided by Chrome for remote debugging on Android web views and browser are really great. You can even see and control the remote UI. The app has some redraw problems when the debugger is connected but overall it works great.

Versioning is really important

Developing web apps the user always has the latest version. But since our app can run offline and is installed as a normal Android app you have to have versions. These versions must be visible by the user, so he can tell you what version he runs.

Android app update fails silently

Sometimes updating our app only worked in parts. It seemed that the web view cached some files and didn’t update others. The problem: the updater told the user everything went smoothly. Need to investigate that further…

Cordova plugins helped to speed up

Talking to bluetooth devices? checked. Saving lots of data in a local sqlite? Plugins got you covered. Writing and reading local files? No problemo. There are some great plugins out there covering your needs without going native for yourself.

JavaScript isn’t as bad as you think

Working with JavaScript needs some discipline. But using a clean architecture approach and using our beloved event bus to flatten and exposing all handlers and callbacks makes it a breeze to work with UIs and logic.

SVG is great

Our apps uses a complex visualization which can be edited, changed, moved and zoomed by the user. SVG really helps here and works great with CSS and JavaScript.

Use log files

When your app runs on a mobile device without a connection (to the internet) you need to get information from the device to you. Just a console won’t cut it. You need log files to record the actions and errors the user provokes.

Accessibility is harder than you think

Modern design trends sometimes make it hard to get a good accessibility. Common problems are low contrast, using only icons on buttons, indiscernible touch targets, color as information bearer and touch targets that are too small.

These are just the first lessons we learned tackling hybrid development but we are sure there are more to come.

4 Tips for better CMake

We are doing one of those list posts again! This time, I will share some tips and insights on better CMake. Number four will surprise you! Let’s hop right in:

Tip #1

model dependencies with target_link_libraries

I have written about this before, and this is still my number one tip on CMake. In short: Do not use the old functions that force properties down the file hierarchy such as include_directories. Instead set properties on the targets via target_link_libraries and its siblings target_compile_definitions, target_include_directories and target_compile_options and “inherit” those properties via target_link_libraries from different modules.

Tip #2

always use find_package with REQUIRED

Sure, having optional dependencies is nice, but skipping on REQUIRED is not the way you want to do it. In the worst case, some of your features will just not work if those packages are not found, with no explanation whatsoever. Instead, use explicit feature-toggles (e.g. using option()) that either skip the find_package call or use it with REQUIRED, so the user will know that another lib is needed for this feature.

Tip #3

follow the physical project structure

You want your build setup to be as straight forward as possible. One way to simplify it is to follow the file system and and the artifact structure of your code. That way, you only have one structure to maintain. Use one “top level” file that does your global configuration, e.g. find_package calls and CPack configuration, and then only defers to subdirectories via add_subdirectory. Only for direct subdirectories though: if you need extra levels, those levels should have their own CMake files. Then build exactly one artifact (e.g. add_executable or add_library) per leaf folder.

Tip #4

make install() an option()

It is often desirable to include other libraries directly into your build process. For example, we usually do this with googletest for our unit test. However, if you do that and use your install target, it will also install the googletest headers. That is usually not what you want! Some libraries handle this automagically by only doing the install() calls when they are the top level project. Similar to the find_package tip above, I like to do this with an option() for explicit user control!

Generating done

That is it for today! I hope this is helps and we will all see better CMake code in the future.

About API astonishments

Nowadays we developers tend to stand on the shoulders of giants: We put powerful building-blocks from different libraries together to build something worth man-years in hours. Or we fill-in the missing pieces in a framework infrastructure to create a complete application in just a few days.

While it is great to have such tools in the form of application programmer interfaces (API) at your disposal it is hard to build high quality APIs. There are many examples for widely used APIs, good and bad. What does “bad API” mean? It depends on your view point:

Bad API for the API user

For the application programmer a bad API means things like:

  • Simple tasks/use cases are complicated
  • Complex tasks are impossible or require patching
  • Easy to misuse producing bugs

A very simple real life example of such an API is a C++ camera API I had to use in a project. Our users were able to change the area of interest (AOI) of the picture to produce images consisting of only a part of full resolution images. Our application did crash or not work as expected without obvious reasons. It took many hours of debugging to spot the subtle API misuse that could be verified be reading the documentation:

The value of camera.Width.GetMax() changed instead of being constant! The reason is that AOI was meant and not the sensor resolution width. The full resolution width we actually wanted is obtained by calling camera.WidthMax.GetValue(). This kind of naming makes the properties almost undistinguishable and communicates nothing of the implications. Terms like AOI or sensor width or full resolution just do not appear in this part of the API.

Small things like the example above may really hurt productivity and user experience of an API.

Bad API for the API programmer

API programmers can easily produce APIs that are bad for themselves because they take away too much freedom away resulting in:

  • Frequent breaking changes
  • API rewrites
  • Unimplementable features
  • Confusing, not fitting interfaces

Design your interfaces small and focused. Use types in the interface that leave as much freedom as possible without hurting usability (see Iterable vs. Collection vs. List vs. ArrayList for example). Try to build composable and extendable types because adding types or methods is less of a problem than changing them.

Conclusion

Developers should put extra care in interfaces they want to publish for others to use. Once the API is out there breaking it means angry users. Be aware that good API design is hard and necessary for a painless evolution of an API. Consider reading books like “Practical API Design” or “Build APIs You Won’t Hate” if you want to target a wider audience.

A good name will shine forever

Naming things is supposedly one of the two hard things in Computer Science. Here are some tips on naming for programmers.

Getters

In the Java world property accessors are traditionally prefixed with “get” and “set”, the Java bean convention:

person.getFirstName()

Code becomes more pleasant to read if you omit the “get” prefix:

person.firstName()

Of course, you can do this only if you don’t use a framework that depends on the convention to recognize properties via reflection (like some OR mappers, for example).

What about setters? I rarely write setters anymore. If you design your classes as immutable types you don’t need setters. Even if your class has mutable state you probably want to control this state via methods more specific to the domain of the problem. Also, the more you apply the tell, don’t ask principle the less you will find the need for getters.

Brevity vs. verbosity

There were times when it was common to see mass variable declarations like the following at the beginning of a function:

int i, j, k, l, m, n;
float a, b, c, u, v, x, y, z;

Fortunately, times have changed for the better, and most programmers are aware that descriptive naming is important. However, some programmers do over-compensate. Length of an identifier is not a virtue by itself.

The Objective-C Cocoa framework is famous for overly long method names:

[array objectAtIndex:index]

Parts of Objective-C were inspired by Smalltalk. But in Smalltalk the same method is called at:

[array at:index]

This is a reasonably sufficient name for such a common functionality in programming.

Here’s another example: If the concept of a measurement station is very prevalent in the domain of your project then it’s ok to call instances just station instead of measurementStation if it’s the only kind of station in the domain.

Yes, the IDE does auto-complete long names. However, readability of the code decreases if the reader has to scan the same long-winded names over and over again:

MeasurementStation measurementStation = new MeasurementStation();
Measurement measurement = measurementStation.startMeasurement();

Often you can find names that are more to the point than longer descriptions, e.g. acquire instead of takeOwnershipOf. (source)

Hungarian notation and friends

The famous Hungarian notation is no longer in widespread use. However, there are variations of it that I would recommend against as well for the sake of readability. For example, bookList or bookArray can be simply books. Another variation would be conventions like myField or m_field for member variables. If you need these notations to determine the origin of a variable, then your scopes are probably too big, i.e. your methods, functions or classes are too long. Additionally, IDEs and editors for programmers can highlight these different scopes anyway. Other examples for unnecessary Hungarian-style notation are IFoo for interfaces, EFoo for enums or the infamous FooImpl.

Screaming constants

There is really no need for constants and enum values to constantly SCREAM at you and other readers. This SCREAMING_CASE convention has its origin in C, where constants used to be defined as macros when the const keyword wasn’t introduced yet, and it later found its way into other programming language ecosystems. Names for constants and enum values are not more important than other identifiers and don’t have to be spelled differently. Try it, you will enjoy the newfound silence in your code.

Conclusion

These are some tips to improve readability of code through better names. Some of these tips go against traditional conventions, so you should discuss them with your team before applying them. Consistency within an existing code base can definitely be more important. But if you have the freedom you should definitely give them a try.

Evolvability of Code: Uniform Access Principle

Most programmers like freedom. So there are many means of hiding implementations in modern programming languages, e.g. interfaces in Java, header files in C/C++ and visibility modifiers like private and protected in most object-oriented languages. Even your ordinary functions or public class interface gives you the freedom to change the implementation without needing to touch the clients. Evolvability in this sense means you can change and refine your implementations without requiring others, namely clients of your code, to change.

Changing the class interface or function signatures within a project is often possible and feasible, at least if you have access to all client code and use powerful refactoring tools. If you published your code as a library or do not want to break all client code or forcing them to adapt to your changes you have to consider your interface code to be fixed. This takes away some of your precious freedom. So you have to design your interfaces carefully with evolability in mind.

Some programming languages implement the uniform access principle (UAP) that eases evolvability in that it allows you to migrate from public attributes to properties/method calls without changing the clients: Read and write access to the attribute uses the same syntax as invoking corresponding methods. For clarification an example in Python where you may start with a class like:

class Person(object):
  def __init__(self, name, age):
    self.name = name
    self.age = age

Using the above class is trivial as follows

>>> pete = Person("pete", 32)
>>> print pete.age
32
# a year has passed
>>> pete.age = 33
>>> print pete.age
33

Now if the age is not a plain value anymore but needs checking, like always being greater zero or is calculated based on some calendar you can turn it to a property like so:

class Person(object):
  def __init__(self, name, age):
    self.name = name
    self._age = age

  @property
  def age(self):
    return self._age

  @age.setter
  def age(self, new_age):
    if new_age < 0:
      raise ValueError("Age under 0 is not possible")
    self._age = new_age

Now the nice thing is: The above client code still works without changes!

Scala uses a similar and quite concise mechanism for implementing the UAP wheres .NET provides some special syntax for properties but still migration from public fields easily possible.

So in languages supporting the UAP you can start really simple with public attributes holding the plain value without worrying about some potential future. If you later need more sophisticated stuff like caching, computation of the value, validation or even remote retrieval you can add it using language features without touching or bothering clients.

Unfortunately some powerful and widespread languages like Java and C++ lack support for UAP. Changing a public field to a more complex property means the introduction of getter and setter methods and changing all clients. Therefore you see, especially in Java, many data classes littered with trivial getter and setter pairs doing nothing interesting and introducing unnecessary bloat to maintain the evolvability of the code.