Four-way Navigation in UIs

Just yesterday, I was working on the task of enabling gamepad navigation of a graphical UI. I had implemented this before in my game abstractanks but since forgotten how exactly I did it. So I opened the old code and tried to decipher it, and I figured that’d make a nice topic to write about.

Basic implementation

Let’s break down the simple version of the problem: You have a bunch of rectangular controls, and given a specific one, figure out the next one with an input of either left, up, right or down.

This sketch shows a control setup with a possible solution. It also contains an interesting situation: going ‘down’ from B box goes to C, but going up from there goes to A!

The key to creating this solution is a metric that weights the gap for a specific input direction, e.g. neighbor_metric(
box<> const& from, box<> const& to, navigation_direction direction)
. To implement this, we need to convert this gap into numbers we can use. I’ve used a variant of Arvo’s algorithm for that: For both axes, get the difference of the rectangles’ intervals along that axis and store those in a 2d-vector. In code:

template <int axis> inline float difference_on_axis(
box<> const& from, box<> const& to)
{
if (to.min[axis] > from.max[axis])
return to.min[axis] - from.max[axis];
else if (to.max[axis] < from.min[axis])
return to.max[axis] - from.min[axis];
return 0.f;
}

v2<> arvo_vector(box<> const& from, box<> const& to)
{
return {
difference_on_axis<0>(from, to),
difference_on_axis<1>(from, to) };
}

That sketch shows the resulting vectors from the box in the top-left going to two other boxes. Note that these vectors are quite different from the difference of the boxes’ centers. In the case of the two top boxes, the vector connecting the centers would tilt down slightly, while this one is completely parallel to the x axis.

Now armed with this vector, let’s look at the metric I was using. It results in a 2d ‘score’ that is later compared lexicographically to determine the best candidate: the first number determines the ‘angle’ with the selected axis, the other one the distance.

template <int axis> auto metric_on_axis(box<> const& from, box<> const& to)
{
auto delta = arvo_vector(from, to);
delta[0] *= delta[0];
delta[1] *= delta[1];
auto square_distance = delta[0] + delta[1];

float cosine_squared = delta[axis] / square_distance;
return std::make_pair(-cosine_squared, delta[axis]);
}

std::optional<std::pair<float, float>> neighbor_metric(
box<> const& from, box<> const& to, navigation_direction direction)
{
switch (direction)
{
default:
case navigation_direction::right:
{
if (from.max[0] >= to.max[0])
return {};
return metric_on_axis<0>(from, to);
}
case navigation_direction::left:
{
if (from.min[0] <= to.min[0])
return {};
return metric_on_axis<0>(from, to);
}
case navigation_direction::up:
{
if (from.max[1] >= to.max[1])
return {};
return metric_on_axis<1>(from, to);
}
case navigation_direction::down:
{
if (from.min[1] <= to.min[1])
return {};
return metric_on_axis<1>(from, to);
}
}
}

In practice this means that the algorithm will favor connections that best align with the input direction, while ties resolved by using the closest candidate. The metric ‘disqualifies’ candidates going backward, e.g. when going right, the next box cannot start left of the from box.

Now we just need to loop through all candidates and the select the one with the lowest metric.

This algorithm does not make any guarantees that all controls will be accessible, but that is a property that can easily be tested by traversing the graph induced by this metric, and the UI can be designed appropriately. It also does not try to be symmetric, e.g. going down then up does not always result in going back to the previous control. As we can see in the first sketch, this is not always desirable. I think it’s nice to be able to go from B to C via ‘down’, but I’d be weird to go ‘up’ back there instead of A. Instead, going ‘right’ to B does make sense.

Hard cases

But there can be ambiguities that this algorithm does not quite solve. Consider the case were C is wider, so that is is also under B:

The algorithm will connect both A and B down to C, but the metric will be tied for A and B going up from C. The metric could be extended to also include the ‘cross’ axis min-point of the box, e.g. favoring left over right for westerners like me. But going from B down to C and then up to A would feel weird. One idea to resolve this is to use the history to break ties, e.g. when coming from B to C, going back up would go back to C.

Another hard case is scroll-views. In fact, they seem to change the problem domain. Instead of treating the inputs as boxes in a flat plane, navigating in a scroll view requires to navigate to potentially only partially visible or even invisible boxes and bringing them into view. I’ve previously solved this by treating every scroll-view as its own separate plane and navigating only within that if possible. Only when no target is found within the scroll-view, did the algorithm try to navigate to items outside.

My Favorite Pattern

It has become somewhat of an internal meme that I do not like it when programmers use the word “wrapper”. When someone does say it, I usually get a cue from one of the others to start complaining about it. Do not get me wrong, though. I am very much in favor of wrapping things, but with purpose. And my favorite one is the façade.

When simple becomes complex

Many times, APIs start out simple and elegant. This usually works for a while and the API gets used a lot precisely because of its beauty and simplicity. But eventually, a new use case comes along that demands more of the API than it can currently serve. It has to be extended. This usually takes the form of an additional method or function parameter, or an additional function that needs to be called. Using the API now becomes more complex all its users.

Do not underestimate this effect. I have only anecdotal evidence, but in my experience, a lot of unnecessary software complexity can be attributed to this1. The Pareto-Principle applies here: A single use case causes all the users of the previously simple API to deal with new complexity (e.g. 10% of the use cases cause 90% of the complexity in the user-/call-sites).

Façades make it look beautiful

Luckily, it can be dealt with beautifully: using the façade pattern. This pattern abstracts a complex API behind a simple API. The trade-off, of course, is that it is less powerful than the “full API”. In our example though, all of the previous use-cases can keep using the simple API via a façade.

When to apply this

The aforementioned example, extending an API, is a very nice opportunity to apply the façade. Just keep the interface of the old API around, and re-implement it using the new, extended API, which is usually created by modifying the old API’s implementation. Now all the old call-sites can stay the same, yet you can have a more powerful API for those rare cases that need it.

Of course, you can also identify common usage patterns and refactor them using a façade, but that’s usually much harder to do.

What exactly are façades made of?

Façades do not hide the more complex API in the sense that the APIs users are not allowed to use it. Yes, façades make APIs look beautiful, but that is where the metaphor ends. You can still access what is behind the façade. You can even write more façades for the behind. Many APIs have multiple common cases and only very few complex ones.

So… Classes? Functions? Data? Any of those, in fact. Whenever you enable writing something in a simpler way for a common case, you have a façade . Very often, a small function with a simple signature is all the façade you need.

But it makes all the difference.

Now can someone please tell me what that little hook under the c is called?

  1. Façades can, of course, also contribute to creating complexity by growing the codebase and creating ‘variants’. But they rarely do. ↩︎

Exploring Tango Admin Devices

We are using the open-source control system framework TANGO in several projects where coordinated control of multiple hardware systems is needed.

What is TANGO good for?

TANGO provides uniform, distributed access and control of heterogeneous hardware devices. It is object-oriented in nature and usually one hardware device is represented by one (or more) software devices.

The device drivers can be written in either C++, Java or Python and client libraries exist for all of these languages. Using a middleware adapter like TangoGQL any language can access the devices.

All that makes TANGO useful for building SCADA systems ranging from a handful controlled devices to several hundreds you want to supervise and control.

Lesser know features of TANGO

All of the above is well known in the TANGO and SCADA community and quite straightforward. What some people may not know is that TANGO automatically provides an Admin-device for each TANGO server (an executable running one or more TANGO devices).

These admin devices have an address of the form dserver/<server_name>/<instance_name> and provide numerous commands for controlling and querying the TANGO device server instance:

You can for example introspect the device server to find available device classes, device instances and needed device properties (think of them as configuration settings).

In addition to introspection you can also control some aspects of the TANGO server like polling and logging. The Admin-device also allows restarting individual devices or even the whole server instance. This can be very useful to apply configuration changes remotely without shell access or something similar to the remote machine.

Wrapping it up

Admin-devices automatically exist and run for each TANGO device server. Using them allows clients to explore what devices are available, what they offer and how they can be configured. They also allow some aspects to be changed remotely at runtime.

We use these features to provide a rich web-base UI for managing the control system in a convenient way instead of relying on the basic tools (like Jive and Astor) that TANGO offers out-of-the-box.

Avoid special values of the result type for error indication

As many of you may know we work with a variety of programming languages and ecosystems with very different code bases. Sometimes it may be a modern green field project using state of the art frameworks. At other times it may be a dreaded legacy project initially written many years ago (either by us or someone we do not even know) using ancient languages and frameworks like really old java stuff (pre jdk 7) or C++ (pre C++11), for example.

These old projects could not use features of modern incarnations of these languages/compilers/environments – and that is fine with me. We usually gradually modernize such systems and try to update the places where we come along to fix some issues or implement new features.

Over the years I have come across a pattern that I think is dangerous and easily leads to bugs and harder to maintain code:

Special values of the resulting type of a function to indicate errors

The examples are so numerous and not confined to a certain programming environment that they urged me to write this article. Maybe some developers using this practice will change their mind and add a few tools to their box to write safer and more expressive code.

A simple example

Let us image a function that returns a simple integer number like this:

/**
 * Here we talk to a hardware sensor. If everything works, we should
 * get a value between -50 °C and +50 °C.
 * If something goes wrong, we return -9999.
int readAmbientTemperature();

Given the documentation, clients can surely use this kind of function and if every use site interprets the result correctly, nothing will ever go wrong. The problem here is, that we need a lot of domain knowledge and that we have to check for the special value.

If we use this pattern for other values where the value range is not that clearly bounded we may either run into problems or invent other “impossible values” for each use case.

If we forget to check for the special value the users may see it an be confused or even worse it could be used in calculations.

The problem even gets worse with more flexible types like floating point numbers or strings where it is harder to compare and divide valid results from failure indicators.

Classic error message that mixes technical code and error message in a confusing, albeit funny sentence (Source: Interface Hall Of Shame)

Of course, there are slightly better alternatives like negative numbers in a positive-only domain function or MAX_INT, NaN or the like provided by most languages.

I do not find any of the above satisfying and good enough for production use.

Better alternatives

Many may argue, that their environment lacks features to implement distinct error indicators and values but I tend to disagree and would like to name a few widely used alternatives for very different languages and environments:

  • Return codes and out-parameters for C-like languages like in the unix and win32 APIs (despite all their other flaws… 😀 )
  • Exceptions for Java, Python, .NET and maybe in some cases even C++ with sufficiently specific type and details to differentiate different failures
  • Optional return types when the failures do not need special handling and absence of a value is enough
  • HTTP status code (e.g. 400 or 404) and a JSON object containing reason and details instead of a 2xx status with the value
  • A result struct or object containing execution status and either a value on success or error details on failure

Conclusion

I am aware that I probably spent way too much words on such a basic topic but I think the number of times I have encountered such a style – especially in code of autodidacts, but also professionals – justifies such an article in my opinion. I hope I provided some inspiration for those who do not know better or those who want to help others improve.

Have we made things too easy?

One of the old mantras for API design is “Make doing the right thing easy and the wrong thing hard”. This, of course, applies to much broader topics as well, such as software development or UX.

For software development specifically, are we maybe making “doing the wrong thing” too easy as well? Here are a two examples:

Web Requests

In the old times, requesting data from a web server required first setting up the request, sending it, and then getting the result back to your application either via polling or callbacks. Dave Mark once adequately called this solving the “waiting problem”. It was cumbersome, to say the least. It was clear that making such a request was something to be avoided. You did it when you had to, but you avoided setting up too many different kinds of requests implictly.

Nowadays, with the advent anonymous functions/lambdas in most mainstream programming languages, continuations became the new way handle these things: do_request(...).then(result -> ...) This already made this a lot easier. And even better, now we have some form of coroutines in many languages were you can just do result = await do_request(...). It even looks almost like a normal function call.

With this, programmers can just do requests one after the other. Need one thing from a server? Do one request. Need ten things from a server? Do ten requests. Of course, this is horribly wasteful: each request will incur the full overhead of http/https and a server roundtrip. In the old times, doing the request was painful, so you automatically looked for ways to avoid doing more, and bundle your asks into one request, argueable leading to a better program.

Dependencies

Before nice package-managers where a thing, handling dependencies was a huge pain. You would have to manually get, unpack, configure and install the dependency for each developer and/or consumer system. As a consequence, libraries were big and often duplicated foundational things. But it also caused developers carefully grooming their library selections.

Now with package managers, libraries have started to become small. Duplication within libraries certainly seems to have decreased, and the average library size has decreased. But this also caused developers to be much less cautious when adopting a dependency, with package managers handling thousands of dependencies that no one developer can possibly have a full understanding of. And this then leads to things like the leftpad disaster.

Better or worse?

I am pretty sure that both having nice abstractions to deal with asynchronicity and package managers are good things. But if they make certain things too easy, how can we deal with that? The only thing I can currently think of is figuratively sticking warning-labels on these things during review time, but because those things are now so easy and subtle, it is also easy to miss them.

Are there other examples were we maybe made the wrong thing too easy? Do you have any ideas how to deal with this problem?

Improving my C++ time queue

Another code snippet that can be found in a few of my projects is the “time queue”, which is a simple ‘priority queue’ style data structure that I use to defer actions to a later time.

With this specific data structure, I have multiple implementations that clearly came from the same source. One indicator for that is a snarky comment in both about how std::list is clearly not the best choice for the underlying data structure. They have diverged a bit since then though.

Requirements

In my use case not use time points, but only durations in standard-library nomenclature. This is a pretty restrictive requirement, because otherwise any priority queue (e.g. from boost or even from the standard library) can be used quite well. On the other hand, it allows me to use floating-point durations with predictable accuracy. The queue has two important functions:

  1. insert to insert a timeout duration and a payload.
  2. tick is called with a specific duration and then reports the payloads that have timed out since their insertions.

Typically tick is called a lot more frequently than insert, and it should be fast. The payload is typically something like a std::function or an id for a state-machine that needs to be pulsed.

The basic idea is to only keep the duration difference to the previous item in the list. Only the first item keeps its total timeout. This way, when tick is called, usually only the first item needs to be updated. tick only has to touch more items when they time out.

Simple Implementation

One of the implementations for void insert(TimeType timeout, PayloadType payload) looks like this:

if (tick_active_)
{
  deferred_.push_back({ .remaining = after, .payload = std::move(payload) });
  return;
}

auto i = queue_.begin();
for (; i != queue_.end() && timeout > i->remaining; ++i)
  timeout -= i->remaining;

if (i != queue_.end())
  i->remaining -= timeout;

queue_.insert(i, { .remaining = after, .payload = std::move(payload) });

There is a special case there that guards against inserting into queue_ (which is still a very bad std::list) by instead inserting into deferred_ (which is a std::vector, phew). We will see why this is useful in the implementation for template void tick(TimeType delta, Executor execute):

tick_active_ = true;
auto i = queue_.begin();
for (; i != queue_.end() && delta >= i->remaining; ++i)
{
  delta -= i->remaining;
  execute(i->payload);
}

if (i != queue_.end())
  i->remaining -= delta;

queue_.erase(queue_.begin(), i);
tick_active_ = false;

while (!deferred_.empty())
{
  auto& entry = deferred_.back();
  insert(entry.remaining, std::move(entry.payload));
  deferred_.pop_back();
}

The timed out items are reported via a callback that is supplied as Executor execute. Of course, these can do anything, including inserting new items, which can invalidate the iterator. This is a common use case, in fact, as many deferred actions will naturally want follow ups (let’s ignore for the moment that the implementation is nowhere near exception safe…). The items that were deferred to deferred_ in insert get added to queue_ after the iteration is complete.

This worked well enough to ship, but the other implementation had another good idea. Instead of reporting the timed-out items to a callback, it just returned them in a vector. The whole tick_active_ guard becomes unnecessary, as any processing on the returned items is naturally deferred until after the iteration:

std::vector<PayloadType> tick(TimeType delta)
{
  std::vector<PayloadType> result;
  auto i = queue_.begin();
  for (; i != queue_.end() && delta >= i->remaining; ++i)
  {
    delta -= i->remaining;
    result.push_back(i->payload);
  }

  if (i != queue_.end())
    i->remaining -= delta;

  queue_.erase(queue_.begin(), i);
  return result;
}

This solves the insert-while-tick problem, and lets us use the result neatly in a range-based for-loop like this: for (auto const& payload : queue.tick(delta)) {}. Which I personally always find a little bit nicer than inversion-of-control. However, the cost is at least one extra allocation for timed-out items. This might be acceptable, but maybe we can do better for very little extra complexity.

Return of the second list

Edit: The previous version of this article tried to keep the timed-out items at the beginning of the vector before returning them as a std::span. As commenter Steffen pointed out, this again prevents us from inserting while iterating on the result, as any insert might invalidate the backing-vector.

We can get rid of the allocation for most of the tick calls, even if they return a non-empty list. Remember that a std::vector does not deallocate its capacity even when it’s cleared unless that is explicitly requested, e.g. via shrink_to_fit. So instead of returning a new vector each time, we’re keeping one around for the timed out items and return a const-ref to it from tick:

std::vector<PayloadType> const& tick(TimeType delta)
{
  timed_out_.clear();
  auto i = queue_.begin();
  for (; i != queue_.end() && delta >= i->remaining; ++i)
  {
    delta -= i->remaining;
    timed_out_.push_back(std::move(i->payload));
  }

  if (i != queue_.end())
    i->remaining -= delta;

  queue_.erase(queue_.begin(), i);
  return timed_out_;
}

This solution is pretty similar to the deferred list from the first version, but instead of ‘locking’ the main list while iterating, we’re now separating the items we’re iterating on.