If you have been programming for a while, you will probably, and hopefully, agree that it is preferable to have a sequence of functions as opposed to the same number of functions nested. In other words, call-graph breadth is better than depth. Among other reasons, a “linear” set of instructions is often easier to follow, which is better for humans, and also tends to not go haywire with what memory it touches, which is better for computers.
However, deep call hierarchies occur much more than I would like. I have seen call stacks well beyond 200 functions deep. But this need not be – one can often be turned into the other by transposition. Transposition derives from the latin transponere, which roughly means “to put across”. With matrices, it means swapping rows and columns. Similarly, we can swap call-hierarchy depth for breath.
A couple of months ago, I was tasked with programming a standing-wave display of power-line voltage curves. As you might know, the signal is roughly sine-wave shaped at about 50Hz. The signal is captured in time windows of 200ms, i.e. there’s a new packet of data 5 times a second with 10 sine cycles in it. However, the frequency value jitters just enough to make the signal drift a bit in the 200ms window, i.e. the wave moves forwards and backwards a little bit. The standing-wave feature tries to remove that drift and make it seemingly stationary in our fixed time window, so changes in amplitude become more visible.
The idea seems simple enough for just one signal:
- In the previous wave, search backwards to find the spot where the wave crosses from positive to negative.
- Take the previous wave from that point on and stitch it together with the current one, and cut that off at 200ms of data.
But there is not just one signal, there can be hundreds. And they should all be aligned to one designated “master” signal. So now we add extra steps:
- For all other signals, find the wave packets overlapping (in time) with our new stiched wave packet.
- Order them, and stitch them to a new wave packet covering exactly the same time window.
Now even in this version, finding the right packets for a time interval can be more tricky than it seems, because the values for the signals come in irregularly and can be shifted significantly. So you can just buffer of the last N (5?) packets for each signal and search in there. Still, one more requirement remains. For the display of archived data, the algorithm should work on batches of waves, i.e. many seconds worth, which made step 3 harder by extending the search space. So add:
- For each previous and current pair in a given time-interval:
Now the whole thing was pretty much implemented with steps 0 to 4 being functions calling into the next step, with major loops on the 0th and 3rd step. The wave data flows through these implementation layers vertically, i.e. from step 0 to step 4 and back, but the control flow of the program does not. It flows perpendicular to it, horizontally, solely controlled by the outer-most loop. It is intuitive to write it this way – after all, the control flow follows the flow of time in the data we are processing, but the code was not particularly easy, especially with the search in step 3 becoming unnecessarily complex.
Now let us try transposing this, and match the flow of data with our control flow:
- Gather all relevant signals for the time interval and sort their packets.
- Extract all the “stitching” time codes from the master signal.
- For all signals, traverse pairs together with the time codes and stitch accordingly.
The whole process becomes more digestible, and processing the data in stages made it obvious that sorted data makes using a “merge” type algorithm very easy.
Both algorithms use the same data, but the second makes it explicit, while the first just passes it through the call-stack in chunks.
I have since used this idea of “transposition” a few times to clean up and simplify my designs. It seems especially helpful when trying to decouple messaging from bulk processing.
The idea of looking at the data flow and adapting the control flow to match it, is central to data-oriented design. I argue that while this can be used to optimized programs, transposition is mainly a tool to make programs simpler, which can then lead to optimization. Separating processing into stages is also very similar to loop-fission.
Have you used a technique like this before? Do you, perhaps, know it by another name? Let me know!