Make your users happy by not caring about how they would enter data

As many of our customers operate somewhere in the technical or scientific field, we happen to get requests that go somewhat like this:

Well, so we got that spreadsheet managing our most important data and well…
… it’s growing over our heads with 1000+ entries…
… here are, sadly, multiple conflicting versions of it …
… no one understands these formulae that Kevin designed …
… we just want ONE super-concise number / graph from it …
… actually, all of the above.

And while that scenario might be exaggerated for the sole reason of entertainment, at its core there is something quite universal. It is one thing to feel like the only adult in the room, throw one’s hands up into the air, and then calculate the minimum expenses for a self-contained solution. But while doing so, we still strive for a simple proposal, and that is always quite a tightrope walk between “this really does not need to be fancy” and “but we cannot violate certain standards either”.

And in such a case, your customer might happily agree with you – they do not want to rely on their spreadsheets; it just was too easy to start with them. And now they’re addicted.

The point of this post is this: if your customer has tabular data, there is almost no “simple” data input interface you can imagine that is not absolutely outgunned by spreadsheets, or even text files. Given the sheer amount of conventions that nearly everyone already knows, don’t try to compete with a few input fields, “Add entity” buttons and HTML <form>s.

This is no praise of spreadsheet software at all. But consider the hidden costs that can lurk behind a well-intended “simple data input form”.

For the very first draft of your software

  • Do not build any form of data input interface unless you know exactly that it will quickly empower the user to do something they cannot do already.
  • Do very boldly state: Keep your spreadsheets, CSV, text files. I will work with them, not against them.
  • E.g. offer something as “uncool” as an “upload file” button, or a field to paste generic text (a small sketch of what handling such pasted text could look like follows after this list). You do not have to parse .xls, .xlsx or .ods files either. The message should be: use whatever software you want to use, up to the point where it stops being your friend and starts being a liability.
  • Do not be scared of “but what if they have wrong formatting”. Find the middle ground – what kind of data do you really need to sanitize, and what can you outright reject?
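As a minimal, hypothetical sketch of the “paste generic text” idea: text copied out of a spreadsheet usually arrives as tab-separated lines, so a handful of Python lines can already turn it into records. The function and its behaviour here are made up for illustration, not a prescription for your stack:

def parse_pasted_table(text: str) -> list[dict]:
    """Parse text pasted from a spreadsheet: tab-separated, first row is the header."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = [cell.strip() for cell in lines[0].split("\t")]
    rows = []
    for line in lines[1:]:
        cells = [cell.strip() for cell in line.split("\t")]
        # be forgiving here; decide separately what you sanitize and what you reject
        rows.append(dict(zip(header, cells)))
    return rows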

Make it clear that you prioritize the actual problem of your customer. Examples:

  1. if that problem is “too many sources-of-not-really-truth-anymore”, then your actual solution should focus on merging such data sources. And if that really is your customer’s main problem, expect them to keep turning up with some new long-lost data source on a floppy disk, stashed in their comic book collection, or something.
  2. if that problem is “we cannot see through that mess”, then your actual solution should focus on analyzing, on visualizing the important quantities.
  3. if the problem is that Kevin starts writing new crazy formulae every day and your customer is likely to go out of business whenever he chooses to quit, then yes, maybe your actual solution really is constructing a road block – but rather via a long discussion about the actual use cases, not by just translating the crazy logic into Rust now, because “that’s safer”.

It is my observation that a technically accustomed customer is usually quite flexible in their ways. In some cases it really is your job to impose a static structure, and these are the cases in which they will probably approve of having that done to them. In other cases – e.g. when their problem is a shape-shifting menace that they cannot even grasp yet – do not start with arbitrary abstractions, templates and schemas that will just slow them down.
I mean: either you spend your days migrating their data all the time and putting new input elements on their screen, or you spend them solving their actual troubles.

Getting rid of all the files in ASP.NET “Single File” builds

I’ve come to find it somewhat amusing that if you publish a fresh .NET project, like

dotnet publish ./Blah.csproj ... -c Release -p:PublishSingleFile=true

that the published “Single File” consists of quite a handful of files. So do I, having studied physics and all that, have an outdated understanding of the concept of “single”? Or is there a very specific reason for each of the files that appear, and it just has to be that way?
I really needed to figure that out for a small web server application (so, ASP.NET), because we had promised our customer a small, simple application. All the extra not-actually-single files were distracting clutter at best, and at worst they would suggest a level of complexity that wasn’t helpful, or a level of we-don’t-actually-care, when someone opens that directory.

So what I ended up doing a lot was extending my .csproj file with the following entries, which I want to explain briefly:


  <PropertyGroup Condition="'$(Configuration)'=='Release'">
    <DebugSymbols>false</DebugSymbols>
    <DebugType>None</DebugType>
  </PropertyGroup>

  <PropertyGroup>
    <PublishIISAssets>false</PublishIISAssets>
    <AspNetCoreHostingModel>OutOfProcess</AspNetCoreHostingModel>
  </PropertyGroup>	

  <ItemGroup Condition="'$(Configuration)'=='Release'">
    <Content Remove="*.Development.json" />
    <Content Remove="wwwroot\**" />
  </ItemGroup>

  <ItemGroup>
    <EmbeddedResource Include="wwwroot\**" />
  </ItemGroup>
  • The first block gets rid of any *.pdb file, which stands for “Program Database” – these contain debugging symbols, line numbers, etc. that are helpful for diagnostics when the program crashes. Unless you have a very tech-savvy customer who wants to work hands-on when a rare crash occurs, end users do not need them.
  • The second block gets rid of the files aspnetcorev2_inprocess.dll and web.config. These are the Windows “Internet Information Services” module and its configuration XML, which can be useful when an application needs tight integration with OS features (authentication and the like), but not for a standalone web application that just does one simple thing.
  • And then the last two blocks can be understood nearly as simply as they are written – while a customer might find an appsettings.json useful, any appsettings.Development.json is not relevant in their production release,
  • and for an ASP.NET application, any web UI content conventionally lives in a wwwroot/ folder. While the app needs those files, they might not need to be visible to the customer – so the combination of <Content Remove…/> and <EmbeddedResource Include…/> does exactly that.

Maybe this can help you, too, in publishing cleaner packages to your customers, especially when “I just want a simple tool” should mean exactly that.


Think about the Where of your Comments

Comments are a bit tricky to argue about because of so many pre-existing conceptions – the spectrum ranges from a) insecure customers who would prefer that every line of code has its own comment to z) overconfident programmers who believe that every comment is outdated the second it is written. The major problems with these approaches are:

a) leads to a lot of time and keystrokes wasted for stuff that is self-explanatory at best and, indeed, outdated in the near future;

z) some implementations really can only be understood with further knowledge, and this information really has to be attached right to the place where it is used (“co-located”).

So naturally, one should not dismiss comments as a whole, but one should really care about whether they actually carry any useful information. That is, useful for any potential co-developer (which is most likely yourself in a few weeks).

The question of “whether” might remain the most important one – most comments can really be thrown out after giving your variables and functions descriptive names – but lately I found that there can be quite a difference in readability between the various ways of placing a comment.
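As a quick, made-up illustration of that “thrown out after naming” case (the names here are invented for this example):

# before: the comment carries the information
t = 60 * 60 * 24 * 7  # seconds in a week

# after: the name carries it, and the comment can go
seconds_per_week = 60 * 60 * 24 * 7

Now, back to the question of where a comment should go.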

Compare the reading flow of the following:

class EntityManager:
  entities: List[Entity]
  session: Session
  other_properties: OtherProperties

  # we cache the current entity because its lookup can take time
  _current_entity: Optional[EntityInstance] = None

  _more_properties: MoreProperties

  ...

vs.

class EntityManager:
  entities: List[Entity]
  session: Session
  other_properties: OtherProperties

  _current_entity: Optional[EntityInstance] = None
  # we cache the current entity because its lookup can take time

  _more_properties: MoreProperties

  ...

Now I suppose that if one is not accustomed to it, the second one might just look wrong, because comments are usually placed either before the line or to the right of it. The latter option is left out here because I figure it should only be used when commenting on the value of this declaration, not the general idea behind it (and making me scroll horizontally is a mischievous assault on my overall workflow, that is: on my motivation to work with you again).

But after trying things out a bit, my stance is that there are at least two distinct categories of comments:

  1. one that is more of a “caption” – the one where I as a developer have to interrupt your reading flow in order to prevent you from making mistakes or to explain a peculiar concept. These ones I would place above the line.
  2. one that is more of an “annotation”, which should come naturally after the reader has already read the line it is referring to. Think of it like a reply in some forum thread.

For some reason the second type does not seem to be used much. But I see its value, and having such comments might be preferable to trying to squeeze them above the line (where they do damage) or removing them (where they obviously cannot facilitate understanding).

One of my latest attempts is writing them with an eye-catching prefix like

index = max_count - 1
# <-- looping in reverse because of http://link.to.our.issue.tracker/

because it emphasizes the “attachment” nature of this type of comment, but I’m still evaluating the aesthetics of that.

Remember, readability can be stated as a goal in itself because it is usually tightly linked to maintainability, thus reducing future complications. Some comments are just visual noise, but others deliver useful context. Still, throwing that “useful context” in the face of your reader before they are even half awake will not make them happy either – so maybe think about the placement of your comments in order to keep your code reading like a story.

Generalizations

It might well be that you dislike my quick examples above, which were not real-world examples. But the “think about the where” also applies to other structures, like

if entity_id is None:
    # this can happen due to ...
    return None

vs.

# TODO: remove None as a possible value in upcoming rework
if entity_id is None:
    return None

vs.

if entity_id is None:
    return None
    # <-- quit silently rather than raising an exception because ...

or similar with top-level comments vs. placing them inside class definitions.

You see, there are valid use cases for all varieties, but they each serve different intentions.

Save yourself from releasing garbage with Git Hooks

There are times when one writes code that should please, please not land in the production release, for example:

  • Overwriting a certain URL with a local one
  • Having a dialog always-open in order to efficiently style it
  • or generally, mocking out something that is not-important-right-now™

And we all know that such shortcuts tend to stay in the code longer than intended. One might tell oneself:

Oh, I will mark this as // TODO: Remove ASAP

This is the feature branch, so either me or the Code Review will catch it when merging on main.

And this is better than nothing – some Git clients might even disallow committing code that contains any //TODO, but I feel that this is a direct violation of the idea of “Commit Early, Commit Often”. As in: now you are bound to finish your feature before you can commit. This is the opposite of helping. So by now you figure:

Thanks for nothing, I will just rename this as // TO_REMOVE

The rest of the logic above applies, untampered with.

Such keywords-in-comments are sometimes called Code Tags (e.g. here), and we have just worked around the fact that //TODO is more commonly used than other tags – but of course, these are still comments with no specific meaning.

You might now be able to push again, but still – say the code reviewer makes the heinous mistake of trusting you too much – this might lead to code in the official release that behaves so silly that it makes every customer question your sanity, or worse.

Git Hooks can help you appear more sane than you are.

These are shell scripts that live in a specific repository instance (i.e. your local clone, or the server copy, individually) and run at specific points in the Git workflow.

For our particular use case, two hooks are of interest:

  • A pre-receive hook runs on the server-side repository instance and can prevent the main branch from receiving your dumb development code. This sounds more rigid, but it requires access to the server hosting the (bare) repository.
  • A pre-push hook runs on your local repository instance and can prevent you from pushing your toxic waste to the branch in question. Keep in mind that if you do your merging onto main via GitLab Merge Requests etc., this hook will not run for that merge – but implementing it is easier because you already have all the access you need.

The local pre-push hook is as simple as adding a file named pre-push in the .git/hooks subfolder. It needs to be executable (which, under Windows, you can use e.g. Git Bash for):

cd $repositoryPath/.git/hooks
touch pre-push
chmod +x pre-push

And it contains:

#!/bin/bash
 
while read localRef localHash remoteRef remoteHash; do
    if [[ "$remoteRef" == "refs/heads/main" ]]; then
        for commit in $(git rev-list $remoteHash..$localHash); do
            if git grep -n "// REMOVE_ME" $commit; then
                echo "REJECTED: Commit contains REMOVE_ME tag!"
                exit 1
            fi
        done
    fi
done

This will then already cause git push to fail with output like

2f5da72ae9fd85bb5d64c03171c9a8f248b4865f:src/DevelopmentStuff.js:65:        // REMOVE_ME: temporary override for database URL
REJECTED: Commit contains REMOVE_ME tag!
failed to push some refs to '...'

and if that REMOVE_ME is removed, the push goes through.

Some comments:

  • You can easily extend this to multiple branches with the regex condition:
    if [[ "$remoteRef" =~ /(main|release|whatever)$ ]];
  • The git grep -n flag is there for printing out the offending line number.
  • You can make this more forgiving with git grep -En "//\s*REMOVE_ME", i.e. allowing an arbitrary amount of whitespace between the // and the tag.
  • The surrounding loop structure:
    while read localRef localHash remoteRef remoteHash;
    matches exactly how git feeds the pre-push hook on standard input: one line per ref being pushed, with these four fields. Other hook types receive their information differently (some as lines on standard input, some as command-line arguments), so check which fields apply to the hook you are writing.

Hope this helps you take some extra care!

However, remember that every developer has to set up these local hooks for themselves; they are not pushed along with the repository – but implementing a pre-receive hook on the server is the topic of a future blog post.

Customizing Vite plugins is as quick as Vite itself

Once in a blue moon or so, it might be worthwhile to look at the less frequently used features of the Vite bundler, just to know how your life could be made easier when writing web applications.

And there are real use cases for thinking about custom Vite plugins, e.g.

  1. wanting to treat SVGs as <svg> code or a React component, not just as an <img> resource – so maybe your customer can easily swap them out as separate files. That is a solved problem with existing plugins like vite-plugin-svgr or vite-svg-loader, but… once in an ultra-violet moon or so… even the existing plugins might not suffice.
  2. For teaching a few lectures about WebGL / GLSL shader programming, I wanted to readily show the results of changes in a fragment shader on the fly. That is the case in point:

I figured I could just use Vite’s Hot Module Replacement to reload the UI after changing the shader. There could have been alternatives like

  • Copying pieces of GLSL into my JS code as strings
    – which is cumbersome,
  • Using the async import() syntax of JS
    – which is then async, obviously,
  • Employing a working editor component on the UI like Ace / React-Ace
    – which is nice when it works, but is so far off my actual quest, and I guess commonplace IDEs are still more convenient for editing GLSL

I wanted the changes to be quick (like, pair-programming-quick), and Vite’s name means exactly that, and its HMR lives up to it. It also gives you the option of raw string assets (import Stuff from "./stuff.txt?raw";), which is OK, but I wanted a bit of prettification to happen automatically. I found vite-plugin-glsl, but I needed to customize it because I wanted to always collapse multiple blank lines into a single one – and this is how easy it was:

  • ./plugin/glslImport.js
    Note: this is executed by the vite dev server, not our JS app itself.
import glsl from "vite-plugin-glsl";

const originalPlugin = glsl({compress: false});

const glslImport = () => ({
    ...originalPlugin,
    enforce: "pre",
    name: "vite-plugin-glsl-custom",
    transform: async (src, id) => {
        const original = await originalPlugin.transform(src, id);
        if (!original) { // not a shader source
            return;
        }
        // custom transformation as described above:
        const code = original.code
            .replace(/\\r\\n/g, "\\n")
            .replace(/\\n(\\n|\\r|\s)*?(?=\s*\w)/g, "\\n$1");
        return {...original, code};
    },
});

export default glslImport;
  • and then the vite.config.js is simply
import { defineConfig } from 'vite';
import glslImport from './plugin/glslImport.js';

export default defineConfig({
    plugins: [
        glslImport()
    ],
    ...
})

I liked that. It kept me from having to transform either the raw import or the output of the original plugin in each of the 20 different files, and I could easily fiddle around in my file system while my students only ever saw the somewhat-cleaned-up shader code.

So if you ever have some non-typical-JS files that need some transformation but are e.g. too numerous or too volatile to be curated by hand in their final form, this is a nice tool to know – and it is as easily pluggable into other projects as a plugin should be.

Forking an Open Source Repository in Good Faith

One might love Open Source for different reasons: Maybe as a philosophical concept of transcendental sharing and human progress, maybe for reasons of transparency and security, maybe for the sole reason of getting stuff for free…

But, as a developer, Open Source is additionally appealing for the sake of actively participating, learning and sharing on a directly personal level.

Now I would guess that most repository forks are done for rather practical reasons (“I wanna have that!”): the fork gets some minor patches one happens to need right now, or serves some super-specific use case, and then hangs around for some time until that use case vanishes or the changes are so vast that there will never be a merge (a situation commonly known as “der Zug ist abgefahren” – that train has left the station). But one might sometimes try to contribute one’s work for the good of more than oneself. That is what I hereby declare a “Fork in Good Faith”.

A fork can happen in good faith if some conditions are true, like:

  • I am sure that someone else can benefit from my work
  • My technical skills match the technical level of the repository in question
  • Said upstream repository is even open for contributions (i.e. not understaffed)
  • My broader vision does not diverge from the original maintainers’ vision

Maybe there are more of these, but the most essential point is a mindset:

  • I declare to myself that I want to stay compatible with the upstream as long as is possible from both sides.

Forking in Good Faith is, then, a great idea because it helps to advance many more causes at once than just the stuff for free, i.e. on a developmental level:

  1. You learn from the existing code, i.e. the language, coding style, design patterns, specific solutions, algorithms, hidden gems, …
  2. You learn from the existing repository, i.e. how commits and branches are organized, how commit messages are used productively, how to manage patches or changes in general, …
  3. In reverse, the original maintainers might learn from you, or at least future contributors might
  4. You might get more people to actually see / try / use your awesome feature, thus getting more feedback or bug reports than brewing your own soup
  5. You might consider it as a workout of professional confidence, to advocate your use cases or implementation decisions against other developers, training to focus on rational principles and unlearning the reflexes of your ego.
  6. This can also serve as a workout in mental fluidity, by switching between different coding styles or conventions – if you are e.g. used to your super-perfect-one-and-only way of doing things, it might just positively blow your mind to see that other conventions can work too, if done properly.
  7. Having someone to actually review your changes in a public pull request (merge request) gives you feedback also on an organisational level, as in “was all of this part actually important for your feature?”, “can you put that into a future pull request?” or “why did you rewrite all comments for some Paleo-Siberian language??”

Not to forget, you might grow your personal or professional network to some degree, or at least get the occasional thank you from anyone (well…).

But the basic point of this post is this:

Maintaining a Fork in Good Faith is active, continuous work.

And there is no shame in abandoning that claim, but once you do, there might be no easy way back.

Just think about the pure sadness of features that are replicated over and over again, or get lost over time;

And just think about how confusing or annoying that already could have been for yourself, e.g. with some multiply-forked npm package or maybe full-fledged end-user projects (… how many forks of e.g. WLED do even exist?).

This is just some reflection on how carefully such a decision should be made. Of course, I am writing this because I recently became aware of that point of bifurcation – not the point where a repository is forked, but the one where all of the advantages mentioned above are weighed against real downsides.

And these might be legitimate, and numerous, too. Just to name a few,

  1. Maybe the existing conventions are just not “done properly”, and following them for the sake of uniformity makes you unproductive over time?
  2. Maybe the original maintainers are just understaffed, non-responsive or do not adhere to a style of communication that works with you?
  3. Maybe most discussions are really just debates of varying opinion (publicly, over the internet – that usually works!) and not vehicles of transcending the personal boundaries of human knowledge after all?
  4. Maybe you are stuck with sub-par legacy code, unable to boy-scout away some technical debt because “that is not the point right now”, or maybe every other day some upstream commit flushes in more freshly baked legacy code?
  5. Maybe no one understands your use case, and – contrary to the idea mentioned above – in order to get appropriate feedback about your feature, and to prove its worth, you need to distribute it independently?
  6. Maybe at one point the maintainers of an upstream repository change, and from now on you have to name your variables in some Paleo-Siberian language?

I guess you get the point by now. There is much energy to be saved by never considering upstream compatibility in the first place, but there is also much potential to be wasted. I have no clear answer – yet – on how to draw the line, but maybe you have some insight on that topic, too.

Are there any examples of forks that live on their own, still with the occasional cherry-pick, rebase, merge? Not one comes to my mind.

Useful browser features for the common Web Dev

Every once in a while, I stumble upon some minor thing ingrained in modern browsers that makes some specific problem so much easier. And while the usual Developer Tools offer tons of capabilities, these are so widely spread and grouped that you easily get used to just ignoring most of them.

So here is a bunch of things that come to my mind and do not seem to be super-common knowledge. Maybe you can benefit from some of them.

Disclaimer: the browsers I know of have their Dev Tools available by pressing F12, with a somewhat similar layout, even though the particular wording will differ. My naming here relates to current Chrome, so your experience might differ.

Disabling the Browser cache

Your browser caches many resources because the average user prefers speed, is used to the browser endlessly hogging memory anyway, and most resources usually do not change that often anyway.

During development, this might lead to confusion because of course, you do change your sources often (in fact, that is your job), and the browser needs to know that fact.

For that, let it be known:

The "Network Tab" has a Checkbox "Disable Cache".
And the Dev Tools have to be open for it to work.

This is usually so much the default setting on my working machines that I have to remind myself of it when I troubleshoot something on someone else’s machine. Spread the word.

Also, browsers have a hard-reload feature, like in Chrome, to clear the cache before the reload without having to do so for the whole browser:

Hard-Reload: Ctrl + Shift + R

I’ve read that this is also Chrome’s behaviour when pressing F5 while the Dev Tools are open, but anyway. Take extra care when your customer does not know of that feature, because sometimes they might be frustrated about a seemingly failed update of your app that is actually just a cached version.

Inspect Network Requests

Depending on the complexity of your web app, or what external dependencies you are importing, it might be that the particular order and content of network requests is not trivial anymore.

Now the “Network” tab (you know it already) is interesting on its own, but remember that it only displays the requests since opening the Dev Tools, i.e. some Page Reload (hard or soft) might be required.

And – this appears to change somewhat between browser updates, don’t ask me why that is necessary – this is good to know:

  • The first column of that list shows only the last part of each request URL, but if you click on it, a very helpful window appears with details of the Request Headers, Payload and Response
  • make sure that in the filters above, the “Fetch/XHR” is active
  • And then some day I found that
Right-clicking a request gives you options to Copy, e.g. Copy as fetch, so you can repeat the exact request from JavaScript code for debugging etc.

Inspecting volatile elements

The “Elements” tab (called “Inspector” in other browsers) is quite straightforward for finding mistakes in styling or in the rendered HTML itself – you see the whole current HTML structure, click your element in there, and to the right, from bottom to top, you see the CSS rules that apply.

But sometimes, it can be hard to inspect elements that change on mouse interaction, and it is good to know that (again, this holds for Chrome), first,

There is a shortcut like Ctrl + Shift + C to get into the Inspector without extra mouse movements

Now think of a popup that vanishes when you hover away. You might manage by using Ctrl + Shift + C to open the Inspector and then navigating within it by keyboard (especially with the Tab and cursor keys), but here’s a small trick I thought of that helped me a lot:

Add (temporarily) an on-click-handler to your popup that calls window.alert(...);

With that, you can open your popup, press Ctrl + Shift + C, then click the popup, and the alert() will now block any events and you can use your mouse to inspect all you want.

In easier cases you could just disable the code that makes the popup go away, but in the case I had, this wasn’t an option either.

Now that I think of it, I could also have used debugger; instead of the alert(), but the point is that you have to block JavaScript execution only after interacting with the popup element.

The performance API

I have no idea why I discovered this only recently, but if precision in timing is important – e.g. for measuring the impact of a possible change on performance – one does not have to resort to Date.now() with its millisecond resolution:

there is performance.now() to give you microsecond precision for time measurements.

It can afford that by not using the epoch “zero” of Jan 1st 1970 as its reference, but instead the moment your page was started.

The Performance API has a lot more stuff – which I didn’t need yet.

An FPS monitor, and a whole world of hidden features

If you do graphics programming or live visualization of some measurement data, it might be of interest to see your current frame rate. There’s a whole hidden menu, at least in Chrome, and you can access it by focusing the Dev Tools, then

  • press Ctrl + Shift + P
  • Enter “FPS” and select
  • Now you have a nice overlay over your page.

Even if you are not in the target group of that specific feature, it might be interesting to know that there is a whole menu of many particular (sometimes experimental) features built into all these browsers. They differ from browser to browser, and version to version, and of course, plugins can do a lot more,

but it might just be worth considering that maybe your workflow can benefit from some of that stuff.

How React components can know their actual dimensions

Every once in a while, styling a web application can be oh so frustrating, err, quite interesting, because stuff that appears easy does not actually comply with any of your suggestions. There are some fields that ambush me more often than I’d like to admit, and with each application there appears some unique quirk that makes a universal solution hard.

Right now, I’m thinking about a CSS-only nested layout of several areas on your available screen that need to make good use of the available space, but still be somewhat dynamic in order to be maintainable.

Web apps are especially delicate in layout matters because, if you ask most customers, a fully responsive layout is never the goal (as in: way too expensive for their use case) – but no matter how often you make them assure you that there is only a small set of target resolutions, there will come a day when something changes and well-yeah-these-ones-too-of-course.

It is also common to hear that “looking good” is “not that important”, but as the project progresses, everyone still knows that that was a pure lie.

Of course, this is a manifestation of Feature Creep, but one that is hard to argue about. And we do not want to argue with customers anyway, we want to solve their problems with as little friction as possible.

So by now, one would have thought that CSS had evolved quite enough to at least place dynamic content somewhat predictably. There are flexbox and grid displays, and these are useful as hell, but still.

And while, for some reason or another, the width of dynamic nested content can usually be accounted for by some pure-CSS solution that one can find in under a day’s work, getting the height quite right is a problem that is officially harder than all multi-order corrections I ever encountered in my studies of quantum field theory. It is only solvable in some oversimplified use cases.

The limits of “height: 100%;” are reached in cases where an element’s size is dominated by its content instead of its container, as with nested <svg> elements that love to disagree about the meaning of “100%”. Dynamic SVG content is especially cumbersome because you want neither distorted nor cut-off content; you can try to get along with viewBox and preserveAspectRatio, but even then.

Maybe it won’t budge, and maybe that’s the point where I find it acceptable to read the actual DOM elements even from within a React component, an approach that is usually as dangerous as it is intrusive,

but is it a code smell if it is rather concise and reliably does the job?

import { useEffect, useRef, useState } from "react";

const useHeightAwareRef = () => {
    const [height, setHeight] = useState({
        initialized: false,
        value: null,
    });
    const ref = useRef(null);

    useEffect(() => {
        if (!ref.current) {
            return;
        }

        const adjustHeight = () => {
            const rect = ref.current?.getBoundingClientRect();
            setHeight({
                value: rect?.height ?? null,
                initialized: true
            });
        };

        adjustHeight();
        // keep listening, so the measurement follows window resizes
        window.addEventListener("resize", adjustHeight);
        return () => {
            window.removeEventListener("resize", adjustHeight);
        };
    }, []);

    return {
        height: height.initialized ? height.value : null,
        ref
    };
};

// then use this like:

const SomeNestedContent = () => {
    const {height, ref} = useHeightAwareRef();

    return (
        <div ref={ref}>
        {
            height &&
            <svg height={height} width={"100%"}>
                { /* ... Dragons be here ... */ }
            </svg>
        }
        </div>
    );
};

I find this worthwhile to have in your toolbox. If you manage your super-dynamic* content in some other super-responsive** fashion that is super-arguable*** to your customer, sure, go with it. But remember that, at some point, each of the following might possibly happen:

  • (*) your customer might have data outside the mutually agreed use cases,
  • (**) your customer might have screens outside the mutually agreed ones,
  • (***) your customer might have less patience / time than originally intended,

so maybe move the idea of “there must be one super-elegant pure-CSS solution in the year 2025” back into your dreams and shoehorn that <svg> & friends into where they belong :´)

Python desktop applications with asynchronous features (part 1)

The Python world has this peculiar characteristic: for nearly every idea out there in the current programming world, there is a ready-made solution that appears well-rounded at first. But when you then try to “just plug it together”, after a certain while you encounter some very specific cases that make all the “just” go up in smoke and leave you with some hours of research.

That is also why, for quite a few of these “solutions”, there seem to be various similar-but-distinct packages that you had better care to understand – which is, again, hard, because every Python developer likes to advertise their package with very easy-sounding words.

But that is the fun of Python. If I choose it as suitable for a project, this is the contract I sign 🙂

So I recently ran into a surprisingly non-straightforward case with no simple go-to solution. Maybe you know one – then I’ll be thankful to try it and talk it through; sometimes one is just blinded by all the options. (That, too, is what you sign up for when choosing Python. I cope.)

Now, I thought a lot about my use case and I will split these findings up into multiple blog posts – thinking asynchronously (“async”) is tricky in many languages, but the Python way, again, is to hide its intricacies in very carefully selected places.

A desktop app with TCP socket handling

As long as your program is a linear script, you do not need async features. But if there is something that runs for a longer time (or endlessly), you cannot afford to simply wait for it, or you could not intervene or do anything else in the meantime.

E.g. for our desktop application (employing Python’s very basic tkinter), we sooner or later run into tk.mainloop(), which, on the OS thread it was called from, is an endless loop that draws the interface, handles input events, updates the interface, repeat. This is blocking, i.e. that thread can now only do other stuff from within the event handlers acting on that interface.

You might know: any desktop UI framework really hates it if you try to update its interface from “outside the UI thread”. Just think of all sorts of unpredictable behaviour if you were to e.g. draw a button, then handle its click event, and while you’re at it, a background thread removes that button – etc.

The good thing is, you will quickly be told not to do such a thing; the bad thing is, you might end up with an unexpected or difficult-to-parse error and some question marks about what to do instead.

The UI-thread problem is a specific case of “doing stuff in parallel requires you to really manage who can actually access a given resource”. Just google race condition. If you think about it, this holds for managing projects / humans in general, but also for our desktop app that allows simple network connections via TCP socket.

Now for the first clarifications. Python gives you:

  • On any tkinter widget, you have .after() to call any function some time in the future, i.e. you enqueue that function call to be executed once the given time has passed – so this is nice for making stuff happen in the UI thread (see the small sketch after this list).
  • But even small stuff like writing some characters to a file might delay the interface reaction time, and people nowadays have no time for that (maybe I’m just particularly impatient).
  • Python’s standard library gives us the threading, asyncio and multiprocessing packages, and has done so for long enough to consider them mature.
  • There are also more advanced solutions, like looking into PyQt, or the mindset of “everything is a Web App nowadays”, and they might equip you with asynchronous handling from the start. But remember – the Schneiderei (tailor’s shop) in Softwareschneiderei means that we prefer tailor-made software over having to integrate bloated dependencies that we neither know nor want to maintain all year long.
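To illustrate the .after() mechanism mentioned above, here is a minimal, hypothetical sketch (the names background_producer, poll_updates and update_queue are made up for this example): a background thread puts results into a thread-safe queue, and the UI thread polls that queue via .after(), so all widget updates happen where they belong.

import queue
import threading
import time
import tkinter as tk

update_queue = queue.Queue()  # thread-safe hand-over point

def background_producer():
    # stand-in for "wait for something slow, then report a result"
    for i in range(10):
        time.sleep(1)
        update_queue.put(f"message {i}")

def poll_updates():
    # runs on the UI thread, so touching widgets here is safe
    while not update_queue.empty():
        label.config(text=update_queue.get_nowait())
    root.after(100, poll_updates)  # re-enqueue ourselves

root = tk.Tk()
label = tk.Label(root, text="waiting...")
label.pack()

threading.Thread(target=background_producer, daemon=True).start()
root.after(100, poll_updates)
root.mainloop()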

We conclude by shining a light on the general choice made in this project, and a few reasons why.

  1. What Python calls “threads” is not the same thing as what operating systems understand as a thread (which also differs among them). See also the Python Global Interpreter Lock. We do not want to dive into all of that here.
  2. multiprocessing is the only way to do anything outside the current system process, which is what you need for running CPU-heavy tasks in parallel. That is outside our use case, and while it is more “truly parallel”, it comes with slower startup, more expensive communication and also some constraints on that communication (e.g. only serializable data).
  3. We are IO-bound, i.e. we have no idea when the next network packet arrives, so we want to wait in separate event loops that would be blocking on their own (i.e. “while True”, similar to what tkinter does with our main thread).
  4. Because tkinter has stolen our main thread anyway, we need a threading.Thread in any case to allow concurrency, but then we face the choice of constructing everything with threads or employing the newer asyncio features.

The basic building block for that would use a threading.Thread:

class ListenerThread(threading.Thread):

    # initialize somehow

    def run(self):
        while True:
            wait_for_input()
            process_input()
            communicate_with_ui_thread()


# usage like
#   listener = ListenerThread(...) 
#   listener.start(...)

Thinking naively, the pseudo-code communicate_with_ui_thread() could then either lead to the tkinter .after() calls, or employ callback functions passed in during the initialization or the .start() call of the thread. But sadly, it was not as easy as that, because you run several risks:

  • just masking your intention and still executing your callbacks blockingly (that thread can freeze the UI),
  • still passing UI widget references to your background loop (throwing bad not-on-the-main-thread exceptions),
  • having memory leaks in the background gobble up more than you ever wanted,
  • lock starvation: a bad error telling you RuntimeError: can't allocate lock,
  • deadlocks caused by interdependent callbacks.

This list is surely not exhaustive.

So what are the options? Let me discuss these in an upcoming post. For now, let’s just let all these complications linger in your brain for a while.

A few more heuristics for rejecting Merges

For a few weeks now, I have been trying to find a few easy things to look for when facing a Merge Request (also called “Pull Request”) that is too large to be quickly accepted.

When facing a larger Merge Request, how can one rather quickly decide whether it is worth going through all the changes in one session, or whether this is too dangerous and should be rejected?

These thoughts apply to a medium-sized repository – I am of the opinion that if you happen to work in a large project, or contribute to a public open-source repository, one should never even aim for larger merge requests, i.e. they should be rejected if there is more than one reason why any code changed in that MR.

Being too strict just for the sake of it, in my eyes, can be a costly mistake – you waste your time on unnecessary structure and, in earlier / more experimental development stages, you might not want to take the drive out of a project. Nevertheless, maintainers need to know what’s going on.

Last time, I kept two main thoughts open, and I want to discuss them here, especially since they have now had time to flourish for a while in the tasty marinade that is my brain.

Can you describe the changes in one sentence?

I want my code to change for a multitude of reasons, but I want to know which kind of “glasses” to read these changes with. For me, it is a lesser problem to go through many changes if I can assign them all to the same “why”. I.e. introducing i18n might change many lines of code, but as these changes happen for the same reason, they can be understood easily.

But if, for some reason, people decide to change the formatting (replace tabs with spaces or such shenanigans), you had better make sure that this is the only reason any line changes. If there is any other thing someone did “because it just appeared easy”, reject the whole MR. This is no place for the “boy scout rule”; it is just too dangerous.

For me, it is too little to always apply the same type of glasses to every Merge Request there is. One could say “I only look for technical correctness”. But usually I can very well allow myself some flexibility there. I need to know, however, that all changes happened for only a few given reasons, because only then can I be sure that the developer did not actually lose track of their goal somewhere along the way.

Does this Merge increase the trust in a collaboration?

From a bird’s-eye point of view, people working together should always pay attention to whether a given trajectory goes in the direction of increasing trust. Of course, if you fix a broken menu button in the user interface of a large project, the MR should just do that – but if you are in a smaller project with the intention of staying there, I suggest that every MR expresses exactly that: “I understand what is important at the current stage of this collaboration and do exactly that”.

Especially when working together for a longer time, it can be easy to let the branching discipline slip a little – after all, things might have gone well for quite a while. But this is a fragile state, because if you then care too little about the boundaries of a specific MR, the trust can be damaged all too easily.

In a customer project, this trust goes out to the customer. This might be the difference between “if something breaks they’ll write you a mail, you fix it, they are happy” and “they insist on a certain test coverage”.

Conclusion

So basically, reviewing the code of others boils down to the same thing as writing your own code, or improving user experience, or managing anything: do not think in lots of small-scale checklists, think in terms of “cognitive load”. A good programmer should have a large set of possible glasses (mindsets) through which they can see code, especially foreign code. One should always be honest about whether a given change is compatible with only a very small number of reasons. If a Merge Request allows itself to do too much, that is not the Boy Scout Rule – it is a recipe for undermining mutual trust. Do not overestimate your own brain capacity. Reject the thing, and reserve that capacity for something useful.