Using custom Docker containers for development with WebStorm & Co.

Docker has become one of the go-to tools of many developers these days. Not because any project should implement as many technological buzz words per se, but due to their great deal of flexibility compared with their small hassle of setup.

For stuff like node-based applications, using a Dev Container is useful because in principle, you do not need to have any of the npm stuff on your actual machine – not only you avoid having these monstrous node_modules folders, but also avoid having accidental dependencies on some specific configuration that might hold true on your device, but not generally.

For some of these reasons probably, JetBrains included Docker Dev Containers as a kind of “remote” development. In a sense, a docker container can be thought of as a remote machine, regardless of the fact that it shares your local hardware and is just a software abstraction.

In my opinion, JetBrains usually does great software, but there is some weird behaviour in their usage of Docker Dev Containers and it took us a while to find a quite general and IDE-independent solution; I’ll just use WebStorm as an example of something that appeared unusually hard to tame. I guess it will become better eventually.

For now, one might think of using the built-in config like:

  1. New Run Configuration -> npm
  2. Node Interpreter: “…”
  3. “+” -> Add Remote… -> “Docker”
  4. Use an image of your choice, either one of the node base images or a custom one (see below) with its corresponding tag

Now for reasons that seem to be completely undocumented and unavoidable (tell me if you know more!), the IDE forces you to then mount your project to /opt/project inside this container, where it gets mirrored during runtime to somewhere /tmp/<temporary uuid>/ – and in several of our projects (due to our folder structure which is not even particularly abnormal) this made this option to be completely unusable.

The way one can work without these strange idiosyncrasies is as follows:

First, create a Dockerfile in which you do all the required setup. It might be an optional idea to set the user, away from “root” to something more restricted like “node” (even though in development, you probably have your eyes on everything nevertheless). You can do more custom setup here. This can look like

FROM node:16.18.0-bullseye-slim

WORKDIR /your-home-inside-container
RUN chown node .

COPY package.json package-lock.json /your-home-inside-container

USER node

RUN npm ci --ignore-scripts

# COPY <whatever you might want> <where you want it inside>

EXPOSE 3000

CMD npm start

From that Dockerfile, build a local image in the same folder like:

# you might need -f if the Dockerfile is not named "Dockerfile"
docker build -t your-dev-image .

Then, create a new Run Configuration but choose “Shell script” (not npm)

docker run -it --rm --entrypoint= -v ${PWD}/src:/your-home-inside-container/src -p 0.0.0.0:3000:3000 your-dev-image

You might use a different “-p” port forwarding if you do not want to have your development server broadcasting on port 3000 (another advantage of Dev Containers, you can easily run multiple instances on different ports).

This is about the whole magic. But there are two further things that could be important here:

Hot Reloading (live updating whenever source files change)

This is done rather easily, however seems to change once in a while. We figured out that at least if you are using react-scripts@5.0.1 (which is what “npm start” addresses, unless you do that differently), you just need to set the environment variable “WATCHPACK_POLLING=true”. I.e put that in your Dockerfile a

ENV WATCHPACK_POLLING true

or pass it into your docker run ... -e WATCHPACK_POLLING=true ... your-dev-image line

Routing a development proxy to some “local host”

If your software e.g. adresses a backend that is running on your development machine or another Docker Dev Container, it can not just access that host from inside the Docker container. Neither is the port forwarding via “-p …:…” of any use, because that addresses the other direction – i.e. what port from the container is exposed to outside access – here, we go the other direction.

When the software inside the container would actually want to address “localhost”, it needs to be directed at the host under which your local machine appears. Docker has a special hostname for that and it is host.docker.internal

I.e. if your local backend is running on “localhost:8080” on your machine, you need to tell your Dev Container to direct its requests to “host.docker.internal:8080”.

In one of our projects, we needed some specific control over the proxy that the React development server gives you and here is way to gain that control – add a “setupProxy.js” inside your src/ folder and put in it something like

const { createProxyMiddleware } = require('http-proxy-middleware');

module.exports = function(app) {
    if (process.env.LOCAL_DEVELOPMENT) {
        return;
    }

    let httpProxyMiddleware = createProxyMiddleware({
        target: process.env.REACT_APP_PROXY || 'http://localhost:8080',
        changeOrigin: true,
    });
    app.use('/api', httpProxyMiddleware); // change to your needs accordingly
};

This way, one can always change the address via setting a REACT_APP_PROXY environment variable as in the step above; and one can also disable the whole proxying by setting the LOCAL_DEVELOPMENT env variable to true. Name these as you like, and you can even extend this setupProxy to include web sockets or different proxies for different routes, if you have any questions on that, just comment below 🙂

Web Security for Frontend and Backend

The web is everywhere and we use it for tons of important tasks like online banking, shopping and communication. So it becomes increasingly important to implement proper security. As attacks like cross-site scripting (XSS) or cross-site request forgery (CSRF) are wide-spread browsers, web standards designers and web application developers implement more and more mechanisms to make such attacks harder or even impossible. This puts a certain burden on both frontend and backend developers.

Since security is hard and should not be an afterthought I would like to give you some advice when implementing a web app using a Javascript-frontend and a backend service written in some of the common languages/frameworks like .NET, Micronaut, Javalin, Flask or the like.

Frontend advice

I prefer traditional cookie-based sessions to JWT-based approaches for interactive web frontends because of simplicity, browser support and the possibility to use it without Javascript. For service-to-service communication bearer tokens of some kind may be more appropriate. Your Javascript client has to include the credentials in the fetch() calls to cause the browser to send the cookie.

Unfortunately, incorrect use of cookies may be insecure, so be sure to check up-to-date advice on cookies; see some hints below in the backend part because cookies are configured and issued there.

Backend advice

Modern web security requires additional measures on the server side to ensure secure authentication and communication with web clients. You should use https whereever possible to gain at least transport security and avoid many cases of sniffing credentials or changing content between client and backend.

Improving security of cookies

First of all, cookies should be HttpOnly so that scripts cannot access the contents of a cookie. Furthermore you should ideally set the SameSite and Secure attributes appropriately and use https whenever possible. That way you have mitigated the most common attacks on your session handling and authentication.

Another bonus for cookies is that browsers can inform you about problems with your cookie setup:

Configuring Cross-Origin Resource Sharing (CORS)

Nowadays it is common for web app to be served from a different host than the backend API. This is a potential problem because attackers may sneak scripts into the browser of a user and use the existing session to access the resources in an illegal way. Therefore another means of improving security of web apps running in browsers was introduced with the access control using CORS.

For browsers to be able to prevent or allow requests to certain resources the backend has to provide appropriate Access-Control-headers, most notably Access-Control-Allow-Origin and Access-Control-Allow-Credentials. Make sure to set these values correctly or your frontend will have trouble to access your backend or you introduce a potential security whole.

Fortunately many web frameworks make it easy to configure CORS, see Micronaut documentation for example.

Conclusion

Security is always important and browser vendors keep implementing additional measures to mitigate problems in the current web environment. Make sure you keep up with the latest advice and measures and implement them in your applications.

SQLite in ASP.NET 6.0: Access your database file via HTTP Endpoint

It is one of our fundamental principles to always choose the most-easy-while-capable tool for a job. For this, we try not to shower our customers with the newest, most hip technology available, but to use a technology stack we are

  • comfortable with
  • quick to provide the required minimum of customer value
  • keeping enough options open in order anything changes

One of the heavily affected aspects in that regard is the choice of data storage. There are a lot of different design paradigms one can choose from, but with the “most easy” aspect at hand, the question mostly resolves around the needs of the customer, not the wants (or “might be useful one day”) of the developer.

If your customer already has their PostgreSQL databases distributed in their Kubernetes as an example, it might be advisable to aim for that. If the customer does not have any integrated structure yet, I start with the question:

Is anything more necessary than a single-file database?

For one of our ASP.NET 6.0 applications, this was answered with the choice of Sqlite, due to it being native to the Microsoft universe including Entity Framework, which has many common use cases already answered, i.e. gives you way of caring about your application logic more than their database abstractions.

(It might be said that for .NET, an interesting project seems to have been LiteDB, which also operates on a single database file, but at the time of this writing, seems to have gone stale in development / support, and therefore fell out of my favour soon. Sad.).

Now we have a project in which we are closely in touch with the customer and their live system, very often had it been useful to access their platform and take a snapshot of the database for backup or assurance of our logic, and with the technical overhead in that specific case (which required several steps of sequentially granting remote access), I thought myself:

Why can’t I have a (sufficiently secured) HTTP endpoint that gives me this SQLite file as a File download?

The solution was a bit tricky because either the file was not read-accessible during that HTTP request (having been open already), the filestream was not possible because it was being closed too early, or the encoding of the resulting file would not fit. What finally worked was:

        private readonly static System.Text.Encoding enc1252 =
            CodePagesEncodingProvider.Instance.GetEncoding(1252);

        [HttpGet("database")]
        public ActionResult GetDatabase()
        {
            var dataSource = "sqlite.db";
            if (!System.IO.File.Exists(dataSource))
            {
                return NotFound(dataSource);
            }
            db.SaveChanges();
            // Note: CloseConnection() was not required!

            using var fs = new FileStream(dataSource, FileMode.Open, FileAccess.Read, FileShare.ReadWrite);
            var reader = new StreamReader(fs, enc1252);
            var data = reader.ReadToEnd();
            var ms = new MemoryStream(enc1252.GetBytes(data));
            return new FileStreamResult(ms, "application/octet-stream");
        }

Feel free to comment on that way because I found it more than none-trivial to arrive there, but maybe I missed something obvious. Some definite stumbling stones definitively were in

  • The Mime Type “application/octet-stream”, which for some reason would not work with the more adequately sounding choice “application/x-sqlite3” – I have no idea why.
  • The Encoding, which on our system was the Windows CodePages-1252 default, which needed to be specified not only in the interpretation of our bytes stream (second location), but also in the definition of the StreamReader itself (first location).
  • Please note that if your database is encoded via CP-1252, you also need the System.Text.Encoding.CodePages package (available via NuGet)
  • What looks like a missing “using”, is really intentional: If the StreamReader was opened with “using var reader = …”, it had the effect of being disposed before the request was handled correctly – I ran into an error of FileStreamResult: “Cannot access a closed Stream.” – keeping the StreamReader open solved that and the internet told me that this is still not a memory leak; the StreamReader reader gets disposed when the FileStream fs is disposed (see the using in front of that), but it still feels weird.

If you have any comments on that, I’d be very glad to learn from them, but if you don’t and you just have another use case for that problem – I’m happy to help!

Don’t just useCallback() with higher-order-functions

This is a small thing that once took me longer to debug than necessary, which is why it might be useful to some of you out there.

From time to time, we have that situation in a React application where it’s just not really avoidable that a small component has to accomplish a rather expensive computation. That’s what memoization is for, i.e. reusing the results of old computations when we know that these are still applicable.

React, in its functional approach, has three ways of memoiziating things, and for whole components there is React.memo(), while for usage inside a component we have the hooks React.useMemo() most commonly used for values or value-like objects, and React.useCallback() for functions. Because JavaScript is quite a functional languare, there is a rough equivalence between the latter two – but now I’m here to look into that.

// rather trivial function – these are equal React.useMemo(() => () => x, [x]); React.useCallback(() => x, [x]); // higher-order function – they are not! React.useMemo(() => higherOrderFunction(x), [x]); React.useCallback(higherOrderFunction(x), [x]);

There are various such higher-order components that are avilable for developers to use re-existing logic. One such case is debouncing, i.e. when you expect state changes to sometimes come in very large batches, the most common case probably a <input/> field whose value is triggering a server request or something like that. Other common cases would be drag’n’drop interactions or window resizing.

With a useRef(), one can rather easily write such debouncing oneself (google it or ask in the comments), but there is lodash.debounce which take care of that with such a higher-component function.

const MILLISEC = 500;

const Component = () => {
  const [value, setValue] = React.useState("");

  const handle = React.useMemo(() => debounce(event => { ... }, MILLISEC), []);

  return <input onChange={handle} value={value}/>;
};

Now I don’t want to talk about the specific case of debounce() (but one can look at the source code to guess its doing), this is just an example. Third-party logic is helpful when not-reinventing-the-wheel, but you can’t be that sure about computational costs, especially when some of your dependencies might update in the future – so that might be a good point to use memoization without actually seeing the benefit in the time of developing. (*)

As Dmitir Pavlutin here states nicely for that specific case, you can not juse write useCallback(debounce(...), []) here in place of useMemo. It is rather trivial but you need to take care: The JavaScript engine will have no other option than to execute the debounce() on creation of the callback, it can not know that this is something to be evaluated later.

Anything that is not an arrow function () => { ... } or an old-school function() { ... } will be evaluated when the corresponding line is reached. The syntax does not allow anything to be wrapped around it in order to delay that execution to the first call.

So. Debounce might not be the most expensive thing, and in general one might not even need memoization, but if you do – always remember that something has to be a function in order for any of that to work.

(*) This is not a call for premature optimization.

It cannot be stressed enough that one shouldn’t wrap every single computation into a memoization in either case. Sure, one should care about useless computations as stated above, but always know that the memo thing itself is not free. So when in doubt, think about how to quantify your specific gain, e.g. via the React DevTools Profiler, the performance API or at least logging of Date.now() timestamps.

Also, only think about performance when doing so. If there is any case of “my application actually behaves differently” when using useMemo / useCallback, this is a red flag – drop the thought of optimization instantly and care about your overall architecture first.

LDAP-Authentication in Wildfly (Elytron)

Authentication is never really easy to get right but it is important. So there are plenty of frameworks out there to facilitate authentication for developers.

The current installment of the authentication system in Wildfly/JEE7 right now is called Elytron which makes using different authentication backends mostly a matter of configuration. This configuration however is quite extensive and consists of several entities due to its flexiblity. Some may even say it is over-engineered…

Therefore I want to provide some kind of a walkthrough of how to get authentication up and running in Wildfly elytron by using a LDAP user store as the backend.

Our aim is to configure the authentication with a LDAP backend, to implement login/logout and to secure our application endpoints using annotations.

Setup

Of course you need to install a relatively modern Wildfly JEE server, I used Wildfly 26. For your credential store and authentication backend you may setup a containerized Samba server, like I showed in a previous blog post.

Configuration of security realms, domains etc.

We have four major components we need to configure to use the elytron security subsystem of Wildfly:

  • The security domain defines the realms to use for authentication. That way you can authenticate against several different realms
  • The security realms define how to use the identity store and how to map groups to security roles
  • The dir-context defines the connection to the identity store – in our case the LDAP server.
  • The application security domain associates deployments (aka applications) with a security domain.

So let us put all that together in a sample configuration:

<subsystem xmlns="urn:wildfly:elytron:15.0" final-providers="combined-providers" disallowed-providers="OracleUcrypto">
    ...
    <security-domains>
        <security-domain name="DevLdapDomain" default-realm="AuthRealm" permission-mapper="default-permission-mapper">
            <realm name="AuthRealm" role-decoder="groups-to-roles"/>
        </security-domain>
    </security-domains>
    <security-realms>
        ...
        <ldap-realm name="LdapRealm" dir-context="ldap-connection" direct-verification="true">
            <identity-mapping rdn-identifier="CN" search-base-dn="CN=Users,DC=ldap,DC=schneide,DC=dev">
                <attribute-mapping>
                    <attribute from="cn" to="Roles" filter="(member={1})" filter-base-dn="CN=Users,DC=ldap,DC=schneide,DC=dev"/>
                </attribute-mapping>
            </identity-mapping>
        </ldap-realm>
        <ldap-realm name="OtherLdapRealm" dir-context="ldap-connection" direct-verification="true">
            <identity-mapping rdn-identifier="CN" search-base-dn="CN=OtherUsers,DC=ldap,DC=schneide,DC=dev">
                <attribute-mapping>
                    <attribute from="cn" to="Roles" filter="(member={1})" filter-base-dn="CN=auth,DC=ldap,DC=schneide,DC=dev"/>
                </attribute-mapping>
            </identity-mapping>
        </ldap-realm>
        <distributed-realm name="AuthRealm" realms="LdapRealm OtherLdapRealm"/>
    </security-realms>
    <dir-contexts>
        <dir-context name="ldap-connection" url="ldap://ldap.schneide.dev:389" principal="CN=Administrator,CN=Users,DC=ldap,DC=schneide,DC=dev">
            <credential-reference clear-text="admin123!"/>
        </dir-context>
    </dir-contexts>
</subsystem>
<subsystem xmlns="urn:jboss:domain:undertow:12.0" default-server="default-server" default-virtual-host="default-host" default-servlet-container="default" default-security-domain="DevLdapDomain" statistics-enabled="true">
    ...
    <application-security-domains>
        <application-security-domain name="myapp" security-domain="DevLdapDomain"/>
    </application-security-domains>
</subsystem>

In the above configuration we have two security realms using the same identity store to allow authenticating users in separate subtrees of our LDAP directory. That way we do not need to search the whole directory and authentication becomes much faster.

Note: You may not need to do something like that if all your users reside in the same subtree.

The example shows a simple, but non-trivial use case that justifies the complexity of the involved entities.

Implementing login functionality using the Framework

Logging users in, using their session and logging them out again is almost trivial after all is set up correctly. Essentially you use HttpServletRequest.login(username, password), HttpServletRequest.getSession() , HttpServletRequest.isUserInRole(role) and HttpServletRequest.logout() to manage your authentication needs.

That way you can check for active session and the roles of the current user when handling requests. In addition to the imperative way with isUserInRole() we can secure endpoints declaratively as shown in the last section.

Declarative access control

In addition to fine grained imperative access control using the methods on HttpServletRequest we can use annotations to secure our endpoints and to make sure that only authenticated users with certain roles may access the endpoint. See the following example:

@WebServlet(urlPatterns = ["/*"], name = "MyApp endpoint")
@ServletSecurity(
    HttpConstraint(
        transportGuarantee = ServletSecurity.TransportGuarantee.NONE,
        rolesAllowed = ["oridnary_user", "super_admin"],
    )
)
public class MyAppEndpoint extends HttpServlet {
...
}

To allow unauthenticated access you can use the value attribute instead of rolesAllowed in the HttpConstraint:

@ServletSecurity(
    HttpConstraint(
        transportGuarantee = ServletSecurity.TransportGuarantee.NONE,
        value = ServletSecurity.EmptyRoleSemantic.PERMIT)
)

I hope all of the above helps to setup simple and secure authentication and authorization in Wildfly/JEE.

5 Not-so-Beginner’s React Pitfalls

React, in my opinion, has become quite a useful tool over the years. I admin I haven’t given the other major frameworks a try, but from the look of the resulting code, I only would give Svelte a real chance in the nearer future (in fact, you’d really have to pay me real big money to convince me about Angular).

Now with many of the more useful JS libraries, React is in a state where not only has it survived quite a time (reaching v18 only a few weeks ago), but also breeding a community that harbors a lot of valuable knowledge, enabling one to efecavoid the most common pitfalls at the beginning of your journey. There are lots of resources you can easily find online, from few-hour-courses to several posts in other blogs about the most common traps.

However, in our daily life it appears that there still are some very good points to make about how not to go about React’s unopinionatedness. So these are some of our own findings that I’ve not yet seen overly emphasized, and maybe they are here for your advantage.

1. HAVE YOUR STATES ATOMIC

It might happen that one migrates an older React component where functional programming wasn’t the norm yet, or out of whatever habit, that you declares something like a greedy React state as

const [state, setState] = useState({this: ..., that: ... , ..., ...});

Now your state profits much from immutability (think of this as “your machine then knows that it’s content is clear and unique, given any time”) and therefore you do not need to care about the same-or-not-sameness of state.that when evaluating state.this. Therefore, it is usually advised to split that up into several independent states as

const [this, setThis] = useState(...);
const [that, setThat] = useState(...);
...

That is more readable and everything. However, the most useful rule to build your states is not even to split everything up as small-as-possible, but rather, to have your states atomic. By that, we mean, “not needlessly large, but containing all what might change at the same time”.

One common example is basic data fetching. If you don’t choose to grab for react-query, which I personally like. But if you do e.g. a simple GET request, you usually do not only have “data” (some response), but also at least a “pending” (has the request finished yet?) and an “error” (is this response even usable?) field. These all change at the same time. Thus, they belong to the same entity. That state, designed atomically

const [query, setQuery] = useState({
    pending: false,
    data: null,
    error: null,
});

side note: you might choose not to use the null object as an initial value here because of the known problem of ambivalence with this object. For this illustration, it will suffice.

So, this query state now is atomic. Not to split further without serious consequences, as you will. If you had another, unrelated query, you would not just put it right into the same state entity; but if you had another property of that query (like e.g. a separate field for the status code, …), it would belong.

This helps in having more predictable useEffect, useMemo etc. dependency arrays. You can have an Effect depending on [query] as a whole and this makes complete semantic sense. It would be very hard to predict it’s behaviour, if you mashed multiple queries or whatever-state-you-can-think-of in there.

2.HAVE YOUR EFFECTS ATOMIC & TEAR THEM DOWN

Similarly, it is not super obvious (to the newcomer’s eye at least), that you can have multiple useEffects(). You can adhere to the Single Responsibility principle right there — the only good Effects are the ones that you can grasp in a twinkling of an eye. Use one each for every single thing you want to achieve, don’t lump multiple different things together in a somewhat-“constructor”-type of thinking. This keeps the dependency arrays small and controllable, and there are fewer cases of peculiar “But this CANNOT EVEN happen!!”.

Moreover, Effects have a function designed to clean them up, or the teardown function. If your Effect starts any larger operation and then for some reason your component get’s re-rendered before your operation is finished, you are likely to get hit by that effect in a state where you forgot about it already. You can follow this example

// example: listening to the scroll event
useEffect(() => {
    const handler = (event) => { /* ... */ };
    document.addEventListener('scroll', handler);
    return () => document.removeEventListener('scroll', handler);
}, []);

// or you might do something later in life
useEffect(() => {
    const timeout = setTimeout(() => { /* ... */ }, 5000);
    return () => clearTimeout(timeout);
}, []);

Some asynchronous operations might not have a simple teardown operation, but you can at least tell your Promises to disregard the effect. This is at least responsible for the very ugly

Warning: Can’t perform a React state update on an unmounted component. This is a no-op, but it indicates a memory leak in your application.

If you are responsible, you clean your Browser Console of all of these warnings. It appears if you call a setState-or-similar function at a point where the teardown actually should have happened. This pattern solves that case:

// this example uses a fetch Promise,
// but it also works for stale setTimeout handlers etc.

useEffect(() => {
    let mounted = true;
    fetch('/whatever').then(() => {
        if (mounted) {
            setState(true);
        }
    };
    return () => { mounted = false };
}, []);

// if you do not check for the value of mounted,
// the "memory leak" error can appear, if the
// fetch returns when the component updated meanwhile.

Side note: I also can not recall a single case in which the common React linter rule “exhaustivedeps” was worth ignoring. I had several occasions in which I believed to outsmart the stupid machine, only to end up in much larger problems down the road. Sure, things like Redux’ dispatch() might be cumbersome to include always, but I found that if I just make sure that exhaustive-deps never fires, I am more happy in the long run.

3.USEEFFECT() in too DEEP Functions

Especially in the context of data fetching, it might appear luring to put your useEffect() calls as deep (in the direction of the smallest components) as you can. Even more so, if you don’t have a rigid way of state management.

Now, I feel the point that this appears as “more modular” and flexible, but for me, has happend to situations where way too many requests were sent to our backends. You trade the modularity for the unpredictability of some Effects, so the best way I came to think of it was: Treat useEffect() like a bug.

I’m not saying that using it is wrong. But if you are wary of it’s appearance, this can help. Sometimes, it is just possible to do everything an Effect does – just completely outside React. Maybe, the Effect code can instead live in your index.js (as vanilla JS or otherwise) and just injected into your Root component, e.g. as props or via other libraries. E.g. with a Redux middleware, some effects can run with a higher degree of control about your state.

Remember: Modularity is not bad per se. It’s good. Don’t elevate the most particular effects to the top level of your application, but figure out where they can live well enough so you exactly know when they need to fire.

So far, there hasn’t been a case where I wished that I stuffed my useEffects further down to the virtual DOM leaves, but several, in which elevating them helped me a lot.

4. USE CUSTOM HOOKS with minimal interface

I consider it helpful, even for React beginners, to always be on the lookout of what could be its own React hook. A React Hook is any function that has a name beginning with “use” and for the most time, these consist of some combination of internal useState, useEffect, useContext and useRef definitions.

But their merit is in that they allow for much cleaner, dumber looking Components themselves – consider: dumb components are the best!

If they are only needed once, you can have them co-located next to where they are needed, but even just the act of giving them an own name makes for much more understandable code.

I use custom hooks for a lot of things, e.g.

  • having a State that is persisted in the localStorage / sessionStorage
  • having a State that updates in a debounced / throttled / delayed manner
  • standardizing very basic data fetching
  • accessing the window width at any time (nice for Responsive layout)
  • creating a React ref for an element with an “clicked outside” handler
  • standardized response of messages from connected websockets

I will now spare you the code, but if you have questions about any of these, just drop a comment.

One important point, though: Always have your interface minimal. E.g. if your custom hook has an internal setState(), think hard about whether you pass that function to the outside via the hook return value. Even if you are the only developer on a project, treat yourself as two different instances, one “framework designer” and one “framework consumer”, and as the designer, think hard about what havoc the consumer could do if you allow him too much.

5. Do not duplicate STATE informAtion (especially with react-router)

This applies to every state information, but it’s important to recognize that your URL route is just that: a kind of global state. One that your user can edit directly at any time, leaving the synchronization up to you.

So do not go about it by reading the URL parameters into some state that has it’s own setState! If you define a certain role of a state parameter in your URL, then it is your obligation to have a uni-directional data flow:

  1. From the route, that value flows into your application in a clearly-defined manner,
  2. where you act upon it as you wish, until you need to change it
  3. Then you change the route. Then go back to 1

Of course, one might imagine that in some cases you can not guarantee that. Then maybe do your own synchronization logic, but I would highly advise you to stash that away into e.g. a custom hook, or middleware if you use Redux, so that you can test it thoroughly and it won’t break too soon.

Further note: There are situations where it is quite sensible to have two very similar states, if they have a different responsibility. These are not a bug.

E.g. if you GET a value from a server, then edit it in a controlled <input/> field, and PUT it to the server again, you do not wish to do so on every key press. Then these are not meant to be the same:

  1. the value as you currently know it from the server
  2. the value as it exists inside the <input/>

These are semantically different. They can and should be a different state entity. But if you have something that is utterly dependant on one other state, then chances are you do not really need another entity.

All in all,

that turned out longer than I envisioned it to be become. But I hope it is of any help to any React coders who managed the absolute basics and now are prone to the next-level pitfalls.

The good news is that after a certain bunch of hardships, there is rarely the case of even more surprises. So, manage your state and effects responsibly, especially the asynchronous ones, and the rest are practices that apply for any software development.

Or am I misled?

Serving static resources in Javalin running as servlets

Javalin is a nice JVM-based microframework targetted at web APIs supporting Java and Kotlin as implementation language. Usually, it uses Jetty and runs standalone on the server or in a container.

However, those who want or need to deploy it to a servlet container/application server like Tomcat or Wildfly can do so by only changing a few lines of code and annotating at least one Url as a @WebServlet. Most of your application will continue to run unchanged.

But why do I say only “most of your application”?

Unfortunately, Javalin-jetty and Javalin-standalone do not provide complete feature parity. One important example is serving static resources, especially, if you do not want to only provide an API backend service but also serve resources like a single-page-application (SPA) or an OpenAPI-generated web interface.

Serving static resources in Javalin-jetty

Serving static files is straightforward and super-simple if you are using Javalin-jetty. Just configure the Javalin app using config.addStaticFiles() to specify some paths and file locations and your are done.

The OpenAPI-plugin for Javalin uses the above mechanism to serve it’s web interface, too.

Serving static resources in Javalin-standalone

Javalin-standalone, which is used for deployment to application servers, does not support serving static files as this is a jetty feature and standalone is built to run without jetty. So the short answer is: you can not!

The longer answer is, that you can implement a workaround by writing a servlet based on Javalin-standalone to serve files from the classpath for certain Url-paths yourself. See below a sample implementation in Kotlin using Javalin-standalone to accomplish the task:

package com.schneide.demo

import io.javalin.Javalin
import io.javalin.http.Context
import io.javalin.http.HttpCode
import java.net.URLConnection
import javax.servlet.annotation.WebServlet
import javax.servlet.http.HttpServlet
import javax.servlet.http.HttpServletRequest
import javax.servlet.http.HttpServletResponse

private const val DEFAULT_CONTENT_TYPE = "text/plain"

@WebServlet(urlPatterns = ["/*"], name = "Static resources endpoints")
class StaticResourcesEndpoints : HttpServlet() {
    private val wellknownTextContentTypes = mapOf(
        "js" to "text/javascript",
        "css" to "text/css"
    )

    private val servlet = Javalin.createStandalone()
        .get("/") { context ->
            serveResource(context, "/public", "index.html")
        }
        .get("/*") { context ->
            serveResource(context, "/public")
        }
        .javalinServlet()!!

    private fun serveResource(context: Context, prefix: String, fileName: String = "") {
        val filePath = context.path().replace(context.contextPath(), prefix) + fileName
        val resource = javaClass.getResourceAsStream(filePath)
        if (resource == null) {
            context.status(HttpCode.NOT_FOUND).result(filePath)
            return
        }
        var mimeType = URLConnection.guessContentTypeFromName(filePath)
        if (mimeType == null) {
            mimeType = guessContentTypeForWellKnownTextFiles(filePath)
        }
        context.contentType(mimeType)
        context.result(resource)
    }

    private fun guessContentTypeForWellKnownTextFiles(filePath: String): String {
        if (filePath.indexOf(".") == -1) {
            return DEFAULT_CONTENT_TYPE
        }
        val extension = filePath.substring(filePath.lastIndexOf('.') + 1)
        return wellknownTextContentTypes.getOrDefault(extension, DEFAULT_CONTENT_TYPE)
    }

    override fun service(req: HttpServletRequest?, resp: HttpServletResponse?) {
        servlet.service(req, resp)
    }
}

The code performs 3 major tasks:

  1. Register a Javalin-standalone app as a WebServlet for certain URLs
  2. Load static files bundled in the WAR-file from defined locations
  3. Guess the content-type of the files as good as possible for the response

Feel free to use and modify the code in your project if you find it useful. I will try to get this workaround into Javalin-standalone if I find the time to improve feature-parity between Javalin-jetty and Javalin-standalone. Until then I hopy you find the code useful.

Mutable States can change inside your Browser console log

So we know, that web development must be one of the fastest-changing ecospheres humankind has ever seen (not to say, JavaScript frameworks and their best practices definitely mutate similar in frequency and deadliness as Coronaviruses). While these new developments can also come with great joy and many opportunities, this means that once in a while, we need to take care of older projects which were written in a completely different mindset.

It’s somehow trivial: Even when your infrastructure is prone to constant shifts, any Software Developer holding at least some reputation should strive to write their code as long-living and maintainable as originally intended. Or longer.

But once in a while you run into legacy code that you first have to dissect in order to understand their working. And for JS, this usually means inserting console.log() statements at various places and to trace them during execution (yeah, I know, there’s a plenitude of articles telling you to stop that, but let’s just stay at the most basic level here).

Especially in an architecture with distributed, possibly asynchronous events (which helps in reducing coupling, see e.g. Mediator and Publish-Subscribe patterns), this can help your bugtracing. But there’s a catch. One which took me some time to actually understand as quite the villain.

It does not make any sense to me, but for some reason, at least Chrome and Firefox in their current implementation save some effort when using console.log() for object entities. As in, they seem to just hold a reference for lazy evaluation. It can then be that you look upwards at your log, maybe even need to scroll there, look at some value and then not realize that you are looking at the current state, not the state at time of logging!

Maybe that was clear to you. Maybe it never occured to you because you always cared about using your state immutably. But in case you are developing on some legacy code and don’t know about what your predecessor did everywhere, you might not be prepared.

You can visualize that difference easily by yourself. Consider that short JS script:

var trustfulObject = {number: 0};
var deceptiveObject = {number: 0};

// let's just increase these numbers once each second
setInterval(() => {
    console.log("let's see...", trustfulObject, deceptiveObject);
    trustfulObject = {number: trustfulObject.number + 1};
    deceptiveObject.number = deceptiveObject.number + 1;
}, 1000);

Let that code run for a while and then open your Browser console. Scroll upwards a bit and click on some of the objects. You will find that the trustfulObject is always enumerated as supposed (at the time of logging), while the deceptiveObject will always show the number at the time of clicking. That surely surprised me.

In case you are still wondering why: The trustfulObject is freshly created each step and then reassigned to your reference variable. It seems the Browser has no other choice than logging the old (correct) state, because the reference is lost afterwards. The deceptiveObject holds the same reference during the whole runtime, which somehow makes it look more efficient to the Browser to just not evaluate anything until you want to know the value.

And then, it lies to you. ¯\_(ツ)_/¯

Two notes:

  1. If you really have to deal with legacy code of a given size where you cannot easily change that behaviour, you can log your object using JSON.stringify, i.e. console.log("let's see…", trustfulObject, JSON.stringify(deceptiveObject)); avoids that lazy evaluation.
  2. Note: Not to be confused, the JS “const” keyword does exactly the opposite of creating an immutable object. It creates an immutable reference, i.e. you can only manipulate their content afterwards. Exactly what you not want.

Of course, in modern times you probably wouldn’t write vanilla JS, and e.g. using React useState definitely reduces that issue. But still. If you don’t want to use React & Co. everywhere, then… pay attention.

The ever-connecting WebSocket

This is another of these „funny how we live in a time, where we take connectivity for granted“-posts. But what is taken for granted, usually still is somewhat cumbersome under the hood. As in our current episode.

Admittedly, the arrival of WebSockets in the last decade were one of the more significant steps towards a fluid internet experience. The WebSocket protocol is an advancement from the old „some client asks some server to handle some stuff“ way in that it is bi-directional: After mutual agreement („hand shake“), the connection stays open for the server to send data to the client, without the client having to ask first. Consider the server to be a complex application which processes lots of tasks and from time to time creates some „news“ for the client, which the user might want to read in real time.

Nowadays, the WebSocket itself is long established. What surprised us a few weeks, however – and what made us invest several days in actual research – is their behaviour when paired with loss of internet connection. Which had quite some surprise for us.

Now, this is a real scenario for one of our customers. You have a web application running on a mobile device, and this device moves in and out of WiFi-accessible areas all the time. The application should just show this circumstance and attempt to reconnect. Now the straightforward thing was to use the native WebSocket API class, or the “websocket” npm package (which acts as a small wrapper around that API); this comes with a small enough set of event handlers (onopen, onclose, onerror, onmessage). but the less obvious thing was: How is “connection lost” actually noticed? Is it onerror? Is it onclose?

In reality, this is not clear at all. Depending on the type of internet loss, there might occur a delay of several minutes until onclose fires, and onerror alone seems not to imply any closing at all. Furthermore, it depended on the type of internet loss. How do you even simulate “mobile device walked away from WiFi” as accurately as possible? While disconnecting our WiFi seemed to register with almost no delay, this was too far from the real scenario. It was only after switching to an ethernet cable and then unplugging it, that we saw the effect. And we found that the onclose event is actually quite confused if we reconnect our cable before it has fired. It could happen, then, that one old onclose did not fire until a new WebSocket was already opened, i.e. not a good indicator of “no connection” at all.

This confusion made it clear that the WebSocket technology is not as well defined as we thought it was. We actually resorted to one of the most basic ideas in order to notice our “(dis)connected” state: Continuously checking for it. Indeed – as low-level as it sounds.

We found that following solution to work quite well:

  • The server continuously sends a “heart beat” over the WebSocket. We are aware that there is a websocket.ping() method but we didn’t want to run into more surprises here.
  • WebSocket handling is done inside our own module which
    • wraps the WebSocket onmessage event in order to expect that heart beat or else “the watchdog gets angry”
    • has its own onclose event which communicates the problem to the outside as early as possible
    • also, instantly tries to reconnect
    • wraps the WebSocket onclose event in order to make it quiet if the watchdog gets angry and it would fire too late; but otherwise fire (if the watchdog is happy and the WebSocket is closed normally).

The latter implements Loose Coupling / the Principle of Least Knowledge / Separation of Concerns. We do not want our module to have a much larger interface than the original WebSocket implementation. In fact, the only information from our application to our new module is “is the user logged in”? In our application, this is part of the Redux state, but we want our module to know neither of React, Redux or other magic; it should be vanilla TypeScript in order be testable, or even better, so straightforward that any tests would be trivial.

So there we have it. If you are interested in the code, I’d be glad to share that, but the actual deed here was in finding out what we actually need.

I have no idea why the WebSocket specification is the way it is, but if you ever encounter such a problem, that would be my advice – take the thing, put it in your own thing, and couple the things loosely.

But anyway, it was fun to realize that even in 2021, a two-way-connected client-server system still might need a small guardian that tells you whether everything’s fine.

Addendum: Monkey-patching an existing class in TypeScript

I leave that here for quick reference. As stated above, we needed to equip our websocket instances with a flag to ignore their onclose events. Now some sources might readily give you the quick advice to do it as:

const socket = new w3cwebsocket(...);
(socket as any).silent = false;

But why use TypeScript if you want to work around the type system anyway? Just extend it.

class CustomWebSocket extends w3cwebsocket {
    silent: boolean = false;
    constructor(url: string) {
        super(url);
    }
}

const socket = new CustomWebSocket(...);

Understanding, identifying and fixing the N+1 query problem

One of the most common performance pitfalls for applications accessing data from databases is the so-called “N+1 query problem”, or sometimes also called the “N+1 selects problem”. It is the first thing you should look for when an application has performance issues related to database access. It is especially easy to run into with object-relational mappers (ORMs).

The problem

The problem typically arises when your entity-relationship model has a 1:n or n:m association. It exists when application code executes one query to get objects of one entity and then executes another query for each of these objects to get the objects of an associated entity. An example would be a blog application that executes one query to fetch all authors whose names start with the letter ‘B’, and then another query for each of these authors to fetch their articles. In pseudocode:

# The 1 query
authors = sql("SELECT * FROM author WHERE name LIKE 'B%'");

# The N queries
articles = []
FOR EACH author IN authors:
    articles += sql("SELECT * FROM article WHERE author_id=:aid", aid: author.id)

The first query is the “1” in “N+1”, the following queries in the loop are the “N”.

Of course, to anybody who knows SQL this is a very naive way to get the desired result. However, OR mappers often seduce their users into writing inefficient database access code by hiding the SQL queries and allowing their users to reach for the normal tools of their favorite programming language like loops or collection operations such as map. A lot of popular web application frameworks come along with OR mappers: Rails with Active Record, Grails with GORM (Hibernate based), Laravel with Eloquent.

How to detect

The easiest way to detect the problem in an application is to log the database queries. Virtually all ORMs have a configuration option to enable query logging.

For Grails/GORM the logging can be enabled per data source in the application.yml config file:

dataSource:
    logSql: true
    formatSql: true

For Rails/ActiveRecord query logging is automatically enabled in the development environment. Since Grails 5.2 the Verbose Query Logs format is enabled by default, which you had to explicitly enable in earlier versions.

For Laravel/Eloquent you can enable and access the query log with these two methods/functions:

DB::connection()->enableQueryLog();
DB::getQueryLog();

Once query logging is enabled you will quickly see if the same query is executed over and over again, usually indicating the presence of the N+1 problem.

How to fix

The goal is to replace the N+1 queries with a single query. In SQL this means joining. The example above would be written as a single query:

SELECT article.*
FROM article
JOIN author
  ON article.author_id=author.id
WHERE author.name LIKE 'B%'

The query interface of ORMs usually allows you to write joins as well. Here the example in ActiveRecord:

Article.joins(:authors).where("authors.name LIKE ?", "B%")

Another option when using ORMs is to enable eager loading for associations. In GORM this can be enabled via the fetchMode static property:

class Author {
    static hasMany = [articles: Article]
    static fetchMode = [articles: 'eager']
}

REST APIs

The problem isn’t limited to SQL databases and SQL queries. For REST APIs it’s the “N+1 requests problem”, describing the situation where a client application has to call the server N+1 times to fetch one collection resource + N child resources. Here the REST-API has to be extended or modified to serve the client’s use cases with a single request. Another option is to offer a GraphQL API instead of a REST API. GraphQL is a query language for HTTP APIs that allows complex queries, so the client application can specify exactly what resources it needs with in a single request.