When open source is only marketing

Nowadays, Open Source Software (OSS) is everywhere and almost everyone uses it somewhere: in the browser, in the operating system of your phone, in libraries inside your favourite games and apps, or in the web server and infrastructure software delivering websites and other data like music or videos to you. Basically, OSS is everywhere – even if it is not always openly visible.

When developing software we often use libraries and frameworks that are open source. For us it is great for a couple of reasons:

  • Often free of cost
  • We can have a look under the hood to learn how something is done
  • We can check and review the code for security issues to increase trust
  • We can improve the software, adapt it to our use case or fix issues plaguing us

In using OSS and probably contributing to it by either feedback or code we strengthen the software development community and push our domain forward step by step.

Some companies noticed the positive perception of OSS and started to build their businesses on it. They pay developers to develop or contribute to OSS and provide services around it, offering assistance, support or solutions based on their OSS. Companies like Red Hat or Nextcloud have quite some success with this model.

I personally think it is a great way of doing business, leading to a win-win situation in the IT world: a company can make money and millions of other developers can increase their productivity or develop their own solutions standing on the shoulders of giants (aka OSS).

Open Source done wrong

Sometimes, I feel open source is only used as a label and marketing tool. Let me illustrate with an example I stumbled upon a few weeks ago:

There is an OSS JavaScript library with a GitHub repository, issue tracker and so on. Sounds like a proper open source project, doesn’t it?

Well, there is an issue – or rather a missing feature – that some people noticed, and somebody even opened a pull request with a proposed fix.

The fix does not break anything, consists of adding 5 lines in one file and some CSS styles in another, and only adds a feature to one component of the library.

Still, the pull request (PR) was simply turned down and closed with a comment along the lines of

“We do not think this is necessary, so we do not accept it.”

A follow-up issue 3 (!) years later was closed with a copy&paste comment, too.

While it is of course always the right and even the obligation of the maintainers (look at the recent supply chain attacks!) to decide what gets in and what does not, this sometimes just has the feel of “not invented here”. Blocking unobtrusive, small and sensible changes without real reasons is very counter-productive. It defeats some of the main advantages of an open source solution:

Developers are not able to give back and get their improvements upstream. This leads to workarounds, forks or separately maintained patches. That in turn means overhead and more overall effort needed to build and maintain a solution, thus negating part of the benefit of using the OSS in the first place.

Do not get me wrong: closed/proprietary software does not even give you many of the options and advantages of OSS. It offers more of a “take it as it is” approach and leaves it up to you to decide if it is worth it or not.

Doing it right

On the other hand there are tons of well-governed open source projects that really consider contributions for upstream and listen to the community. We have had several successful contributions to the TANGO and Grails projects, for example.

So if you or your company run an open source project I advise you to be open and to listen to your users. Working as a loosely connected distributed team with a good steering team at the helm leads to many potential win-win situations and makes the (software development) world a better place.

Swagger-UI for any JVM-based backend

We often implement web applications with a React frontend and one of a large pool of backend frameworks/technologies. These include Micronaut, .NET Framework, Javalin, Flask, Eclipse Jetty among others.

API documentation that also allows calling the endpoints can be very helpful even during development to illustrate the API usage. OpenAPI and implementations like Swagger and Swashbuckle fulfill this task quite well.

While many of the backend frameworks support documenting and calling the API using Swagger-UI out-of-the-box or via plugins, some frameworks like Undertow do not have direct support for it. Nevertheless, it is relatively easy to add an OpenAPI documentation with a web interface to almost any backend. We demonstrate the setup here using Undertow because it is used in some of our projects.

Adding dependencies and build config

Since we mostly use Gradle for our JVM-based backends we will highlight the additions to build.gradle needed to generate the OpenAPI definition. It consists of the following steps:

  • Adding the Swagger gradle plugin
  • Adding dependencies for the required annotations
  • Configuring the resolve-task of the gradle plugin

Here is an example excerpt of our build.gradle:

plugins {
  id 'java'
  id 'application'
// ...
  id "io.swagger.core.v3.swagger-gradle-plugin" version "2.2.20"
}

dependencies {
// ...
    implementation 'io.undertow:undertow-core:2.3.12.Final'
    implementation 'io.undertow:undertow-servlet:2.3.12.Final'
    implementation 'io.swagger.core.v3:swagger-annotations:2.2.20'
    implementation 'org.jboss.resteasy:jaxrs-api:3.0.12.Final'
}

resolve {
    outputFileName = 'demo-server'
    outputFormat = 'JSON'
    prettyPrint = 'TRUE'
    classpath = sourceSets.main.runtimeClasspath
    resourcePackages = ['com.schneide.demo.server', 'com.schneide.demo.server.api']
    outputDir = layout.buildDirectory.dir('resources/main/swagger').get().asFile
}

Adding the annotations

We need to add some annotations to our code so that the OpenAPI JSON (or YAML) file will be generated.

The API root class looks like below:

@OpenAPIDefinition(info =
@Info(
        title = "Demo Server Web-API",
        version = "0.11",
        description = "REST API for the demo web application..",
        contact = @Contact(
                url = "https://www.softwareschneiderei.de",
                name = "Softwareschneiderei GmbH",
                email = "kontakt@softwareschneiderei.de")
)
)
public class ApiHandler {
    public ApiHandler() {
        super();
    }

    /**
     *  Connect our handlers
     */
    public RoutingHandler createApiHandler() {
        final RoutingHandler api = new RoutingHandler();
        api.get("/demo", new DemoHandler());
        // ...
        return api;
    }
}

We also refactored our handlers to separate the business API from the Undertow handler interface methods in order to generate an expressive API.

The result looks something like this:

@Path("/api/demo")
public class DemoHandler implements HttpHandler {
    public DemoHandler() {
        super();
    }

    @Override
    public void handleRequest(HttpServerExchange exchange) throws Exception {
        exchange.getResponseHeaders().put(Headers.CONTENT_TYPE, "application/json");
        final Map<String, Deque<String>> params = exchange.getQueryParameters();
        final int month = Integer.parseInt(params.get("month").getFirst());
        final int year = Integer.parseInt(params.get("year").getFirst());
        exchange.getResponseSender().send(new Gson().toJson(getDaysIn(year, month)));
    }

    @GET
    @Operation(
            summary = "Get number of days in the given month and year",
            responses = {
                    @ApiResponse(
                            responseCode = "200",
                            description = "A number of days in the given month and year",
                            content = @Content(mediaType = "application/json",
                                    schema = @Schema(implementation = Integer.class)
                            )
                    )
            }
    )
    public Integer getDaysIn(@QueryParam("year") int year, @QueryParam("month") int month) {
        return YearMonth.of(year, month).lengthOfMonth();
    }
}

When running the resolve task (e.g. with ./gradlew resolve), all of the above results in an OpenAPI definition file in build/resources/main/swagger/demo-server.json.

Swagger-UI for the API definition

Now that we have this API definition we can use it to generate clients and – more importantly for us – a web UI documenting the API and allowing us to execute and demo the functionality. For this we simply download the Swagger-UI distribution and place the contents of the dist/ folder in src/main/resources/swagger-ui. We then have to let Undertow serve the definition and the UI like so:

class DemoServer {
    public DemoServer() {
        final GracefulShutdownHandler rootHandler = gracefulShutdown(createHandler());
        Undertow.builder().addHttpListener(8080, "localhost").setHandler(rootHandler).build().start();
    }

    private HttpHandler createHandler() {
        return path()
                .addPrefixPath("/api", new ApiHandler().createApiHandler())
                .addPrefixPath("/swagger-ui", resource(new ClassPathResourceManager(getClass().getClassLoader(), "swagger-ui/"))
                        .setWelcomeFiles("index.html"))
                .addPrefixPath("/swagger", resource(new ClassPathResourceManager(getClass().getClassLoader(), "swagger/")));
    }
}

Note: I tried using the swagger-ui webjar but was unable to configure the location (the URL) of my OpenAPI definition file. Therefore I used the plain swagger-ui download instead.
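
To make the downloaded UI load our definition by default, its URL has to be set in the dist as well. Depending on the Swagger-UI version this happens in swagger-initializer.js or directly in index.html; the exact file differs, but the change is essentially a one-liner pointing the UI at the definition served by Undertow under /swagger:

window.onload = function() {
  window.ui = SwaggerUIBundle({
    // point the UI at our generated OpenAPI definition
    url: "/swagger/demo-server.json",
    dom_id: '#swagger-ui',
  });
};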

Wrapping it up

We have to do some setup work and potentially some refactorings to provide a meaningful API documentation for our backend. After this initial work it is mostly a matter of adding some annotations to the methods and types used in your web API.

Serving static files from a JAR with Jetty

Web applications are often deployed into some kind of web server or web container like Wildfly, Tomcat, Apache or IIS. For single-instance services this can be overkill and serving requests can easily be done in-process without interfacing with some external web container or web server.

A proven and popular framework for an in-process web container and webserver is Eclipse Jetty for Java. For easy deployment and distribution you can build a single application archive containing everything: the static resources, web request handling, all your code and dependencies needed to run your application.

If you package your application that way there is one caveat when trying to serve static resources from this application archive. Let us have a look at how to do it with Jetty.

Using the DefaultServlet for static resources on the file system

Jetty comes with a servlet implementation for serving static resources out-of-the-box. It is called DefaultServlet and can be used like below:

Server server = new Server();
ServerConnector connector = new ServerConnector(server);
connector.setHost(listenAddress);
connector.setPort(listenPort);
server.addConnector(connector);
var context = new ServletContextHandler();
context.setContextPath("/");
var defaultServletHolder = context.addServlet(DefaultServlet.class, "/static/*");
defaultServletHolder.setInitParameter("resourceBase", "/var/www/static");
server.setHandler(context);
server.start();

This works great in the case of static resources available somewhere on the filesystem.

But how do we specify where the DefaultServlet can find the resources within our application archive?

Using the DefaultServlet for static in-JAR resources

The only thing that we need to change is the init-parameter called resourceBase from a normal file path to a path in the JAR. What does a path to the files inside a JAR look like and how do we construct it? It took me a while to figure it out, but here is what I came up with and it works perfectly in my use cases:

private String getResourceBase() throws MalformedURLException {
    URL resourceFile = getClass().getClassLoader().getResource("static/existing-file-inside-jar.txt");
    return new URL(Objects.requireNonNull(resourceFile).getProtocol(), resourceFile.getHost(), resourceFile.getPath()
            .substring(0, resourceFile.getPath().lastIndexOf("/")))
            .toString();
}

The method results in a string like jar:file:/path/to/application.jar!/static. Using such a path as the resourceBase for the DefaultServlet allows you to serve all the content of the /static directory inside your application JAR (or whatever other JAR contains the class this method resides in).
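
Wiring this into the servlet setup from the first example is then just a matter of changing the init parameter:

var defaultServletHolder = context.addServlet(DefaultServlet.class, "/static/*");
// use the location inside our application archive instead of a directory on the file system
defaultServletHolder.setInitParameter("resourceBase", getResourceBase());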

A few notes on the code

Why don’t we just hardcode a path after the jar: protocol? Well, the file path may change in several scenarios:

  • Running the application on a different platform or operating system
  • Renaming the application archive – it could contain the version number for example…
  • Installing or copying the application archive to a different location on the file system

Using an existing resource and reusing the relevant parts of its URL for the resource base directory solves all these issues because the path is computed at runtime.

However, the code assumes that there is at least one resource available in the JAR and that its path is known at compile time.

In newer JDKs like 21 LTS constructing a URL using the constructor is deprecated, but I did not bother to rewrite the code to use URI because of time constraints. That is left up to you or a future release…
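
For the curious, one possible direction is to skip the URL constructor entirely and simply cut the string representation of the resource URL – an untested sketch, not what currently ships in our code:

private String getResourceBase() {
    URL resourceFile = getClass().getClassLoader().getResource("static/existing-file-inside-jar.txt");
    String url = Objects.requireNonNull(resourceFile).toString();
    // cut off the known file name, keeping e.g. jar:file:/path/to/application.jar!/static
    return url.substring(0, url.lastIndexOf('/'));
}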

As always I hope someone finds the code useful and drops a comment.

Improving search for special content

Nowadays many applications are very complex or handle so much data that users need and expect a fast and powerful search functionality. Popular search engines you can leverage are ElasticSearch and Solr. Both use Lucene for index management under the hood and provide similar functionality regarding indexing and text search.

In previous posts I already gave some advice on using and improving search in your applications using these engines. For example a how-to for .NET (core) with ElasticSearch and using n-grams or wildcard fields for substring search.

Even if these frameworks work really great, implementing a great search functionality can be quite hard, especially when handling special data like DOIs, IP addresses, hostnames and other text containing special characters.

What’s the deal with special characters?

The standard analyzers are tuned for natural-language texts. They split words on punctuation, whitespace and certain special characters, which are usually filtered out and not indexed. So if you index something like my-cool-hostname, neither the dashes nor the complete string will land in the index, leaving only the separate parts my, cool and hostname.

The problem is that neither an exact match nor a substring search like my-cool will yield any results. That is not what your users expect…

Making your search work with special data

There are several ways to improve your search functionality for fields or texts containing different kinds of non-language strings. Here are some simple options that will improve or fix the handling of problematic cases and make your search work as your users expect.

My examples use ElasticSearch features and wording, so they may be called differently for other search engines.

Using a keyword field

If you only need an exact match on some special data field – like a DOI such as 10.1000/182 containing punctuation and slashes, where a user normally just copy & pastes the search string from somewhere – then using a keyword field for indexing instead of the text type may be the better option. Usually this is easy to implement and fast for indexing and searching.
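
In ElasticSearch such a mapping could look roughly like this (index and field names are made up for the example):

PUT /documents
{
  "mappings": {
    "properties": {
      "doi": { "type": "keyword" }
    }
  }
}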

Using multiple index fields for one data field

ElasticSearch also offers the possibility to index the same data using multiple index fields. So you can keep substring features while adding exact-match or special-character support by adding different index fields, like the keyword type above or an index field using a different analyzer (see the option below). This is called a multi-field in ElasticSearch. Multi-fields can also be used to improve sorting and scoring of matches.
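
A multi-field mapping that keeps full-text search and adds an exact-match sub-field could look like this (again with made-up names):

PUT /hosts
{
  "mappings": {
    "properties": {
      "hostname": {
        "type": "text",
        "fields": {
          "exact": { "type": "keyword" }
        }
      }
    }
  }
}

Queries can then target hostname for full-text search and hostname.exact for exact matches.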

Using different analyzers

As I mentioned before, the standard analyzers for text fields use tokenization rules and character filters suited for natural language. Sometimes you want to keep words together or preserve special characters. To implement an appropriate search for hostnames or IP addresses you could, for example, use a custom analyzer with a whitespace or pattern tokenizer.
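
A sketch of such a custom analyzer using the whitespace tokenizer (so my-cool-hostname stays one token) might look like this:

PUT /hosts
{
  "settings": {
    "analysis": {
      "analyzer": {
        "hostname_analyzer": {
          "type": "custom",
          "tokenizer": "whitespace",
          "filter": [ "lowercase" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "hostname": {
        "type": "text",
        "analyzer": "hostname_analyzer"
      }
    }
  }
}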

Conclusion

Default full text search works great out-of-the-box in many cases. However, there are many cases of special, structured data where you need to fine-tune the way the index gets populated.

Many approaches can be combined by using different analyzers and indexing the data in several ways.

There is a lot you can do to provide awesome search capabilities to your users but that requires quite some knowledge of the way the search engines work and about the data you want to be searchable.

Using JSON-Schema for data exchange

Several years ago XML was a quite popular document format – mostly due to its schema validation possibilities and clearly defined structure. Many libraries made working with such data documents possible (not really nice or a pleasure…) and humans could read them if need be. XML as a text format is programming language agnostic and processable in practically all useful programming environments.

Working with XML was always more of a pain for me. Fortunately, a lot of time has passed since then and alternatives like JSON, YAML and TOML arose. All of them have their strengths and weaknesses and can fill similar roles as XML.

In general they have 2 things in common compared to XML:

  1. superior readability
  2. lacking validation compared to XML schema

Nowadays, JSON is very widespread due to the popularity of JavaScript and is perhaps the most used data exchange format across the internet. Despite some syntactic quirks like its strictness about commas and the lack of comments, it is imho quite a good format. It is concise, human-readable, flexible and relatively simple. Many languages treat it like nested dictionaries, so understanding and working with JSON is easy.

The main drawbacks are missing documentation and validation.

Enter JSON schema

JSON schema is a specification with accompanying libraries that fixes the major issues with JSON. You define a schema for your data documents in JSON and put it in separate files. This adds the missing features I complained about to your JSON data documents: documentation and the possibility of automatic validation.

What does a simple JSON schema file look like? Let us have a look:

{
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "$id": "https://cars.softwareschneiderei.com/car.schema.json",
  "title": "Car",
  "description": "Describing some important properties of a car",
  "type": "object",
  "properties": {
    "manufacturer": {
      "description": "The company producing the car.",
      "type": "string"
    },
    "model": {
      "description": "The name of the car model.",
      "type": "string"
    },
    "engineType": {
      "description": "One of the available engine types.",
      "enum": [
        "gasoline",
        "diesel",
        "hybrid",
        "electric"
      ]
    },
    "availableColors": {
      "description": "The colors the car is available in. Some colors may increase the price.",
      "type": "array",
      "items": {
        "type": "string"
      },
      "minItems": 1,
      "uniqueItems": true
    },
    "price": {
      "description": "The price tag in € including VAT. The price is optional.",
      "type": "number"
    }
  },
  "required": [
    "manufacturer",
    "model",
    "engine",
    "availableColors"
  ]
}

A valid data document could look like below:

{
  "manufacturer": "Porsche",
  "model": "911 Turbo",
  "engineType": "gasoline",
  "availableColors": [ "black", "blue", "red", "yellow", "white" ],
  "price": 150000
}

I think we can easily see how the documentation helps to understand the data, for example regarding the price property. The Java code to validate the data may look similar to this:

public void processCarData(String carFileName) {
  // readFile() is assumed to be a small helper returning the file content as a String
  var schemaDefinition = Json.decodeValue(readFile("car.schema.json"));
  JsonSchema schema = JsonSchema.of((JsonObject) schemaDefinition);
  var schemaValidator = Validator.create(schema, new JsonSchemaOptions()
      .setDraft(Draft.DRAFT202012)
      .setBaseUri("https://cars.softwareschneiderei.com"));
  var car = Json.decodeValue(readFile(carFileName));

  if (!schemaValidator.validate(car).getValid()) {
    throw new IllegalArgumentException("The format of the car data is invalid.");
  }
  // Data valid, work with it...
}
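
The snippet above matches the API of the Vert.x JSON Schema validator; assuming that library is used, the Gradle dependencies would look roughly like this (adjust the version to the Vert.x release you use):

dependencies {
    implementation 'io.vertx:vertx-core:4.5.1'
    implementation 'io.vertx:vertx-json-schema:4.5.1'
}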

Conclusion

As depicted above, JSON schema fills some important gaps when using JSON as a data exchange format. It offers helpful tools to document the data structure and to validate data against a definition written in JSON itself.

That way we can safely work with the data, document its structure and still maintain the other good properties of JSON like interoperability, human-readability and data efficiency.

Converting a Grails app from WAR deployment in Tomcat to Docker

A few years ago we had real servers with servlet containers installed and managed by administrators. These machines were bare-metal unicorns and had to be kept alive as long as possible (for many years). With the advent of first virtual machines and then container runtimes like Docker the approach and responsibilities for hosting web applications changed.

Our application not only went from Grails framework version 1 to 5 (at the moment, soon to be 6) but also made the above voyage regarding the hosting environment.

While the step from bare metal to virtual machine was negligible from the developer and application side, the migration to docker-based hosting is quite a step. Therefore I would like to depict the biggest changes when moving a Grails application from a WAR deployment in Tomcat to a standalone application in Docker.

The original setting

We had our Grails application running on a server with a Tomcat servlet container for years. On the same server we also had an ElasticSearch instance running and saved documents in a few well-defined directories on the local file system. Our database (Oracle in the past, PostgreSQL for over a year now) was running managed in a data center. We used container features like JNDI for parts of the configuration and the datasources.

Deployment was essentially uploading a new WAR and context.xml file to the Tomcat servlet container using an Ansible playbook (where we restarted the servlet container to ensure not leaving any resource leaks…).

This setup worked quite well for years, only upgrades of the host operating system along with Tomcat and Java Virtual Machine (JVM) updates meant some work.

The target setup

Our customer has been in the process of converting most of the services running on virtual machines to dockerized deployments managed using Portainer. Most of the provisioning is automated and there is almost no snowflaking of the virtual machines hosting the docker runtimes. This eases the operation and administration of the services a lot and makes it much easier to set up new instances of a service and to monitor them.

So it was our task to migrate our setup on the host VM to a docker-based deployment. In the future we basically push a docker image of our application to a docker registry of the customer and update the stack using Portainer.

The journey to a dockerized Grails app

The first thing to do were some changes to build.gradle to build a JAR application instead of a WAR archive. We decided it was easier and more lightweight to run the Grails app standalone with the embedded Tomcat than to deploy to a Tomcat running in a docker container:

  • Remove the gradle war plugin
  • Change providedCompile dependencies to implementation
  • Add a task to prepare the Dockerfile and context to build our bootable application image:
task prepareDocker(type: Copy, dependsOn: assemble) {
    description = 'Copy files from src/main/docker and the application jar to the temporary Docker build directory'
    group = 'Docker'

    from 'src/main/docker'
    from project.bootJar

    into dockerBuildDir
}
  • Add an appropriate Dockerfile to our docker context in src/main/docker:
FROM eclipse-temurin:11-jdk-jammy
MAINTAINER Mihael Koep "mihael.koep@softwareschneiderei.de"

EXPOSE 8080

WORKDIR /app
COPY naomi-boot.jar .

ENV GRAILS_ENV development
# Additional env variables for configuration

CMD java -Dgrails.env=$GRAILS_ENV -jar /app/naomi-boot.jar
  • Remove JNDI usages and convert configuration from context.xml and JNDI to environment variables
  • Add a docker-compose.yml for the stack definition (here outlined for development with a database as part of the stack)
version: '3.7'
services:
  my-app:
    container_name: my-app-application
    image: ${IMAGE_TAG}
    environment:
      - GRAILS_ENV=development
      - DB_URL=jdbc:postgresql://my-app-db:5432/my-app
    volumes:
      - my-appdata:/app/data_home
    ports:
      - 8080:8080
    depends_on:
      - my-app-db
      - my-app-elastic
  my-app-db:
    container_name: my-app-database-stack
    image: postgres:13
    environment:
      - POSTGRES_USER=my-app
      - POSTGRES_PASSWORD=my-db-password
      - POSTGRES_DB=my-app
    ports:
      - 5432:5432
  my-app-elastic:
    container_name: my-app-elastic-stack
    image: docker.elastic.co/elasticsearch/elasticsearch:7.8.0
    environment:
      - node.name=es01
      - cluster.name=my-app-es
      - discovery.type=single-node
      - bootstrap.memory_lock=true
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    volumes:
      - searchdata:/usr/share/elasticsearch/data
    ports:
      - 9200:9200
      - 9300:9300
volumes:
  searchdata:
  my-appdata:
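
With all the pieces in place, a local build and test run could then look roughly like this (assuming dockerBuildDir resolves to build/docker and using a made-up image tag):

./gradlew prepareDocker
docker build -t registry.example.com/my-app:1.2.3 build/docker
IMAGE_TAG=registry.example.com/my-app:1.2.3 docker compose up -d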

Conclusion

The journey from classical deployments to a dockerized environment using a docker stack is not that long for a Grails application (or most other Java web applications). It makes setting up a new instance or migrating it to another host trivial, as most configuration lives in version-controlled files and there is next to no snowflaking. The customer has an easier time running the web application because the only requirement is a container runtime and everything else is explicitly formulated in the stack definition.

In addition, the developers are free to choose the JVM and other details within the containers without having to cross-check with the customer’s administrators whether they are supported by their organization.

Exploring Tango Admin Devices

We are using the open-source control system framework TANGO in several projects where coordinated control of multiple hardware systems is needed.

What is TANGO good for?

TANGO provides uniform, distributed access and control of heterogeneous hardware devices. It is object-oriented in nature and usually one hardware device is represented by one (or more) software devices.

The device drivers can be written in either C++, Java or Python and client libraries exist for all of these languages. Using a middleware adapter like TangoGQL, any language can access the devices.

All that makes TANGO useful for building SCADA systems ranging from a handful of controlled devices to several hundred you want to supervise and control.

Lesser known features of TANGO

All of the above is well known in the TANGO and SCADA community and quite straightforward. What some people may not know is that TANGO automatically provides an Admin-device for each TANGO server (an executable running one or more TANGO devices).

These admin devices have an address of the form dserver/<server_name>/<instance_name> and provide numerous commands for controlling and querying the TANGO device server instance:

You can for example introspect the device server to find available device classes, device instances and needed device properties (think of them as configuration settings).

In addition to introspection you can also control some aspects of the TANGO server like polling and logging. The Admin-device also allows restarting individual devices or even the whole server instance. This can be very useful to apply configuration changes remotely without shell access or something similar to the remote machine.
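
As a quick illustration, exploring and restarting devices via the admin device could look like this with PyTango (server and device names are made up; the command names come from the standard DServer class, so double-check them against your TANGO version):

import tango

# the automatically created admin device of the device server "MyServer/instance01"
admin = tango.DeviceProxy("dserver/MyServer/instance01")

# list the devices and classes served by this device server instance
print(admin.command_inout("QueryDevice"))
print(admin.command_inout("QueryClass"))

# restart a single device or the whole server instance
admin.command_inout("DevRestart", "my/example/device01")
admin.command_inout("RestartServer")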

Wrapping it up

Admin-devices automatically exist and run for each TANGO device server. Using them allows clients to explore what devices are available, what they offer and how they can be configured. They also allow some aspects to be changed remotely at runtime.

We use these features to provide a rich web-based UI for managing the control system in a convenient way instead of relying on the basic tools (like Jive and Astor) that TANGO offers out-of-the-box.

Trying ChatGPT as a professional software developer

We are a company of software developers that does not sell a product but provides a wide range of services. Our primary service is developing individual software or maintaining existing software for our customers.

Most of us are polyglot developers each sufficiently fluent in a couple of languages. While I see this as a strength it also means that I do not know the idiomatic solution for a task in all ecosystems in use in our projects by heart.

A few days ago I had a remote-pair-programming session with one of our customers and we wondered how to idiomatically trim certain characters from a string in JavaScript. My customer came up with the idea of asking ChatGPT for a solution while I would have asked Google or stackoverflow.

Being curious, we took the ChatGPT route and it delivered a correct solution after I clarified what I meant by “trim”:

function trimCharacterFromStartAndEnd(str, char) {
  // Create a regular expression to match the specified character at the beginning and end of the string
  const regex = new RegExp(`^${char}+|${char}+$`, 'g');
  
  // Use the replace() function to remove the character from the start and end of the string
  const trimmedStr = str.replace(regex, '');
  
  return trimmedStr;
}

// Example usage
const originalString = '###Hello World!###';
const trimmedString = trimCharacterFromStartAndEnd(originalString, '#');
console.log(trimmedString); // Output: Hello World!

But from a professional perspective the code was not simple, concise and idiomatic. I intuitively knew there must be a better way.

So I did what ChatGPT does best: I chatted with it to improve the solution and arrived at one I was satisfied with:

function trimCharacterFromStartAndEnd(str, char) {
  return str.replace(/^${char}+|${char}+$/g, '');
}

// Example usage
const originalString = '###Hello World!###';
const trimmedString = trimCharacterFromStartAndEnd(originalString, '#');
console.log(trimmedString); // Output: Hello World!

However, you possibly need to handle regex special characters like '.', '*' etc. if they can be part of the characters you want to trim.

Some of the intermediate steps also have their uses depending on the needed flexibility. See the full conversation at trim character from string chat.

Similarly, stackoverflow provides some comprehensive answers you can adapt to your specific situation.

Evaluation

Using ChatGPT can actually provide you with useful results. To make the most of it, you have to be able to judge the solution provided by the AI and try to push it in the desired direction.

After my experiment our students got the unofficial advice that their solutions should not be worse than what ChatGPT delivers. 😀

Arriving at a good solution was not faster or easier than the traditional developers’ approach using Google and/or stackoverflow. Nevertheless it was more interactive, more fun and, most importantly, it worked.

It was a bit disappointing to lose context at some points in the conversation, with the g-flag for example. Also, the allegedly “shortest” solution is longer than a variant using a regex literal, so strictly speaking ChatGPT’s answer is wrong…

I will not radically change my style of work and jump on the AI-hype-train but I plan to continue experimenting with it every now and then.

ChatGPT and friends certainly have some potential depending on the use case but still require a competent human to judge and check the results.

Avoid fragmenting your configuration

Nowadays configuration often is done using environment (aka ENV) variables. They work great using docker/containers, in development and production, on all platforms and using all languages. In short I think environment variables are great for configuration of many aspects of an application.

However, I encountered a pattern in several different applications that I really dislike: several fragmented ENV variables for one configurable aspect of the application.

Let us have a look at two examples to see what I mean, then I will try to explain where it could come from and why I think it is bad practice. Finally I will show a better alternative – at least in my opinion.

First real world example

In one JavaScript app a WebSocket URL was made configurable using 4 (!) ENV variables like this:

WS_PREFIX || "wss://";
WS_HOST || "hostname";
WS_PORT || "";
WS_PATH || "/ws";

function ConnectionString(prefix, host, port, path) {
  return {
    attrib: {
      prefix, 
      host,
      port,
      path,
    },
    string: prefix + host + port + path,
  };
}

We immediately see that the author wrote a function to deal with the complex configuration in the rest of the application. Not only do the devops team or administrators need to supply many ENV variables, they also have to supply them in a peculiar way:

The port needs to be specified as :8888, with a leading colon (or the host needs a trailing colon…), which is more than unexpected. The alternative would be a better and more sophisticated implementation of ConnectionString…

Another real example

In the following example there are again three ENV variables dealing with hosts, URLs and websockets. This example feels quite convoluted, is hard to understand and definitely needs a refactoring.

TANGOGQL_SOCKET=ws://${TANGO_HOST}:5004/socket

const defaultHost = window.TANGOGQL_HOST ?? "localhost:5004";
const defaultSocketUrl = window.TANGOGQL_SOCKET ?? `ws://${defaultHost}/socket`;

// dealing with config peculiarities somewhere else
const socketUrl = React.useMemo(() =>
        config.host.replace(/.*:\/\//, "ws://") + "/socket"
    , [config.host]);

Discussion

The examples show clearly that something simple like the configuration of a URL can lead to complicated and hard-to-use solutions. Most likely the authors tried not to repeat themselves and factored the URLs into the smallest sensible components. While this may sound like a good idea, it puts a burden on both the developers and the devops team configuring the application.

In my opinion it would be much simpler and more usable for both parties to have complete URLs for the different use cases. Of course this could mean repeating protocols, hostnames and ports if they are the same in the different situations. But just having one or two ENV variables like

WS_URL=wss://myhost:8080/ws
HOST_URL=https://myhost:8080

would be straightforward to use in code and to configure in the runtime environment. At the same time the chance for errors and the complexity of the configuration are reduced.

Even though certain parts of the URLs are duplicated in the configuration I highly prefer this approach over the presented real world solutions.

Avoid special values of the result type for error indication

As many of you may know, we work with a variety of programming languages and ecosystems with very different code bases. Sometimes it may be a modern green-field project using state-of-the-art frameworks. At other times it may be a dreaded legacy project initially written many years ago (either by us or by someone we do not even know) using ancient languages and frameworks like really old Java (pre JDK 7) or C++ (pre C++11), for example.

These old projects could not use features of modern incarnations of these languages/compilers/environments – and that is fine with me. We usually gradually modernize such systems and try to update the places where we come along to fix some issues or implement new features.

Over the years I have come across a pattern that I think is dangerous and easily leads to bugs and to code that is harder to maintain:

Using special values of a function’s result type to indicate errors

The examples are so numerous and not confined to a certain programming environment that they urged me to write this article. Maybe some developers using this practice will change their mind and add a few tools to their box to write safer and more expressive code.

A simple example

Let us imagine a function that returns a simple integer number like this:

/**
 * Here we talk to a hardware sensor. If everything works, we should
 * get a value between -50 °C and +50 °C.
 * If something goes wrong, we return -9999.
 */
int readAmbientTemperature();

Given the documentation, clients can surely use this kind of function, and if every use site interprets the result correctly, nothing will ever go wrong. The problem is that we need a lot of domain knowledge and that we have to check for the special value.

If we use this pattern for other values where the value range is not as clearly bounded, we may either run into problems or invent other “impossible values” for each use case.

If we forget to check for the special value, the users may see it and be confused – or even worse, it could be used in calculations.

The problem gets even worse with more flexible types like floating point numbers or strings, where it is harder to compare values and to separate valid results from failure indicators.

(Image: a classic error message that mixes technical code and error message in a confusing, albeit funny sentence. Source: Interface Hall Of Shame)

Of course, there are slightly better alternatives like negative numbers for a function with a positive-only domain, or MAX_INT, NaN and the like provided by most languages.

I do not find any of the above satisfying and good enough for production use.

Better alternatives

Many may argue that their environment lacks features to implement distinct error indicators and values, but I tend to disagree and would like to name a few widely used alternatives for very different languages and environments:

  • Return codes and out-parameters for C-like languages like in the unix and win32 APIs (despite all their other flaws… 😀 )
  • Exceptions for Java, Python, .NET and maybe in some cases even C++ with sufficiently specific type and details to differentiate different failures
  • Optional return types when the failures do not need special handling and absence of a value is enough
  • HTTP status code (e.g. 400 or 404) and a JSON object containing reason and details instead of a 2xx status with the value
  • A result struct or object containing execution status and either a value on success or error details on failure (see the sketch below)
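
To make the Optional and result-object options more concrete, here is a minimal Java sketch (all names are made up for illustration) of a sensor reading returned either as an OptionalDouble or as a small result object carrying a value on success and error details on failure:

import java.util.OptionalDouble;

public class TemperatureSensor {

    // Option 1: an optional return type - absence of a value is enough for the caller
    public OptionalDouble readAmbientTemperature() {
        if (!sensorAvailable()) {
            return OptionalDouble.empty();
        }
        return OptionalDouble.of(readRawValue());
    }

    // Option 2: a small result object carrying either the value or error details
    public static class TemperatureReading {
        private final Double celsius;      // set on success
        private final String errorDetail;  // set on failure

        private TemperatureReading(Double celsius, String errorDetail) {
            this.celsius = celsius;
            this.errorDetail = errorDetail;
        }

        public static TemperatureReading success(double celsius) {
            return new TemperatureReading(celsius, null);
        }

        public static TemperatureReading failure(String errorDetail) {
            return new TemperatureReading(null, errorDetail);
        }

        public boolean isSuccess() { return celsius != null; }
        public double celsius() { return celsius; }
        public String errorDetail() { return errorDetail; }
    }

    public TemperatureReading readAmbientTemperatureDetailed() {
        if (!sensorAvailable()) {
            return TemperatureReading.failure("sensor not reachable");
        }
        return TemperatureReading.success(readRawValue());
    }

    private boolean sensorAvailable() { return true; }  // hardware access stubbed out for the sketch
    private double readRawValue() { return 21.5; }      // hardware access stubbed out for the sketch
}

Call sites now have to deal with the failure case explicitly instead of remembering a magic value like -9999.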

Conclusion

I am aware that I probably spent way too many words on such a basic topic, but I think the number of times I have encountered this style – especially in code of autodidacts, but also of professionals – justifies such an article. I hope I provided some inspiration for those who do not know better or for those who want to help others improve.