AI is super-good at guessing, but also totally clueless

AI is still somewhere in its hype phase, maybe towards the end of it. Most of us have used generative AI or use it more or less regularly.

I am experimenting with it every now and then and sometimes even use the output professionally. On the other hand, I am not hyped at all. My feelings are mostly mixed (and the bad feelings are not because I fear losing my job…). Let me share my thoughts:

The situation a few years/months ago

Generative AI (or more specifically ChatGPT) impressed many people but failed at really simple tasks like

  • Simple trick questions like “Tom’s father has three children. The first one is called Mark and the second Andrea. What is the name of the third child?”
  • Counting certain letters in a word

Another problem was massive hallucinations, as shown in Kevlin Henney’s KotlinConf talk, and the completely made-up summaries of a children’s book (German) that I got myself.

Our current state of generative AI

Granted, nowadays many of these problems have been mitigated and AI has become more useful. On the other hand, new problems are found quite frequently and then worked around by the engineers. Here are some examples:

  • I asked ChatGPT to list all German cities with more than 200k inhabitants. It put out nice tables scraped from Wikipedia and other sources, but with a catch: they were all incomplete and the count was clearly wrong. Even after multiple iterations I did not get a correct result. A quick look at the Wikipedia page ChatGPT used as a source showed the complete picture.
  • If you ask about socially explosive topics like Islamic terrorism, crime, or controversial people like Elon Musk and Donald Trump, you may get varying, questionable responses.

More often than not I feel disappointed when using generative AI. I have zero trust in the results and end up checking and judging the output myself. For questions about code and APIs I usually have enough knowledge to judge the output or at least take it as a base for further development.

In the really good online course “IntelliJ Wizardry with AI Assistant Live” by Heinz Kabutz we also explored the possibilities, limits and integration of the JetBrains AI Assistant. While it may be useful in some situations and has great integration into the IDE, we were not impressed by its power.

Generating test cases or refactoring existing code varies from good to harmful. Sometimes it can find bugs for you, sometimes it cannot…

Another personal critical view on AI

After all the ups and downs, the development and progress of AI, and many thoughts and reflections about it, I have discovered something that really bothers me about AI:

AI takes away most of the transparency, determinism and control that we as software developers are used to and sometimes worked hard for.

As developers we strive to understand what is really happening. Non-determinism is one of our enemies. Obfuscated code, unclear definitions, undefined behaviour – these and many other things make us feel uncomfortable.

And somehow, for me AI feels the same:

I change a prompt slightly, and sometimes the result does not change at all, while at other times it changes almost completely. Sometimes the results are very helpful, on other occasions they are total crap. In the past a prompt may have been useless; now the same prompt puts out useful information or code.

This is where my bad feelings about AI come from. Someone at some company trains the AI, engineers rules, defines the training data, etc. Everything has an enormous impact on the behaviour of the AI and the results. And everything stays outside of your influence and control. One day it works as intended, on another it does not anymore.

Conclusion

I do not know enough about generative AI and the mathematics, science and engineering behind it to accurately judge or predict the possibilities and boundaries for the years to come.

Maybe we will find ways to regain the transparency, to “debug” our models and prompts, to be able to reason about the output and to make generative AI reliable and predictable.

Maybe generative AI will collapse under the piles of crap it uses for training, because we do not have powerful enough means of telling it how to separate trustworthy, truthful information from the rest.

Maybe we will use it as an assistant in many areas, like coding or evaluating X-ray images to sort out the inconspicuous ones.

What I really doubt at this point is that AI will replace professionals, regardless of the field. It may make them more productive or enable us to build bigger/better/smarter systems.

Right now, generative AI sometimes proves useful but often is absolutely clueless.

Trying ChatGPT as a professional software developer

We are a company of software developers that does not sell a product but provides a wide range of services. Our primary service is developing individual software or maintaining existing software for our customers.

Most of us are polyglot developers, each sufficiently fluent in a couple of languages. While I see this as a strength, it also means that I do not know the idiomatic solution for a task in every ecosystem used in our projects by heart.

A few days ago I had a remote pair-programming session with one of our customers and we wondered how to idiomatically trim certain characters from a string in JavaScript. My customer came up with the idea of asking ChatGPT for a solution, while I would have asked Google or Stack Overflow.

Being curious, we took the ChatGPT route, and it delivered a correct solution after I clarified what I meant by “trim”:

function trimCharacterFromStartAndEnd(str, char) {
  // Create a regular expression to match the specified character at the beginning and end of the string
  const regex = new RegExp(`^${char}+|${char}+$`, 'g');
  
  // Use the replace() function to remove the character from the start and end of the string
  const trimmedStr = str.replace(regex, '');
  
  return trimmedStr;
}

// Example usage
const originalString = '###Hello World!###';
const trimmedString = trimCharacterFromStartAndEnd(originalString, '#');
console.log(trimmedString); // Output: Hello World!

But from a professional perspective the code was not simple, concise or idiomatic. I intuitively knew there must be a better way.

So I did what ChatGPT does best: I chatted with it to improve the solution until I arrived at one I was satisfied with:

function trimCharacterFromStartAndEnd(str, char) {
  // A regex literal cannot interpolate ${char}, so the RegExp constructor is needed
  return str.replace(new RegExp(`^${char}+|${char}+$`, 'g'), '');
}

// Example usage
const originalString = '###Hello World!###';
const trimmedString = trimCharacterFromStartAndEnd(originalString, '#');
console.log(trimmedString); // Output: Hello World!

However, you possibly need to handle regex special characters like '.', '*' etc. if they can be part of your characters to trim.
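One way to handle this is to escape the character before building the pattern. A minimal sketch, assuming hypothetical helper names (escapeRegExp and trimCharacterSafe did not appear in the original conversation):

```javascript
// Sketch: escape regex metacharacters before building the trim pattern.
// escapeRegExp is a hypothetical helper, not a built-in.
function escapeRegExp(s) {
  // Prefix every regex metacharacter with a backslash
  return s.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function trimCharacterSafe(str, char) {
  const safe = escapeRegExp(char);
  return str.replace(new RegExp(`^${safe}+|${safe}+$`, 'g'), '');
}

console.log(trimCharacterSafe('***Hello World!***', '*')); // Output: Hello World!
```

This keeps the concise one-liner shape while making characters like '*' or '.' safe to pass in.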

Some of the intermediate steps also have their uses, depending on the flexibility you need. See the full conversation at trim character from string chat.

Similarly, Stack Overflow provides some comprehensive answers you can adapt to your specific situation.

Evaluation

Using ChatGPT can actually give you useful results. To make the most of it, you have to be able to judge the solution provided by the AI and try to push it in the desired direction.

After my experiment our students got the unofficial advice that their solutions should not be worse than what ChatGPT delivers. 😀

Arriving at a good solution was not faster or easier than the traditional developer’s approach using Google and/or Stack Overflow. Nevertheless, it was more interactive, more fun and, most importantly, it worked.

It was a bit disappointing to lose context at some points in the conversation, with the g-flag for example. Also, the “shortest” solution is longer than the variant with the regex literal, so strictly speaking ChatGPT’s answer is wrong…
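For a fixed trim character the regex-literal variant is indeed shorter. A sketch for the '#' case (trimHashes is a hypothetical name; a regex literal cannot interpolate a variable, which is why the generic function needs the RegExp constructor):

```javascript
// For a fixed character like '#', a regex literal is the shortest form.
// It cannot interpolate a variable, so the generic case still needs new RegExp.
const trimHashes = (str) => str.replace(/^#+|#+$/g, '');

console.log(trimHashes('###Hello World!###')); // Output: Hello World!
```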

I will not radically change my style of work and jump on the AI hype train, but I plan to continue experimenting with it every now and then.

ChatGPT and friends certainly have some potential, depending on the use case, but they still require a competent human to judge and check the results.