AI is super-good at guessing, but also totally clueless

AI is still somewhere in its hype phase, maybe towards the end of it. Most of us have used generative AI or use it more or less regularly.

I am experimenting with it every now and then and sometimes even use the output professionally. On the other hand I am not hyped at all. I have mostly mixed feelings (and the bad feelings are not because I fear losing my job…). Let me share my thoughts:

The situation a few years/months ago

Generative AI, or more specifically ChatGPT, impressed many people but failed at really simple tasks like

  • Simple trick questions like “Tom’s father has three children. The first one is called Mark and the second Andrea. What is the name of the third child?”
  • Counting certain letters in a word

Another problem was massive hallucinations, like those shown in Kevlin Henney’s KotlinConf talk or the completely made-up summaries of a German children’s book that I got myself.

Our current state of generative AI

Granted, nowadays many of these problems have been mitigated and the AI has become more useful. On the other hand, new problems are found quite frequently and then get worked around by the engineers. Here are some examples:

  • I asked ChatGPT to list all German cities with more than 200k inhabitants. It put out nice tables scraped from Wikipedia and other sources, but with a catch: they were all incomplete and the count was clearly wrong. Even after multiple iterations I did not get a correct result. A quick look at the Wikipedia page ChatGPT used as a source showed a complete picture.
  • If you ask about socially explosive topics like Islamic terrorism, crime or controversial people like Elon Musk and Donald Trump, you may get varying, questionable responses.

More often than not I feel disappointed when using generative AI. I have zero trust in the results and end up checking and judging the output myself. In questions about code and APIs I usually have enough knowledge to judge the output or at least take it as a base for further development.

In Heinz Kabutz’s really good online course “IntelliJ Wizardry with AI Assistant Live” we also explored the possibilities, limits and integration of the JetBrains AI Assistant. While it may be useful in some situations and integrates nicely into the IDE, we were not impressed by its power.

The results of generating test cases or refactoring existing code vary from good to harmful. Sometimes it can find bugs for you, sometimes it cannot…

Another personal critical view on AI

After all the ups and downs, all the development and progress with AI, and many thoughts and reflections about it, I have discovered something that really bothers me about AI:

AI takes away most of the transparency, determinism and control that we as software developers are used to and sometimes worked hard for.

As developers we strive for understanding what is really happening. Non-determinism is one of our enemies. Obfuscated code, unclear definitions, undefined behaviour – these and many other things make me/us feel uncomfortable.

And somehow, for me AI feels the same:

I change a prompt slightly, and sometimes the result does not change at all while at other times it changes almost completely. Sometimes the results are very helpful, on other occasions they are total crap. In the past the same prompts were maybe useless; now they put out useful information or code.

This is where my bad feelings about AI come from. Someone at some company trains the AI, engineers rules, defines the training data and so on. All of this has an enormous impact on the behaviour of the AI and its results, and all of it stays outside of your influence and control. One day it works as intended, on another it does not anymore.

Conclusion

I do not know enough about generative AI and the mathematics, science and engineering behind it to accurately judge or predict its possibilities and boundaries for the years to come.

Maybe we will find ways to regain the transparency, to “debug” our models and prompts, to be able to reason about the output and to make generative AI reliable and predictable.

Maybe generative AI will collapse under the piles of crap it uses for training because we do not have powerful enough means of telling it how to separate trustworthy and truthful information from the rest.

Maybe we will use it as an assistant in many areas, like coding or evaluating X-ray images to sort out the inconspicuous ones.

What I really doubt at this point is that AI will replace professionals, regardless of the field. It may make them more productive or enable us to build bigger/better/smarter systems.

Right now, generative AI sometimes proves useful but often is absolutely clueless.

Trying ChatGPT as a professional software developer

We are a company of software developers that does not sell a product but provides a wide range of services. Our primary service is developing custom software or maintaining existing software for our customers.

Most of us are polyglot developers, each sufficiently fluent in a couple of languages. While I see this as a strength, it also means that I do not know the idiomatic solution for a task in every ecosystem used in our projects by heart.

A few days ago I had a remote pair-programming session with one of our customers and we wondered how to idiomatically trim certain characters from a string in JavaScript. My customer came up with the idea of asking ChatGPT for a solution, while I would have asked Google or Stack Overflow.

Being curious, we took the ChatGPT route and it delivered a correct solution after I clarified what I meant by “trim”:

function trimCharacterFromStartAndEnd(str, char) {
  // Create a regular expression to match the specified character at the beginning and end of the string
  const regex = new RegExp(`^${char}+|${char}+$`, 'g');
  
  // Use the replace() function to remove the character from the start and end of the string
  const trimmedStr = str.replace(regex, '');
  
  return trimmedStr;
}

// Example usage
const originalString = '###Hello World!###';
const trimmedString = trimCharacterFromStartAndEnd(originalString, '#');
console.log(trimmedString); // Output: Hello World!

But from a professional perspective the code was not simple, concise and idiomatic. I intuitively knew there must be a better way.

So I did what ChatGPT does best: I chatted with it to improve the solution until I arrived at one I was satisfied with:

function trimCharacterFromStartAndEnd(str, char) {
  // Template placeholders are not substituted inside regex literals,
  // so the expression has to be built with the RegExp constructor
  return str.replace(new RegExp(`^${char}+|${char}+$`, 'g'), '');
}

// Example usage
const originalString = '###Hello World!###';
const trimmedString = trimCharacterFromStartAndEnd(originalString, '#');
console.log(trimmedString); // Output: Hello World!

However, you possibly need to handle regex special characters like '.', '*' etc. if they can be part of the characters you want to trim.
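For illustration, here is a minimal sketch of such an escaping step. The escapeRegExp helper is my own addition, not something that came out of the ChatGPT conversation:

function escapeRegExp(str) {
  // Escape every character that has a special meaning in regular expressions
  return str.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

function trimCharacterFromStartAndEnd(str, char) {
  const escaped = escapeRegExp(char);
  return str.replace(new RegExp(`^${escaped}+|${escaped}+$`, 'g'), '');
}

// Example usage with a regex special character
console.log(trimCharacterFromStartAndEnd('***Hello World!***', '*')); // Output: Hello World!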

Some of the intermediate steps also have their uses depending on the needed flexibility. See the full conversation at trim character from string chat.

Similarly, Stack Overflow provides some comprehensive answers you can adapt to your specific situation.

Evaluation

Using ChatGPT can actually provide you with useful results. To make the most out of it, you have to be able to judge the solution provided by the AI and try to push it in the desired direction.

After my experiment our students got the unofficial advice that their solutions should not be worse than what ChatGPT delivers. 😀

Arriving at a good solution was not faster or easier than the traditional developer’s approach using Google and/or Stack Overflow. Nevertheless, it was more interactive, more fun and, most importantly, it worked.

It was a bit disappointing that the conversation lost context at some points, forgetting about the g-flag for example. Also, the “shortest” solution it offered is longer than a variant using a regex literal, so strictly speaking ChatGPT’s answer is wrong…
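To illustrate why the g-flag matters here: without it, replace() only removes the first match of the alternation, so the trailing run of characters survives. A small demonstration of my own (not from the original conversation):

const input = '###Hello World!###';

// Without the g-flag only the first match (the leading #-run) is replaced
console.log(input.replace(/^#+|#+$/, ''));  // Output: Hello World!###

// With the g-flag both the leading and the trailing run are removed
console.log(input.replace(/^#+|#+$/g, '')); // Output: Hello World!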

I will not radically change my style of work and jump on the AI-hype-train but I plan to continue experimenting with it every now and then.

ChatGPT and friends certainly have some potential depending on the use case but still require a competent human to judge and check the results.

The day the machines took gaming away

Just a few years ago, bots in computer games were a liability for the team. Now they are the preferred teammate. What happened?

August 5th, 2018 was a noteworthy day in the history of mankind. It was a Sunday and had Europe aching in unusual heat and drought. But more importantly, it was the day when the machines gently asserted their dominance in the field of gaming. It was the day when our most skilled players lost a Dota 2 tournament against a bunch of self-taught bots.

“Bot” used to be an insult

How did we end up in this situation? Let’s look back at what “bot” used to mean in gaming. Twenty years ago, we were thrilled about games like Starcraft where you control plenty of aggressive, but otherwise dumb units in a battle against another player that also controls plenty of those units. The resulting brawls were bloody, chaotic and ultimately overwhelming with their number of necessary tasks (so-called micromanagement) and the amount of information that needed to be processed at once to react to the opponent. In a human versus human (or pvp for player versus player) game, those battles were usually constrained to a certain area and executed with a certain laissez-faire attitude. Only the best players could stage two or more geographically independent attacks and control every unit to their full potential. We admired those players like astronauts or rockstars.

If you could not play against another human, you would start a game against a bot. A bot usually had four things working to its advantage and a lot of things stacked against it. In its favor, it had minimal delay in its reactions, ultimate precision in its commands, full information about everything on the gamefield, and, more often than not, extra game resources and other invisible cheats, because it didn’t stand a chance against even moderately skilled humans otherwise. Often, its game was defined by a fixed algorithm that couldn’t adapt to human strategy and situational specifics. A very simple war of attrition was enough to defeat it if its resource supply wasn’t unlimited. It didn’t learn from its experience and didn’t cooperate, neither with other bots nor with allied humans. These early bots relied on numbers and reaction speed to overwhelm their human counterparts. They played against our natural biological restrictions because the programmers who taught them knew these restrictions very well.

Barely tolerated fill-ins

Those bots were so dumb and one-dimensional that playing with them against other opponents was even more of a challenge, because you always had to protect them from running into the most obvious traps. They weren’t allies, they were a liability that dictated a certain game style. Everybody preferred human allies, even if they made mistakes and reacted more slowly.

The turning point

Then, a magical thing happened. An artificial intelligence taught itself the game of Go, a rather simple game with only two players taking turns on a rather static gamefield. This AI played Go against itself so excessively that it mastered the game on a level that even experts could not grasp easily. In the first half of 2017, the machines took the game of Go out of our hands and continued to play against themselves. It got so bad that an AI named AlphaGo Zero taught itself Go from scratch in three days and outclassed the original bot that had outclassed mankind. And it seemed to play more like a human than the other bots.

So, within just a few years, we went from dumb bots that were inferior stand-ins for real humans to overly powerful bots that even make it seem as if humans were playing.

The present days

It should be no surprise to you anymore that on that first Sunday of August 2018, a group of bots beat our best players in Dota 2. There are a few noteworthy differences to the victories in Go:

  • Dota 2 is a game where five players battle against five other players, not one versus one. It wasn’t one bot playing five characters, it was five bots cooperating with only in-game communication against humans cooperating with a speech side-channel.
  • Go is an open map game. Bot opponents see every detail of the gamefield and have the same level of information. In Dota 2, your line of sight is actually pretty limited. The bots did not see the whole gamefield and needed to reconnoiter just like their human opponents.
  • In Go, the gamefield is static if nobody changes it. In Dota 2, there are lots of units moving independently on the gamefield all the time. This fluidity of the scenario requires a lot of intuition from human players and bots alike.
  • The rules of Go are simple, but the game turns out to be complex. The rules of Dota 2 are very complex, but the game refuses to be simple, because the possibilities to combine all the special cases are endless.
  • Go is mostly about logic, while Dota 2 has an added timing aspect. Your perfect next move is only effective in a certain time window, after that, you should reconsider it.

Just a year after the machines took logic games from us (go and read about AlphaZero if you want to be depressed by how fast they evolve), they have their foot in the real-time strategy sector, too. Within a few years, there will probably be no computer game left without a machine player at the top of the ladder. Turns out the machines are better at leisure activities, too.

The future?

But there is a strange side-note to the story. The Go players reported that at first, the bots played like aliens. Later versions (the purely self-taught ones) had a more human-like style. In Dota 2, if you mix bots and humans in one team, the humans actually prefer cooperating with the bots. It seems that bots could be the preferred opponents and teammates of the future. And then, it’s no longer a game of humans with a few bots as fill-ins, but a game between machines, slowed down so that humans can participate and do their part – as tolerated inferior fill-ins.