AI is super-good at guessing, but also totally clueless

AI is still somewhere in its hype phase, maybe towards the end of it. Most of us have used generative AI or use it more or less regularly.

I experiment with it every now and then and sometimes even use the output professionally. On the other hand, I am not hyped at all. I have mostly mixed feelings (and the bad feelings are not because I fear losing my job…). Let me share my thoughts:

The situation a few years/months ago

Generative AI, or more specifically ChatGPT, impressed many people but failed at really simple tasks like

  • Simple trick questions like “Tom’s father has three children. The first one is called Mark and the second Andrea. What is the name of the third child?”
  • Counting certain letters in a word

Another problem was massive hallucinations, like those shown in Kevlin Henney’s Kotlin Conf talk, and the completely made-up summaries of a (German) children’s book that I got myself.

Our current state of generative AI

Granted, many of these problems have since been mitigated and the AI has become more useful. On the other hand, new problems are found quite frequently and then worked around by the engineers. Here are some examples:

  • I asked ChatGPT to list all German cities with more than 200k inhabitants. It put out nice tables scraped from Wikipedia and other sources, but with a catch: they were all incomplete and the count was clearly wrong. Even after multiple iterations I did not get a correct result. A quick look at the Wikipedia page ChatGPT used as a source showed the complete picture.
  • If you ask about socially explosive topics like Islamic terrorism, crime and controversial people like Elon Musk and Donald Trump, you may get varying, questionable responses.

More often than not I feel disappointed when using generative AI. I have zero trust in the results and end up checking and judging the output myself. For questions about code and APIs I usually have enough knowledge to judge the output or at least take it as a base for further development.

In Heinz Kabutz’s really good online course “IntelliJ Wizardry with AI Assistant Live” we also explored the possibilities, limits and integration of the JetBrains AI Assistant. While it may be useful in some situations and integrates nicely into the IDE, we were not impressed by its power.

Generating test cases or refactoring existing code varies from good to harmful. Sometimes it can find bugs for you, sometimes it cannot…
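To make this concrete, here is a small, entirely hypothetical sketch, hand-written for this post rather than actual assistant output: a buggy function plus two pytest-style tests of the kind an assistant might generate for it. The names and numbers are invented for illustration. The first test merely asserts the current, wrong behaviour and thereby cements the bug; the second is derived from the stated requirement and actually catches it.

```python
# Hypothetical example, hand-written for this post: a buggy function and two
# pytest-style tests of the kind an AI assistant might generate for it.

def free_shipping(order_total_cents: int) -> bool:
    """Intended: orders of 50.00 EUR or more ship for free.
    Bug: the boundary value itself is excluded."""
    return order_total_cents > 5000      # bug: should be >= 5000


# "Harmful" generated test: derived from the current behaviour, so it passes,
# locks the bug in and makes the boundary case look covered.
def test_free_shipping_locks_in_current_behaviour():
    assert free_shipping(5000) is False


# "Good" generated test: derived from the requirement, it fails against the
# buggy implementation and therefore actually finds the bug.
def test_free_shipping_matches_requirement():
    assert free_shipping(5000) is True
```

Whether you get the first kind of test or the second seems to depend a lot on what context the assistant is given, and you only notice the difference if you review the tests as carefully as the code.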

Another personal critical view on AI

After all the ups and downs, the development and progress of AI, and many thoughts and reflections about it, I have discovered something that really bothers me about AI:

AI takes away most of the transparency, determinism and control that we as software developers are used to and have sometimes worked hard for.

As developers we strive to understand what is really happening. Non-determinism is one of our enemies. Obfuscated code, unclear definitions, undefined behaviour: these and many other things make us feel uncomfortable.

And somehow, for me AI feels the same:

I change a prompt slightly, and sometimes the result does not change at all, while at other times it changes almost completely. Sometimes the results are very helpful; on other occasions they are total crap. In the past the same prompts were perhaps useless; now they produce useful information or code.
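My working assumption here (an illustration of the general mechanism, not a claim about how any particular product is configured) is that part of this comes from sampling: a language model picks each next token from a probability distribution, and a temperature parameter controls how concentrated that distribution is. A minimal, self-contained toy sketch of temperature sampling, with invented scores:

```python
import math
import random

def sample_next_token(logits: dict[str, float], temperature: float) -> str:
    """Toy next-token sampler: softmax over the scores, scaled by temperature."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_s = max(scaled.values())  # subtract the maximum for numerical stability
    weights = {tok: math.exp(s - max_s) for tok, s in scaled.items()}
    total = sum(weights.values())
    return random.choices(
        population=list(weights.keys()),
        weights=[w / total for w in weights.values()],
    )[0]

# Invented scores for the token following some fixed prompt.
logits = {"correct": 2.0, "wrong": 1.0, "unclear": 0.0}

# At a low temperature the highest-scoring token wins almost every time;
# at a higher temperature the very same scores regularly yield different picks.
for temperature in (0.2, 1.5):
    picks = [sample_next_token(logits, temperature) for _ in range(10)]
    print(temperature, picks)
```

Run it a few times: the low-temperature output barely changes, while the high-temperature output does, which is roughly the flavour of non-determinism described above.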

This is where my bad feelings about AI come from. Someone at some company trains the AI, engineers its rules, defines the training data, and so on. All of this has an enormous impact on the behaviour of the AI and on its results, and all of it stays outside of your influence and control. One day it works as intended, on another it no longer does.

Conclusion

I do not know enough about generative AI and the mathematics, science and engineering behind it to accurately judge or predict the possibilities and boundaries for the year to come.

Maybe we will find ways to regain the transparency, to “debug” our models and prompts, to be able to reason about the output and to make generative AI reliable and predictable.

Maybe generative AI will collapse under the piles of crap it uses for training because we do not have powerful enough means of telling it how to separate trustworthy, truthful information from the rest.

Maybe we will use it as an assistant in many areas, like coding or evaluating X-ray images to sort out the unremarkable ones.

What I really doubt at this point is that AI will replace professionals, regardless of the field. It may make them more productive or enable us to build bigger/better/smarter systems.

Right now, generative AI sometimes proves useful but often is absolutely clueless.

5 thoughts on “AI is super-good at guessing, but also totally clueless”

  1. I second that the lack of reproducibility and transparency feels awful. Nonetheless, AI is helping me a lot, especially in languages I am not well accustomed to. I wonder what model you chose and what context you give it. For me, using Claude 3.7 Sonnet with max thinking, and a Cline Memory Bank with the project context makes all the difference, see https://www.linkedin.com/pulse/vise-coding-david-farago-1k5ce.

    I am curious about Heinz Kabutz’s online course: I see that each segment touches on AI, but how much? Can I transfer the insights to other languages like TypeScript, Python, and Rust?

    1. I can totally relate to the helpfulness of AI in areas where you are lacking, be it art, videos, music, writing or coding. As for model and context: so far I have only used the free version of ChatGPT and could not provide much context due to legal issues. Therefore my assessment and feelings may well be skewed. And I didn’t do any vibe coding from scratch.

      As for the online course: AI was really only an add-on, not the main focus. If you are looking for a proper deep dive into AI-assisted coding, you have to look elsewhere. As for other languages, I have no clue about the quality of JetBrains’ assistant. Generally they are multi-language and multi-platform, but quality may vary, at least in the early stages of platform support.

      Maybe we will try the self-hosted AI assistant sometime in the future to test it more thoroughly.

  2. This is a cool way to think about AI! It’s true that AI can guess a lot, but it also doesn’t really understand things like people do. I use AI sometimes too, and it’s interesting to see what it can and can’t do.
