
The fundamental mistake programmers made in designing AIs


At Futurism, Victor Tangermann reports that Sam Altman’s company, OpenAI, thinks it has figured out the origin of the hallucinations that plague large language models (LLMs) like ChatGPT.


The bad news is that it’s unclear what can be done about it.

In a paper published last week, a team of OpenAI researchers attempted to come up with an explanation. They suggest that large language models hallucinate because when they’re being created, they’re incentivized to guess rather than admit they simply don’t know the answer.

Hallucinations “persist due to the way most evaluations are graded — language models are optimized to be good test-takers, and guessing when uncertain improves test performance,” the paper reads.

Conventionally, the output of an AI is graded in a binary way, rewarding it when it gives a correct response and penalizing it when it gives an incorrect one.

In other words, guessing is rewarded, because it might be right, while an AI admitting it doesn’t know the answer will be graded as incorrect no matter what.

Victor Tangermann, “OpenAI Realizes It Made a Terrible Mistake,” Futurism, September 14, 2025
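To make the incentive concrete: under 0/1 grading, a guess with any nonzero chance p of being correct has an expected score of p, while “I don’t know” always scores zero, so guessing dominates. A minimal sketch of that arithmetic (the grading scheme follows the description above; the probabilities are illustrative, not from the paper):

```python
# Expected score under conventional binary grading:
# +1 for a correct answer, 0 for a wrong answer or an abstention.
def expected_score_binary(p_correct: float, abstain: bool) -> float:
    if abstain:
        return 0.0          # "I don't know" earns nothing
    return p_correct * 1.0  # a guess earns p on average, losing nothing when wrong

for p in (0.01, 0.25, 0.50):
    print(f"p={p:.2f}  guess={expected_score_binary(p, False):.2f}  "
          f"abstain={expected_score_binary(p, True):.2f}")
# Even a 1%-confident guess (expected 0.01) beats abstaining (0.00),
# so a model optimized to be a "good test-taker" learns to guess.
```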

From the open access paper: “This ‘epidemic’ of penalizing uncertain responses can only be addressed through a socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards, rather than introducing additional hallucination evaluations. This change may steer the field toward more trustworthy AI systems.”
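One way to read that proposal in concrete terms: announce a confidence target t, award +1 for a correct answer and 0 for abstaining, and penalize a wrong answer t/(1−t) points, so that guessing only pays when the model’s chance of being right exceeds t. The threshold form below is a sketch in the spirit of the paper’s discussion of confidence targets, not a verbatim specification:

```python
# Hedged sketch of confidence-targeted grading: +1 for correct,
# -t/(1-t) for wrong, 0 for "I don't know" (the target t is announced).
def expected_score_thresholded(p_correct: float, t: float, abstain: bool) -> float:
    if abstain:
        return 0.0
    penalty = t / (1.0 - t)
    return p_correct * 1.0 - (1.0 - p_correct) * penalty

t = 0.75  # e.g., "answer only if you are at least 75% confident"
for p in (0.50, 0.75, 0.90):
    guess = expected_score_thresholded(p, t, abstain=False)
    print(f"p={p:.2f}  guess={guess:+.2f}  abstain=+0.00")
# Below p = t the expected score of a guess is negative (break-even at p = t),
# so an optimized test-taker now prefers "I don't know" over bluffing.
```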

Is the error fundamental to that type of system? The company argues that the problem is fixable, but just how easy will that be? As Tangermann says, “For now, the AI industry will have to continue reckoning with the problem as it justifies tens of billions of dollars in capital expenditures and soaring emissions.”

And meanwhile, “GPT-5 Is Making Huge Factual Errors, Users Say.”

Indeed. You may also wish to look at economist Gary Smith’s dialogues with error-prone GPT-5: here, here, and here.

