Programmer: Chatbots Are a Dead End. Time for a New Contest!
François Chollet is offering $1.1m in prize money for the next step on the road to computers that think like people.

AI researcher François Chollet thinks chatbots (Large Language Models, or LLMs) are now a dead end as far as the development of computers that think like people is concerned:

“Almost all current AI benchmarks can be solved purely via memorization,” François Chollet, a software engineer and AI researcher, told Freethink. “You can simply look at what kind of questions are in the benchmark, then make sure that these questions, or very similar ones, are featured in the training data of your model.”
“Memorization is useful, but intelligence is something else,” he added. “In the words of Jean Piaget, intelligence is what you use when you don’t know what to do. It’s how you learn in the face of new circumstances, how you adapt and improvise, how you pick up new skills.”
Kristin Houser, “LLMs are a dead end to AGI, says François Chollet,” Freethink, August 3, 2024
Given that limitation, known chatbot problems like model collapse and hallucinations may not be fixable.
Chollet, who has published an open-access paper on AI and reasoning, thinks that Sam Altman’s OpenAI chatbots have set back the field: “I see LLMs as more of an off-ramp on the path to AGI actually. All these new resources are actually going to LLMs instead of everything else they could be going to” (June 11, 2024).
Iconic tests like the Turing test are now considered outdated. The test developed by Alan Turing (1912–1954) only measures whether people think a computer is a fellow human. With chatbots scarfing their way through the internet and often serving up plausible results from what humans put up there, the test is no longer meaningful.
Chollet and software developer Mike Knoop are offering the ARC Prize ($1,100,000 in total prize money) to get AI to the next level:
ARC-AGI
Most AI benchmarks measure skill. But skill is not intelligence. General intelligence is the ability to efficiently acquire new skills. Chollet’s unbeaten 2019 Abstraction and Reasoning Corpus for Artificial General Intelligence (ARC-AGI) is the only formal benchmark of AGI.
It’s easy for humans, but hard for AI.
In this competition, you’ll develop AI systems to efficiently learn new skills and solve open-ended problems, rather than depend exclusively on AI systems trained with extensive datasets. The top submissions will show improvement toward human reasoning benchmarks.
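To make the benchmark concrete: ARC-AGI tasks are published as JSON objects containing "train" demonstration pairs and "test" inputs, each a small grid of integers (0–9, rendered as colors). A solver must infer the transformation rule from the few demonstrations and apply it to the test input. The toy task and rule below are hypothetical illustrations of that format, not actual ARC tasks:

```python
# A hypothetical task in the ARC-AGI JSON format: "train" holds
# demonstration input/output grid pairs; "test" holds inputs to solve.
task = {
    "train": [
        {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
        {"input": [[2, 0], [0, 2]], "output": [[0, 2], [2, 0]]},
    ],
    "test": [{"input": [[3, 3], [0, 3]]}],
}

def solve(grid):
    # Candidate rule inferred from the demonstrations: reverse each row.
    return [list(reversed(row)) for row in grid]

# Verify the candidate rule against every demonstration pair before
# trusting it on the test input -- the core loop of any ARC solver.
assert all(solve(p["input"]) == p["output"] for p in task["train"])
print(solve(task["test"][0]["input"]))  # proposed answer: [[3, 3], [3, 0]]
```

The point of the format is that two or three demonstrations are all a solver gets, so memorized patterns from large training corpora are of little help.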
The psychological test it resembles, Raven's Progressive Matrices, seems quite simple but apparently eludes AI so far.
The deadline for submissions of code is November 10, 2024. Winners are to be announced December 3, 2024.
Here’s where Chollet thinks the future lies, generally:
All parties are proceeding on the basis that artificial general intelligence — computers that think like people — is indeed possible. They just don’t know how. As Houser writes at Freethink:
As for what kind of AI is most likely to lead to AGI, it’s too soon to say, but Chollet has shared details on the approaches that have performed best at ARC so far, including active inference, DSL program synthesis, and discrete program search. He also believes deep learning models could be worth exploring and encourages entrants to try novel approaches.
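One of the approaches Chollet mentions, discrete program search, can be sketched in a few lines: enumerate compositions of primitives from a small domain-specific language (DSL) until one reproduces every demonstration pair. The four-primitive DSL below is a toy illustration of the idea, not the machinery of any actual prize entry:

```python
from itertools import product

# A toy DSL of grid-transformation primitives (hypothetical, for illustration).
PRIMITIVES = {
    "identity": lambda g: g,
    "flip_h": lambda g: [list(reversed(row)) for row in g],
    "flip_v": lambda g: list(reversed(g)),
    "transpose": lambda g: [list(row) for row in zip(*g)],
}

def run_program(program, grid):
    # A "program" is just a tuple of primitive names applied in order.
    for name in program:
        grid = PRIMITIVES[name](grid)
    return grid

def search(pairs, max_depth=2):
    """Discrete program search: enumerate ever-longer compositions of
    primitives and return the first one consistent with all demonstrations."""
    names = list(PRIMITIVES)
    for depth in range(1, max_depth + 1):
        for program in product(names, repeat=depth):
            if all(run_program(program, i) == o for i, o in pairs):
                return program
    return None

# One demonstration pair whose underlying rule is a vertical flip.
pairs = [([[1, 2], [3, 4]], [[3, 4], [1, 2]])]
print(search(pairs))  # -> ('flip_v',)
```

The enumeration is exhaustive and therefore only feasible for tiny DSLs and shallow programs; the serious entries Chollet describes combine such search with heuristics, learned guidance, or program synthesis to prune the space.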
Houser, “Dead end to AGI”
But what if the goal is not possible in principle?
Computer engineering prof Robert J. Marks points out that computers, by definition, only compute, but humans think things that are not computable. Similarly, programmer Eric Holloway notes, computation is only one type of thinking: “The Church-Turing thesis states everything physical can be reduced to computation. Thus there is no way to build a computer that cannot be reduced to the logic of 1’s and 0’s. Therefore, no matter how sophisticated an AI becomes, ChatGPT or otherwise,” it won’t handle thinking that cannot be reduced to 1’s and 0’s.
That is probably a hard ceiling, like time travel or faster-than-light travel. But trying to break through that ceiling may produce illuminating discoveries.
You may also wish to read: Which is smarter? Babies or AI? Not a trick question. Humans learn to generalize from the known to the unknown without prior programming and do not get stuck very often in endless feedback loops.