The Fiction of Generalizable AI: A Tale in Two Parts
Why intelligence isn't a linear scale — and why true generalization remains unsolved

Here I am tying together several previous discussions on the "I" in AI and the deeper differences between minds and machines. My discussion is informed by François Chollet's excellent 2019 paper "On the Measure of Intelligence" as well as a provocative book I recently read for a Liberty Fund event, Radical Uncertainty: Decision-Making Beyond the Numbers (Norton 2020). My goal is to explain where AI currently stands, what we are really trying to do when we talk about artificial general intelligence or "AGI," and why we are not even remotely getting the story straight.

Let’s begin.
Introduction: The Missing “I” in AI
The field of artificial intelligence or AI has somehow managed to evolve into a technological behemoth without ever clearly explaining what it means by artificial intelligence. More specifically: what is the “I” in AI?
From its inception in the 1950s, the promise of AI was loudly touted, but the yardstick of progress was curiously narrow. It was always the engineering of particular skills: playing chess, passing Turing’s playful imitation game over teletype (chat), or later, recommending movies or products to buy.
Even by 2007, researchers noted in a major survey that “to the best of our knowledge, no general survey of tests and definitions [of intelligence] has been published.” The field seemed content to showcase prowess in specialized tasks: board games, supply chain optimization, translation engines, facial recognition—systems that, as was well-publicized, might sometimes pick out wolves from dogs based not on features of the animal, but simply on whether there was snow in the background.
Through all this, commercial opportunities on the web exploded. But scientific work on what we actually mean by "intelligence" — and how we would know if it was meaningfully increasing — stalled out or, for the most part, simply didn't exist.
What the 2007 survey by Shane Legg and Marcus Hutter did accomplish, however, was to summarize roughly 70 disparate definitions into one helpful statement:
Intelligence measures an agent’s ability to achieve goals in a wide range of environments.
Let’s unpack that.
The first condition — the ability to achieve goals — invokes task-specific skills: beating Kasparov at chess, optimizing a supply chain, or increasing sales of energy drinks at Walmart.
The second condition — in a wide range of environments — points toward what Chollet calls “generality and adaptation.” Notice how a “wide range of environments” would seem to preclude simply building in priors or manipulating data to achieve some particular objective.
In other words, the second condition sits somewhat at odds with the first, implying that learning is intrinsic to knowing, to intelligence itself. It also suggests that the task to be performed might not be known in advance, requiring the AI not merely to apply existing skills, but to adapt to genuinely new tasks.
This tension is worth dwelling on. Maximizing goal achievement in known domains can often be accomplished by overfitting: building in strong priors, manipulating data, or tailoring solutions to very narrow circumstances. But true generalization resists this. It demands flexibility without foreknowledge of specific tasks.
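To make that concrete, here is a toy sketch of the overfitting failure mode (my own illustration, not anything from Chollet or Legg and Hutter): a flexible model that nails the handful of points it was tuned on, yet does worse than a far simpler model on points it has never seen.

```python
# Toy illustration of overfitting: a degree-9 polynomial "achieves the goal"
# (near-zero error) on the ten points it was fit to, while a simple line
# typically generalizes better to points from the same trend it never saw.
import numpy as np

rng = np.random.default_rng(0)

# The "known environment": ten noisy samples of an underlying linear trend.
x_train = np.linspace(0.0, 1.0, 10)
y_train = 2.0 * x_train + rng.normal(scale=0.1, size=10)

# The "unseen" part of the environment: fresh points from the same trend.
x_test = np.linspace(0.05, 0.95, 50)
y_test = 2.0 * x_test

flexible = np.polynomial.Polynomial.fit(x_train, y_train, deg=9)
simple = np.polynomial.Polynomial.fit(x_train, y_train, deg=1)

def mse(model, x, y):
    return float(np.mean((model(x) - y) ** 2))

print("degree-9: train", mse(flexible, x_train, y_train), "test", mse(flexible, x_test, y_test))
print("degree-1: train", mse(simple, x_train, y_train), "test", mse(simple, x_test, y_test))
```

The polynomials are beside the point; the pattern is what matters. Narrow goal achievement can be purchased at the price of generalization, which is why the second half of the definition has to pull against the first.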

Chollet highlights the fact that the Legg and Hutter definition nicely mirrors the traditional split in human psychometrics between crystallized intelligence (acquired skills, like math or vocabulary) and fluid intelligence (the flexible capacity to solve novel problems). Even in the absence of a settled scientific theory of “intelligence,” this framing — task performance plus generalization — puts us on reasonably firm conceptual ground.
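For readers who want the formal version: in related work, Legg and Hutter compress this idea into a single "universal intelligence" score, a weighted sum of an agent's expected performance across all computable environments, with simpler environments counting more. The notation below follows their papers; nothing else in this essay depends on it.

```latex
% Legg and Hutter's universal intelligence of an agent \pi:
% E is the set of computable environments, K(\mu) the Kolmogorov complexity
% (description length) of environment \mu, and V_{\mu}^{\pi} the expected
% cumulative reward the agent earns in \mu.
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V_{\mu}^{\pi}
```

Note how the formula encodes both halves of the verbal definition: the reward term captures achieving goals, and the sum over every environment in E enforces the "wide range."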
Inference + Learning = Intelligence
As many readers know, in my first book, The Myth of Artificial Intelligence (Harvard 2021), I focused almost exclusively on the problem of inference: given what we already know and what we can presently observe, how should we update our beliefs? In other words: what should we now judge to be the case?
But following our (admittedly cursory) summary definition of the “I” in AI, we ought to expand the concept. Intelligence isn’t just inference; it is inference plus learning. That brings us much closer to the real goalposts.
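For readers who like a formula, one common textbook way to make "updating our beliefs" precise is Bayes' rule (a gloss of mine, not necessarily the account of inference defended in the book):

```latex
% Bayesian belief updating: the probability of hypothesis h after seeing evidence e.
P(h \mid e) = \frac{P(e \mid h)\, P(h)}{P(e)}
```

On that gloss, inference is the updating step, while learning is whatever builds the model that supplies P(e | h) and P(h) in the first place; and that is precisely the part that task-specific systems tend to hand-engineer rather than acquire.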
Interestingly, the field of AI itself, in both its early symbolic phase and its current machine learning and neural network phase, divides along much the same lines as the crystallized/fluid split in psychometrics.
A Tale of Two Traditions: Symbolic AI and Machine Learning
In the 1960s and 70s (and on into the 1990s), AI scientists like Marvin Minsky (1927‒2016) envisioned AI as the accumulation and structured application of knowledge. This led to knowledge-based systems and the formal field of knowledge representation and reasoning (KR&R).
We see this vision most clearly articulated in Minsky's influential The Society of Mind (Simon & Schuster 1986), where he portrayed the mind as a set of specialized modules — adaptive structures evolved over millennia to solve particular survival problems, often summarized under the "4 Fs" of evolutionary biology (fighting, fleeing, feeding, and mating). For Minsky, these special-purpose adaptations could be engineered as modules that, when networked together, would perform a wide range of tasks. Symbolic knowledge would enable the generalization and extension necessary for intelligent behavior.

His 1968 definition set the tone:
AI is the science of making machines capable of performing tasks that would require intelligence if done by humans.
Tasks like playing chess. Or recognizing faces. Or driving.
The emphasis was always on task performance, supported by a knowledge base structured in a way the system could query and reason over. Yet as José Hernández-Orallo pointed out in his 2017 survey, the old "GOFAI" way of doing AI by performing at human or superhuman levels on selected tasks led to a paradox: the field of artificial intelligence has been very successful in developing artificial systems that perform these tasks without featuring intelligence. As Chollet is quick to conclude, it's a "trend that continues to this day."
The modern view, which began to dominate with the rise of the World Wide Web and is now so entrenched it is scarcely questioned, is that intelligence lies not in performing specific tasks, but in a general ability to acquire new knowledge or skills. It is not simply the execution of the task that matters; it is the ability to learn so that the AI can perform new, as yet unseen, tasks.
This updated understanding of the “I” in AI brings us closer to a more realistic and powerful notion of intelligence. Unfortunately, it has also bogged the field down in endless promissory notes about building generalizing systems that, to date, have not materialized.
Still, better definitions matter. A more refined framing might be:
AI is the science and engineering of making machines do tasks they have never seen and have not been prepared for beforehand.
This definition, in contrast to the evolutionary psychology vision of special-purpose modules, has a rich philosophical lineage. It runs through John Locke’s idea of the Tabula Rasa—the blank slate—and traces back to Aristotle.
The notion that intelligence is the capacity to imbue the previously unseen, ungrasped, or undone with new insight and action makes intuitive sense. Enlightenment thinkers like Hobbes, Locke, and Rousseau similarly viewed the mind — and hence intelligence — as the power to turn experience into behavior.
More recently, AI embraced this "blank slate" conception via the cognitive sciences, especially through connectionism, the early term for neural network architectures. Today, that view has been inflated into the dominant paradigm behind generative AI and foundation models, all based on deep learning, a machine learning approach that (mistakenly) assumes a network can start blank and be transformed into an intelligent agent simply through exposure to enough data.
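To make the "blank slate" recipe concrete, here is a minimal sketch, entirely my own illustration rather than anything drawn from the works cited: a tiny network whose weights begin as random noise and are shaped by nothing but gradient descent on the data it is shown (XOR, a function the untrained network knows nothing about).

```python
# A minimal "blank slate" training loop: random weights plus data plus gradient
# descent, with the gradients written out by hand to emphasize that nothing
# beyond arithmetic and exposure to examples is involved.
import numpy as np

rng = np.random.default_rng(42)

# Tiny dataset: the XOR function, which the untrained network does not encode.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# The "blank slate": weights are random numbers before any data is seen.
W1 = rng.normal(size=(2, 8))
b1 = np.zeros((1, 8))
W2 = rng.normal(size=(8, 1))
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for step in range(10_000):
    # Forward pass: hidden tanh layer, sigmoid output.
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Backward pass for a squared-error loss.
    grad_z2 = (p - y) * p * (1 - p)
    grad_W2 = h.T @ grad_z2
    grad_b2 = grad_z2.sum(axis=0, keepdims=True)
    grad_z1 = (grad_z2 @ W2.T) * (1 - h ** 2)
    grad_W1 = X.T @ grad_z1
    grad_b1 = grad_z1.sum(axis=0, keepdims=True)

    # Gradient descent: the only force shaping the initially random weights.
    W1 -= lr * grad_W1
    b1 -= lr * grad_b1
    W2 -= lr * grad_W2
    b2 -= lr * grad_b2

print(np.round(p, 2))  # after training, typically close to [[0], [1], [1], [0]]
```

Scaled up by many orders of magnitude, this is the recipe behind today's generative systems; whether piling on more data and more parameters ever adds up to the kind of generalization described above is the question at issue.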
Next: Part 2: The Fiction of Generalizable AI — How to Game the System