Erik J. Larson
June 26, 2025
Artificial Intelligence, Education

Is the Reverse Flynn Effect — Declining Intelligence — Real?

IQ tests were never meant to measure memorization or familiarity, yet that’s precisely what’s happening


Note: The first part of this essay is “The reverse Flynn effect and the decline of intelligence.”

In the 1980s, New Zealand researcher James Flynn (1934‒2020), working at the University of Otago, made a surprising discovery: since the 1930s, the average IQ in Western countries had been steadily rising — by almost three points per decade. This phenomenon, now widely accepted and known as the Flynn Effect, seemed to suggest that each new generation was significantly more intelligent than the last. But something wasn’t adding up.

If the Flynn Effect were a real, lived experience, we should notice it. A Gen Z teenager, for instance, should be demonstrably sharper than a Baby Boomer at the same age. And yet, we are not generally struck by the penetrating wisdom of the younger crowd (though sometimes we are!). Nor do we believe that our grandparents were dullards.

In other words, we don’t experience the Flynn Effect, even though on paper, it’s significant — an 18-point increase over two generations.
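The arithmetic behind that figure can be sketched in a few lines of Python. This is a rough illustration, not part of Flynn's analysis: it assumes a generation of about 30 years, rounds "almost three points" up to three, and uses the conventional IQ scale (mean 100, standard deviation 15), none of which the essay states.

```python
from statistics import NormalDist

POINTS_PER_DECADE = 3.0      # "almost three points per decade" (rounded up)
YEARS_PER_GENERATION = 30    # assumption: not stated in the essay
GENERATIONS = 2


def flynn_gain(points_per_decade, years):
    """Total IQ-point gain implied by a constant per-decade rise."""
    return points_per_decade * years / 10


gain = flynn_gain(POINTS_PER_DECADE, YEARS_PER_GENERATION * GENERATIONS)

# Assumption: scores are normed to mean 100, standard deviation 15.
old_norms = NormalDist(mu=100, sigma=15)

# Where an average scorer today would fall on the norms of two generations ago.
percentile = old_norms.cdf(100 + gain)

print(f"Implied gain: {gain:.0f} points")
print(f"Today's average scorer on the older norms: {percentile:.1%}")
```

Under these assumptions, an average scorer today would land at roughly the 88th percentile of the norms from two generations earlier, which is the kind of gap we would expect to notice in everyday life.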

Something is off

In fact, just as the Flynn Effect gained scientific traction and began percolating into pop culture, evidence started emerging that IQ scores were actually in decline.


The first cracks in the story around the Flynn Effect came from declining STEM (science, technology, engineering, and math) scores, despite aggressive government and educational initiatives aimed at bolstering them. In 2000, the Engineering Council of London published a major study warning of a “serious decline” in students’ mathematical and scientific competence. Perhaps, we might reason, this trend could be explained by poor teaching, distractions, or changing educational priorities rather than a true cognitive decline.

But then came evidence from a very different kind of test.

The “Volume and Heaviness” test — one of a number of Piagetian stage tests designed to assess fluid intelligence (Gf) — measures a child’s ability to grasp a fundamental concept: When a solid object is submerged in water, it displaces an amount of water based on its volume, not its weight.

In 1976, 54% of boys and 27% of girls answered this correctly. By 2003/2004, the gender gap had disappeared — because both scores had plummeted to 17%.
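The size of those shifts is easier to see in percentage points. The short Python sketch below only restates the figures quoted above; the variable names are illustrative.

```python
# Share of children answering the "Volume and Heaviness" item correctly,
# as quoted in the essay (percent).
correct_1976 = {"boys": 54, "girls": 27}
correct_2003 = {"boys": 17, "girls": 17}

# Drop for each group, in percentage points.
drops = {group: correct_1976[group] - correct_2003[group] for group in correct_1976}

# Gender gap (boys minus girls) in each period.
gap_1976 = correct_1976["boys"] - correct_1976["girls"]
gap_2003 = correct_2003["boys"] - correct_2003["girls"]

for group, drop in drops.items():
    print(f"{group}: down {drop} percentage points")
print(f"gender gap: {gap_1976} points in 1976, {gap_2003} in 2003/2004")
```

Boys fell by 37 points and girls by 10, so the gap closed mainly because the boys' scores collapsed.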

What are we to make of this?

One immediate clue is the disappearance of the gender disparity. The dramatic drop in boys’ scores suggests an environmental factor — likely changes in the school system or broader cultural shifts affecting how children engage with learning. Other Piagetian tests suggest the same pattern: basic reasoning skills seem to be deteriorating.

This brings us to a new and troubling phenomenon: the Reverse Flynn Effect.

While declining STEM scores and Piagetian test results suggest that we’re somehow failing our kids, particularly boys, there’s an even deeper issue: evidence suggests that IQ itself is dropping. This is perplexing (as is the Flynn Effect itself), because the whole point of fluid intelligence (Gf) is that it’s supposed to be impervious to cultural and environmental factors. Gf is meant to capture raw cognitive power, not schooling quality, not curriculum shifts, and not whether an educational system has been subtly feminized in ways that frustrate boys’ development.

Again: what gives?

Even Flynn himself, whose work originally identified the steady rise of IQ scores, acknowledged that the trend has reversed in multiple developed nations. IQ data from Finland, Norway, Denmark, Estonia, Britain, France, the Netherlands, and Australia, including results from compulsory military testing, all show a decline beginning in the mid-1990s. As Flynn noted:

“Looming over all is their message that the pool of those who reach the top level of cognitive performance is being decimated: fewer and fewer people attain the formal level at which they can think in terms of abstractions and develop their capacity for deductive logic and systematic planning.”

If Gf is truly independent of education, culture, and teaching methods, how can it be falling?

Teaching to the test vs. true intelligence

One explanation is what we might expect: as IQ tests have grown more culturally central, families (especially wealthy ones) have gamed the system. Competitive parents now prep their children for high IQ scores in the same way they prep them for SATs or college admissions essays. The Ivy League grind has made “teaching to the test” a cottage industry — even when the test in question is supposed to measure innate problem-solving ability.

This has led to an ironic distortion:

IQ test preparation does work — sort of. Research suggests that children retaking IQ tests can expect to gain about three points from exposure alone.
The effect is particularly visible on Raven’s Progressive Matrices, a shape-based IQ subtest designed to be as culturally neutral as possible. Raven’s Matrices were once considered the gold standard for measuring pure reasoning ability — but now, after decades of practice and coaching, they may have lost their diagnostic power.
The top performers have experienced the steepest drop. This makes sense: when “average” students get better at test-taking, it narrows the performance gap at the high end. Flynn speculated that true cognitive elites are disappearing, as raw problem-solving ability is diluted by familiarity and training effects.

Measuring intelligence vs. memorizing patterns

IQ tests were never meant to measure memorization or familiarity, yet that’s precisely what’s happening. Here’s the key point: Following a well-practiced procedure isn’t the same as solving a novel problem from understanding.

If IQ tests no longer measure what they were designed to measure, then declining scores may not reflect a drop in intelligence — but a shift in what the tests themselves reveal. This distinction matters immensely, especially when evaluating AI systems that excel at pattern recognition yet struggle with true abstraction and generalization.

And there are further problems still.

The Reverse Flynn Effect: Are we actually getting dumber?

In spite of our best modern efforts, intelligence really does seem to be declining. The Reverse Flynn Effect appears to be real.

Two pieces of evidence stand out:

Two physiological markers, reaction time and color acuity, both correlate with IQ yet remain impervious to training effects. Unlike IQ tests, these measures can’t be taught, practiced, or gamed. The fact that both are declining makes the case for the Reverse Flynn Effect particularly compelling.
The dominant mode of thinking in the modern world has increasingly favored abstraction and reduction, hallmarks of the left hemisphere’s approach to cognition. Since the Scientific and Industrial Revolutions, these cognitive styles have come to define how we assess intelligence itself.

When IQ tests were first administered in the early 20th century, test-takers used a broader range of cognitive skills. But as IQ testing evolved to emphasize abstract categorization and formal reasoning, children adapted to that narrower cognitive framework. What we now call “intelligence” has gradually become detached from the embodied, contextualized reasoning that characterized earlier generations.

As Iain McGilchrist notes, “From the beginning of IQ testing, purely formal categorization has been permitted to score more highly than concrete categorization.”

Flynn himself put it more bluntly:

A person who views the world through pre-scientific spectacles thinks in terms of the categories that order perceived objects and functional relationships. When presented with a Similarities-type item such as ‘What do dogs and rabbits have in common?’ Americans in 1900 would be likely to say, ‘You use dogs to hunt rabbits.’ The correct answer — that they are both mammals — assumes that the important thing about the world is to classify it in terms of the taxonomic categories of science. Even if the subject were aware of those categories, the correct answer would seem absurdly trivial. Who cares that they are both mammals? That is the least important thing about them from this point of view.

McGilchrist drives the point home:

Given this drift toward rewarding decontextualized thinking, it makes sense that today’s children are better at it: they do not have a rich web of naturally embodied experience to unlearn. What looked like a rise, for a while, in IQ [the Flynn Effect] may have been the increasing adoption of Flynn’s ‘scientific spectacles,’ and a tendency to award higher marks for such a view.

So, are we dumber?

More seriously, the key to our intelligence — as opposed to a machine’s — lies in the right hemisphere’s non-computational capacities. Our fluid intelligence (Gf) is not just about classification and logical abstraction, but about integration, connectivity, and holistic understanding — abilities that do not map neatly onto left-hemisphere-driven computation.

The right hemisphere’s advantage lies in its ability to synthesize disparate forms, recognize metaphor and other cross-context relationships, and grasp meaning absent explicit rules. These are the very capacities that artificial intelligence lacks.

But as we construct a world that increasingly rewards computational thinking, we risk further diminishing the very aspects of intelligence that make us uniquely human. The Reverse Flynn Effect may have started as an academic curiosity, but I think it’s more of a modern warning. If we continue down this path, we may create a world that thinks in categories and abstractions while steadily losing the ability to truly understand.

Stopping this slide must be one of the defining challenges of our time.

Here’s the first part of this essay by Erik J. Larson: “The reverse Flynn effect and the decline of intelligence.” How our modern world is making us dumber and why it doesn’t have to. A growing body of neuroscience evidence directly challenges the prevailing theory that we are merely flawed computers, to be replaced by machines.