Mind Matters Natural and Artificial Intelligence News and Analysis

Beware of Geeks Bearing Formulas—It’s Often Pseudoscience

Pseudoscience based on data without theory and theory without data undermines the credibility of real science, which is the key to human progress

Elsewhere I have warned of the perils of making decisions based on data without theory. For example, the patterns discovered by data-mining computer algorithms are often nothing more than meaningless coincidences. It is also perilous to go to the opposite extreme—to make decisions based on theory without data.
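The point about coincidental patterns can be illustrated with a small simulation. The sketch below is not taken from any particular data-mining study; the series lengths, the number of candidate series, and the random seed are all arbitrary assumptions chosen for illustration. It generates a random "target" series, then searches a thousand equally random series for the one most correlated with it.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

random.seed(0)          # arbitrary seed, only so the run is repeatable
N_OBS = 20              # assumed: 20 observations per series
N_CANDIDATES = 1000     # assumed: 1,000 unrelated candidate predictors

# Every series is pure noise, so any correlation found is a coincidence.
target = [random.gauss(0, 1) for _ in range(N_OBS)]
candidates = [[random.gauss(0, 1) for _ in range(N_OBS)]
              for _ in range(N_CANDIDATES)]

# "Mine" the candidates for the strongest absolute correlation.
best_abs_r = max(abs(pearson(target, c)) for c in candidates)
print(f"Strongest correlation found in pure noise: {best_abs_r:.2f}")
```

Because every series is noise by construction, whatever correlation the search turns up cannot mean anything. Searching more candidates tends to turn up a stronger "best" match, which is why a pattern found by search alone, with no theory behind it, deserves suspicion.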

Once upon a time, for example, economists were fond of sketching labor demand and supply curves and assuming that the economy was at their intersection. That is, labor demand is equal to supply, so that everyone who wants to work is working. The unemployed have chosen to be unemployed because they value leisure more than income. True believers were fond of this theory and little troubled by reality.

Between 1929 and 1933, national output in the United States fell by a third, and the unemployment rate rose from 3 percent to 25 percent. Behind these aggregate numbers were millions of private tragedies. One hundred thousand businesses failed and twelve million people lost their jobs, income, and self-respect. Many lost their life savings in the stock market crash and the tidal wave of bank failures. Without income or savings, people could not buy food, clothing, or medical care. Those who could not pay rent lost their shelter; those who couldn’t make mortgage payments lost their homes. Desperate people moved into shanty settlements (called Hoovervilles); slept under newspapers (Hoover blankets); and scavenged for food in dumps and dumpsters. This was not because they preferred Hoovervilles, Hoover blankets, and dumpster food to working.

British economist John Maynard Keynes (1883–1946) surveyed the economic carnage and concluded that the conventional theory was disastrously wrong:

It may well be that the classical theory represents the way in which we should like our economy to behave. But to assume that it actually does so is to assume our difficulties away. – General Theory

His treatise, The General Theory of Employment, Interest and Money, was published in 1936 and revolutionized economics. Indeed, it created a whole branch of economics called macroeconomics.

Decades later, the so-called New Classical Economists pushed back, once again arguing, against all evidence, that those who are unemployed choose to be unemployed. Their new wrinkle was that the unemployment rate ebbs and flows because workers are really bad at estimating the rate of inflation. The unemployment rate goes down when workers underestimate inflation. They consequently overestimate what their wages will buy and take jobs they later regret having taken. The unemployment rate goes up when workers overestimate inflation, underestimate what their wages will buy, and quit jobs they later regret leaving. No, I am not making this up.

The Great Depression didn’t happen because of 10 years of bad guesses about the rate of inflation. Yet, the New Classical models were embraced by many economists who valued theory over data.

The figure below shows the unemployment rate and the annual rate of inflation over the past 20 years. Did the 2001 recession, the Great Recession of 2007–2009, and the current COVID-19 recession happen because workers thought the rate of inflation was 3 percent when it was really only 2 percent? That ludicrous conclusion is what happens when one focuses on theory and neglects data.

New Classical enthusiasts advised central banks to ignore recessions and instead adopt stable monetary policies that keep inflation constant so that workers won’t make mistakes. We can be thankful that the Federal Reserve (the “Fed”) was unpersuaded. As horrible as the three most recent recessions have been, they would have been much, much worse if the Fed had been satisfied with low inflation and paid no attention to soaring unemployment.

Another example of valuing theory over data is the Club of Rome’s 1971 model that predicted that the world standard of living would peak in 1990 and then decline inexorably. Their doomsday manifesto was translated into 30 languages and sold more than 12 million copies. Their solution to the impending economic collapse was for world governments to reduce the world supply of food by 20 percent so that people would be forced to have fewer children, or starve. You might think they were joking. They weren’t. As William Nordhaus wrote, their prescription “would save the planet at the expense of its inhabitants.”

The Club of Rome model is all theory and no data. They simply assumed that the world’s supply of resources is fixed while the demand for resources grows at a compound rate—which guarantees that demand will exhaust resources (surprisingly soon with compound growth). They completely ignored the fact that humans have been endlessly creative in thinking of ways to substitute more plentiful resources for less plentiful ones. We have figured out how to use nuclear fuels and solar energy in place of fossil fuels, e-mail in place of snail-mail, and plastic in place of wood, metal, and glass. No one knows what substitutes we will invent in the future but we can be confident that there will be substitutes. The doomsday year of 1990 is now long past and the world economy continues to grow.
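The arithmetic behind "surprisingly soon" is easy to check. The sketch below is not the Club of Rome's model; it is a minimal calculation under the same two assumptions the paragraph describes (a fixed stock and compound growth in use), and the numbers in it are hypothetical. If annual use starts at c and grows at rate g, cumulative use after T years is c((1 + g)^T − 1)/g, so a stock S is exhausted when T = ln(1 + gS/c) / ln(1 + g).

```python
import math

def years_until_exhausted(stock, annual_use, growth_rate):
    """Years until a fixed stock runs out when annual use compounds.

    stock        -- total amount of the resource (assumed fixed)
    annual_use   -- amount used in the first year
    growth_rate  -- annual growth in use, e.g. 0.03 for 3 percent
    """
    if growth_rate == 0:
        # With no growth, the stock simply lasts stock / annual_use years.
        return stock / annual_use
    # Solve annual_use * ((1 + g)**T - 1) / g == stock for T.
    return (math.log(1 + growth_rate * stock / annual_use)
            / math.log(1 + growth_rate))

# Hypothetical example: a stock equal to 1,000 years of current use.
for g in (0.0, 0.01, 0.03, 0.05):
    years = years_until_exhausted(stock=1000, annual_use=1, growth_rate=g)
    print(f"growth {g:.0%}: exhausted after about {years:.0f} years")
```

With these made-up numbers, a supply that would last 1,000 years at constant use lasts only a little over a century at 3 percent growth. The result follows entirely from holding the stock fixed, which is the assumption the substitution argument rejects.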

A third example, which I have written about elsewhere, is the mortgage meltdown that began in 2007 and was fueled by banks making bad decisions. The bad decisions were based on complex models that were not tested with data but, instead, based on convenient assumptions that few understood. As Warren Buffett once warned, “Beware of geeks bearing formulas.”

Now we have the COVID-19 pandemic and far too many examples of both data without theory and theory without data. We have data without theory when vitamin D is touted as a cure, based on little more than the observation that the residents of Italy and Spain are thought to have relatively low levels of vitamin D and were initially hard hit by COVID-19. We have theory without data when Donald Trump touts hydroxychloroquine as a game changer when it is, in fact, largely untested and potentially dangerous.

Even more ominous is the rampant distrust of science. Many people gather in public places without masks or separation because they think they know better than respected immunologists. Some spread preposterous conspiracy theories that the virus was created and unleashed by Anthony Fauci or Bill Gates for some nefarious purpose.

Pseudoscience based on data without theory and theory without data undermines the credibility of real science, which is the key to human progress. If we are to continue to advance, we need both theory and data. We need real scientists who propose plausible theories that are rigorously tested with relevant data. And we need to understand and appreciate the difference between real science and pseudoscience—that theory without data and data without theory are not real science.

You may also enjoy these articles by Gary Smith on how to understand claims about data:

Data mining: A plague, not a cure: It is tempting to believe that patterns are unusual and their discovery meaningful; in large data sets, patterns are inevitable and generally meaningless.


Love math and computers? Whoops, love can be blind. The Great Recession shows why we can’t afford to take computers’ output at face value.

Gary N. Smith

Senior Fellow, Walter Bradley Center for Natural and Artificial Intelligence
Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. His research on financial markets, statistical reasoning, and artificial intelligence, which often involves stock market anomalies, statistical fallacies, and the misuse of data, has been widely cited. He is the author of The AI Delusion (Oxford, 2018) and co-author (with Jay Cordes) of The Phantom Pattern (Oxford, 2020) and The 9 Pitfalls of Data Science (Oxford, 2019). Pitfalls won the Association of American Publishers 2020 Prose Award for “Popular Science & Popular Mathematics”.
