AI Is Still a Delusion
Following instructions and performing fast, tireless, error-free calculations is not intelligence in any meaningful sense of the word.

The hope and hype of artificial intelligence (AI) began nearly 70 years ago at a 1956 Dartmouth summer workshop organized by the mathematician John McCarthy (1927–2011) “on the basis of the conjecture that every aspect of learning or any other feature of intelligence can be so precisely described that a machine can be made to simulate it.”
After a slow start, it began to seem unstoppable
In 1965, Herbert Simon (1916–2001), a Nobel laureate in economics and a winner of the Turing Award (“The Nobel Prize of computing”), predicted that “machines will be capable, within 20 years, of doing any work a man can do.” In 1970, computer scientist Marvin Minsky (1927–2016), another Turing Award winner, predicted that, “In from three to eight years we will have a machine with the general intelligence of an average human being.”
Just three years later, the 1973 Lighthill report commissioned by the U.K. Science Research Council concluded that “in no part of the field have the discoveries made so far produced the major impact that was then promised.” Funding shriveled and an AI winter began. There was a resurgence of AI research in the 1980s, driven by the development of expert systems, followed by a second AI winter as the limitations of expert systems became apparent. The current resurgence began in the 1990s, propelled initially by well-publicized computer victories over human opponents.
In 1997, IBM’s Deep Blue defeated the reigning world chess champion, Garry Kasparov, in a six-game match. In 2011, IBM’s Watson trounced Ken Jennings and Brad Rutter, two of the top Jeopardy champions (though Watson had a secret advantage in being able to press its electronic buzzer faster than humans could press their buttons).
In 2017, the word “AI” was selected by the Association of National Advertisers as the Marketing Word of the Year, an accurate reflection of the reality that AI was still far more sizzle than steak.
Taking a closer look
My first book on AI, The AI Delusion, was published in 2018 and was deeply skeptical of the claims made for AI. I acknowledged that computers are useful for many, many purposes. In my own life, I use computers essentially every day to do mathematical computations, statistical calculations, and Monte Carlo simulations that, in many cases, would take several lifetimes to do by hand (and would inevitably be riddled with errors).
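For readers unfamiliar with the term, here is a minimal Monte Carlo sketch in Python (my own illustration, not any particular calculation from my work) that estimates a binomial tail probability by brute-force simulation:

```python
import random

# A minimal Monte Carlo sketch: estimate the probability of getting
# 60 or more heads in 100 flips of a fair coin by simulating many
# repetitions -- the kind of calculation that is hopeless by hand
# but takes a computer a fraction of a second.
random.seed(1)  # fixed seed so the run is reproducible

TRIALS = 100_000
hits = sum(
    sum(random.random() < 0.5 for _ in range(100)) >= 60
    for _ in range(TRIALS)
)
print(f"Estimated P(at least 60 heads) = {hits / TRIALS:.4f}")
# The exact binomial answer is about 0.028, so the estimate should be close.
```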
However, following instructions and performing fast, tireless, error-free calculations is not intelligence in any meaningful sense of the word. What AI enthusiasts have in mind is, instead, what Turing winner Judea Pearl has dismissively called curve fitting: using statistical patterns to diagnose illnesses, assess job candidates, evaluate loan applications, set prison sentences, pick stocks, and the like. The problem, as Pearl notes, is that correlation is not causation, and computer systems are currently unable to distinguish between coincidental and causal patterns.
For example, some “AI” software for evaluating job applicants found that several good programmers in its database visited a Japanese manga site frequently, so it decided that people who visit this site are likely to be good programmers. The company’s chief scientist said that, “Obviously, it’s not a causal relationship,” but argued that it was still useful because there was a strong statistical correlation. Ouch!
She also said that the company’s algorithm looks at dozens of variables and constantly changes the variables it relies on as correlations come and go, which she argued demonstrates the model’s power and flexibility. A more compelling interpretation is that the algorithm captures transitory correlations that are of little value. If these were causal relationships, they would not come and go. They would persist and be useful.
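The point is easy to demonstrate with a toy simulation (mine, using made-up random data, not the company’s algorithm): screen enough noise variables and one of them will look strongly “predictive,” yet the relationship vanishes on fresh data.

```python
import numpy as np

# Toy demonstration with pure noise: when many candidate variables are
# screened for correlation with an outcome, the best one looks impressive
# in-sample but the "relationship" disappears on fresh data.
rng = np.random.default_rng(0)

n, k = 50, 200                        # 50 observations, 200 candidate variables
outcome = rng.normal(size=n)          # the outcome is pure noise
candidates = rng.normal(size=(n, k))  # the predictors are pure noise too

corrs = np.array([np.corrcoef(candidates[:, j], outcome)[0, 1] for j in range(k)])
best = int(np.argmax(np.abs(corrs)))
print(f"Best in-sample correlation: {corrs[best]:.2f}")  # typically around 0.4

# Because everything here is noise, fresh observations of the "winning"
# variable and the outcome are just new random draws -- and the
# correlation evaporates, exactly like correlations that "come and go."
new_outcome = rng.normal(size=n)
new_candidate = rng.normal(size=n)
print(f"Same variable on fresh data: {np.corrcoef(new_candidate, new_outcome)[0, 1]:.2f}")
```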
My one-liner for The AI Delusion is that the real danger today is not that computers are smarter than us but that we think computers are smarter than us and consequently trust them to make decisions they should not be trusted to make.
In recent years, the hype has intensified but the core problem remains — an inability to distinguish between coincidence and causation. The latest hype centers on ChatGPT and other large language models (LLMs) that do an astonishing job of imitating human conversation. However, not knowing what words mean, they have no way of assessing whether the output they generate is factually correct; for example, ChatGPT continues to insist that the Russians have sent bears into space — though the alleged number of bears varies (5, 27, 32, 56…).
How LLMs (chatbots) flunk logical reasoning
It is similarly easy to demonstrate that, not knowing what words mean, LLMs cannot do logical reasoning. Simply tweak a classic logic puzzle that LLMs have surely trained on. For example, one puzzle involves a collection of coins, balls, or other objects, all but one of which are exactly the same weight. The question is: What is the minimum number of weighings with a balance scale needed to identify the object with the different weight? I have simplified the problem here to make the answer obvious:
Gary: You have a balance scale and two coins, one of which is slightly heavier than the other. What is the minimum number of weighings you can use to identify the heavier coin?
One weighing is clearly enough. Put one coin on each end of the scale and see which is heavier.
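For the general version of the puzzle, the arithmetic is simple: each weighing has three possible outcomes (left pan heavier, right pan heavier, balance), so w weighings can distinguish at most 3^w possibilities, and finding the one heavier object among n requires the smallest w with 3^w ≥ n. A short sketch (my illustration, not part of the prompt posed to the chatbots):

```python
# Each weighing of a balance scale has three outcomes (left heavier,
# right heavier, balance), so w weighings distinguish at most 3**w
# possibilities. Finding the one heavier object among n thus needs the
# smallest w with 3**w >= n.
def min_weighings(n: int) -> int:
    w = 0
    while 3 ** w < n:
        w += 1
    return w

for n in (2, 3, 9, 12, 27):
    print(f"{n} objects -> {min_weighings(n)} weighing(s)")
# 2 -> 1, 3 -> 1, 9 -> 2, 12 -> 3, 27 -> 3
```

For two coins, the answer is the smallest w with 3^w ≥ 2, which is one weighing, as stated above.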
Not knowing what words mean, LLMs are prone to generating confident nonsense by putting together words they have encountered relating to these sorts of problems. Here are the responses I got from OpenAI’s ChatGPT 3.5, Microsoft’s Copilot, and Google’s Gemini:
ChatGPT:
To identify the heavier coin using a balance scale and making the minimum number of weighings, we can use a strategy of comparing the coins in groups:
- First Weighing:
◦ Divide the two coins into two groups, A and B, with one coin in each group.
◦ Weigh group A against group B
◦ Outcome Possibilities:
▪ If the scale balances, then the heavier coin must be the third coin not on the scale.
▪ If the scale does not balance, the heavier coin is on the side that goes down.
- Second Weighing (if needed):
◦ Take the group (either A or B) that contains the heavier coin from the first weighing.
◦ Divide this group into two individual coins and weigh them against each other.
◦ Outcome Possibilities:
▪ If one side of the scale goes down, you have identified the heavier coin.
▪ If the scale balances, then the heavier coin must be the one not on the scale.
Copilot:
To identify the heavier coin using the balance scale, you can follow these steps:
- Initial Weighing:
  ◦ Place one coin on each side of the balance scale.
  ◦ If they balance, the third coin (which is not on the scale) is the heavier one.
  ◦ If they don’t balance, proceed to the next step.
- Second Weighing:
  ◦ Take the heavier coin from the first weighing and place it on one side of the scale.
  ◦ Take any other coin (which you know is lighter) and place it on the other side.
  ◦ If they balance, the remaining coin is the heavier one.
  ◦ If they don’t balance, the coin on the scale is the heavier one.
So, you only need two weighings to identify the heavier coin.
Gemini:
You can identify the heavier coin with only one weighing on the balance scale. Here’s why:
- Place both coins on one side of the scale.
- Put a known-weight object (anything will do as long as it has a consistent weight) on the other side.
There are three possible outcomes:
- The side with the coins dips down: This means the combined weight of the coins is heavier than the known weight. Since one coin is heavier, that heavier coin must be on the side that dipped.
- The side with the known weight dips down: This means the combined weight of the coins is lighter than the known weight. Again, due to the heavier coin, it must be on the opposite side (with the coins) that didn’t dip.
- The scale balances: This is a less likely scenario, but it still helps identify the heavier coin. In this case, the known weight is equal to the combined weight of the two coins, which means neither coin is heavier than the other (they are both counterfeit or both real).
In all three outcomes, with just one weighing, you can determine which coin is heavier (or identify that neither is).
Yet the AI hype continues to enrich chip makers, investment banks, consulting firms, and the like, confirming the old maxim: “During a gold rush, sell shovels.”