
Computers Still Do Not “Understand”

Don't be seduced into attributing human traits to computers.

The subtitle of a recent New Yorker article was: “Geoffrey Hinton has spent a lifetime teaching computers to learn. Now he worries that artificial brains are better than ours.” I respectfully disagree. As I’ve repeatedly argued, the real danger today is not that computers are smarter than us but that we think computers are smarter than us.

Hinton is extremely intelligent, but he is not the first, and will not be the last, extremely intelligent person to be seduced by a full-blown Eliza effect, attributing human traits to computers.

Consider Hinton’s argument about large language models (LLMs):

People say, “It’s just glorified autocomplete” . . . Now, let’s analyze that. Suppose you want to be really good at predicting the next word. If you want to be really good, you have to understand what’s being said. That’s the only way. So by training something to be really good at predicting the next word, you’re actually forcing it to understand. Yes, it’s ‘autocomplete’—but you didn’t think through what it means to have a really good autocomplete.

May as well assume what you are trying to prove! The magical ability of LLMs to autocomplete sentences is, in fact, ample evidence that LLMs do not need to understand words in order to generate convincing text.

It is easy to demonstrate that LLMs do not understand words in any meaningful sense of the word “understand.” Consider this classic 3-card probability question:

A box contains 3 cards. One card is red on both sides, one card is green on both sides, and one card is red on one side and green on the other. One card is selected from the box at random, and the color on one side is observed. If this side is green, what is the probability that the other side is also green?

It is tempting to think that seeing the green side rules out the double-red card, leaving two equally likely possibilities — green-green and red-green — so the probability is 1/2. However, we need to take into account the fact that the green-green card gives two ways to see a green side. One way to answer this question correctly is to recognize that there are six equally likely sides that might be chosen — three red and three green. If a green side is selected, it is twice as likely to be from the green-green card. The correct probability is 2/3.
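
If you would rather count than reason it out, a few lines of Python can check the logic by brute force. The following snippet is just a quick illustrative sketch (not output from any chatbot): it lists the six equally likely sides and conditions on the side shown being green.

from fractions import Fraction

# The three cards, each written as its two sides.
cards = [("red", "red"), ("green", "green"), ("red", "green")]

# Choosing a card at random and looking at a random side makes all six
# (card, side) pairs equally likely.
sides = [(card, i) for card in cards for i in (0, 1)]

# Keep only the cases where the side we see is green, then ask how often
# the hidden side is green too.
green_showing = [(card, i) for card, i in sides if card[i] == "green"]
both_green = [(card, i) for card, i in green_showing if card[1 - i] == "green"]

print(Fraction(len(both_green), len(green_showing)))   # prints 2/3

Of the three green sides that might be showing, two belong to the green-green card, which is all this calculation says.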

Please think about this problem until you are confident that you understand the logic. There will be a test!

Got Logic?

LLMs have surely seen the correct answer to this problem in many places on the Internet. If they truly understood it, they could apply the correct logic to similar problems.

To see if this is the case, I posed this simple variation, with the red-red card omitted:

A box contains 2 cards. One card is green on both sides and one card is red on one side and green on the other. One card is selected from the box at random, and the color on one side is observed. If this side is green, what is the probability that the other side is also green?

I ask you to take a moment and apply your understanding of the three-card problem to this two-card problem and see if you answer it correctly.

Ready? Here’s the answer. There are now four equally likely sides that might be chosen — one red and three green. If a green side is selected, there is a 2/3 probability that it came from the green-green card. Did you get it correct?
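
If you doubt the counting argument, a brute-force simulation settles it. Here is another quick sketch (again, nothing any chatbot produced) that draws a card and a side at random a million times and tallies how often the hidden side is green when the visible side is green.

import random

# The two cards in the variation: green-green and red-green.
cards = [("green", "green"), ("red", "green")]

trials = 1_000_000
saw_green = 0       # times the visible side was green
hidden_green = 0    # of those, times the hidden side was also green

for _ in range(trials):
    card = random.choice(cards)    # pick a card at random
    side = random.randrange(2)     # pick which side is observed
    if card[side] == "green":
        saw_green += 1
        if card[1 - side] == "green":
            hidden_green += 1

print(hidden_green / saw_green)    # hovers around 0.667, not 0.5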

Now let’s see how three popular LLMs did. I first tried Microsoft’s Bing program. It gave the correct 2/3 answer, but its long-winded explanation was for the classic three-card problem instead of the two-card problem I had actually asked about!

You’ve presented a classic probability puzzle known as the “Three Cards Problem.” Let’s delve into the solution step by step, as you’ve outlined:

  1. Card A: Both sides are green. If we pick this card, the other side will definitely be green.
  2. Card B: One side is red, and the other side is green. If we pick this card, there are two possibilities:
    • We see the green side (probability of 1/2).
    • We see the red side (also probability of 1/2).
    • If we see the green side, the other side will be green.
    • If we see the red side, the other side will be red.
  3. Card C: Both sides are red. If we pick this card, the other side will definitely be red.

Now, let’s consider the favorable outcomes:

  • Out of the three cards, only Card A and Card B have a green side.
  • If we see a green side, there are two ways it could happen: either from Card A or Card B.

Therefore, the probability that the other side is also green, given that we see a green side, is:

So, indeed, the answer is 2/3! Although it might appear counterintuitive initially, this result emerges from conditional probability, carefully considering the different cards and their sides.

Autocomplete worked in that all of the words are assembled into correct English, and yet Bing clearly did not understand the words, since it gave a three-card answer to a two-card question.

Notice, also, the nonsense in the section on Card B. If Bing understood the words it had just generated (“One side is red, and the other side is green”), it would not claim that

  • If we see the green side, the other side will be green.
  • If we see the red side, the other side will be red.

I also posed this question to OpenAI’s ChatGPT and Google’s Bard with Gemini Pro. They both gave an incorrect answer of 1/2.

Bard was helpful enough to show some Python code to “prove” that 1/2 is the correct answer:

Here is the Python code that implements this solution:


# Define the possible outcomes
outcomes = {
    "green-green": {"both_sides": "green", "probability": 0.5},
    "red-green": {"both_sides": "red", "probability": 0.5},
}

# Calculate the probability of the other side being green given seeing green
green_probability = sum(outcome["probability"] for outcome in outcomes.values() if outcome["both_sides"] == "green")

print(f"Probability of other side being green given seeing green: {green_probability}")


This code outputs the following:

Probability of other side being green given seeing green: 0.5

Someone who is unfamiliar with Python might think that this inscrutable code is further proof of Bard’s intelligence. Not only can Bard understand words, it can write code to prove that it understands words. However, someone who knows Python would recognize this code as useless. It basically says that if we assume that the probability of selecting the green-green card is 1/2, then the probability of selecting a card where both sides are green is 1/2.

Bard Doesn’t Understand Words

This pointless code is further evidence that Bard does not understand words. It does not understand the question and it does not understand why this code is useless.
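
For contrast, here is roughly what code that actually models the two-card question has to do. This is a rough illustrative sketch, not anything Bard produced: list the four equally likely card-and-side outcomes and then condition on the visible side being green.

# The four equally likely outcomes: which card is drawn and which side is shown.
outcomes = [
    {"card": "green-green", "shown": "green", "hidden": "green", "probability": 0.25},
    {"card": "green-green", "shown": "green", "hidden": "green", "probability": 0.25},
    {"card": "red-green",   "shown": "green", "hidden": "red",   "probability": 0.25},
    {"card": "red-green",   "shown": "red",   "hidden": "green", "probability": 0.25},
]

# Condition on seeing a green side.
p_see_green = sum(o["probability"] for o in outcomes if o["shown"] == "green")
p_green_and_hidden_green = sum(
    o["probability"] for o in outcomes if o["shown"] == "green" and o["hidden"] == "green"
)

print(p_green_and_hidden_green / p_see_green)   # 0.666..., that is, 2/3

The difference is the conditioning step: Bard's code never asks what was observed, so it can do nothing but restate its own assumption.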

I also found it interesting that ChatGPT and Bing (which uses ChatGPT) did not give identical answers. The bottom line of ChatGPT’s verbose answer was:

So, out of the times you observe a green side (either from the Green-Green card or the Red-Green card), the probability that the other side is also green is 1/2, as there’s a 1/2 chance of picking either card, but only the Green-Green card has both sides green.

The words are again strung together in perfectly coherent English, creating a tempting illusion of understanding. But, if ChatGPT understood the questions it had been trained on, it would not give this incorrect answer.

After watching a severely edited demonstration of the power of Bard with Gemini, TED organizer Chris Anderson tweeted,

I can’t stop thinking about the implications of this demo. Surely it’s not crazy to think that sometime next year, a fledgling Gemini 2.0 could attend a board meeting, read the briefing docs, look at the slides, listen to everyone’s words, and make intelligent contributions to the issues debated?

I can’t stop thinking about this nightmare either. Imagine people, businesses, and governments making decisions that are influenced by an LLM that does not understand the meaning of any of the words it inputs and outputs. The problem is not that computers are smarter than us but that they are still dumber than us.


Gary N. Smith

Senior Fellow, Walter Bradley Center for Natural and Artificial Intelligence
Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. His research on financial markets, statistical reasoning, and artificial intelligence often involves stock market anomalies, statistical fallacies, and the misuse of data, and it has been widely cited. He is the author of dozens of research articles and 16 books, most recently The Power of Modern Value Investing: Beyond Indexing, Algos, and Alpha, co-authored with Margaret Smith (Palgrave Macmillan, 2023).
