Mind Matters Natural and Artificial Intelligence News and Analysis

The LLM Deep Fake—Follow the Money

Hype sells. Just look at what's happening with AI

Large language models (LLMs) are far more impressive now than they were just a few months or a year ago. Compare these exchanges I had with GPT-3 in June 2022 and with ChatGPT two days ago:

Gary:          Can I improve my test scores if I study after taking the test?

GPT-3:        That’s a good question. I’m not sure if you can improve your test scores by studying after taking the test, but it couldn’t hurt to try!

ChatGPT:   Studying after taking a test is unlikely to directly improve your scores on that particular test since the test has already been completed. Once a test is finished, the opportunity to answer questions and provide responses has passed.

However, studying after taking a test can still have benefits. It allows you to review the material covered in the test, identify areas of weakness or misunderstanding, and reinforce your knowledge for future assessments or exams on the same topic. It can help you gain a deeper understanding of the subject matter and improve your overall comprehension and retention.

To specifically improve your test scores, it is more effective to focus on studying and preparing before the test, rather than after it has been taken.

ChatGPT’s response was much better, though unnecessarily verbose.

Still Just Text Predictors

Such improvements have convinced many that LLMs are close to being, or already are, smarter than humans. The reality is that LLMs are just text predictors. They do not attempt to understand, and cannot understand, how the words they input and output relate to the real world. They consequently flop at tasks that require real understanding.

For example, this past April, I asked ChatGPT this classic logic puzzle, but I substituted a red/green card for the usual red/blue card:

Gary:          Suppose there is a wooden box with three cards inside. One card is red on both sides; one card is blue on both sides; and the third card is red on one side and green on the other side. If you pick a card randomly and look at a side randomly and see that it is red, what is the probability that the other side of the card is green?

The correct answer is 1/3 because there are three equally likely red sides that might be chosen, one of which has a green backside.
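The counting argument can be checked by listing every equally likely (card, visible side) outcome. The following is a minimal Python sketch, not part of the original exchange; the function name `prob_other_side` and the tuple encoding of the cards are assumptions made for illustration.

```python
from fractions import Fraction

def prob_other_side(cards, seen, hidden):
    """Exact P(hidden side is `hidden` | visible side is `seen`).

    Each card is a (side_a, side_b) tuple. Picking a card uniformly and then
    a side uniformly makes every (card, side) pair equally likely.
    """
    matches = 0  # outcomes where the visible side is `seen`
    hits = 0     # ...and the hidden side is `hidden`
    for side_a, side_b in cards:
        # Either side of the card may be the one facing up.
        for visible, other in ((side_a, side_b), (side_b, side_a)):
            if visible == seen:
                matches += 1
                if other == hidden:
                    hits += 1
    if matches == 0:
        raise ValueError("the observed color never appears on any card")
    return Fraction(hits, matches)

cards = [("red", "red"), ("blue", "blue"), ("red", "green")]
print(prob_other_side(cards, "red", "green"))  # 1/3
```

Changing the colors or adding cards only requires editing the `cards` list; the same enumeration applies.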

Not having this red/green variation in the text it was trained on, ChatGPT floundered badly. In April, it gave a 415-word, illogical, and contradictory response with a nonsense answer of 0%. Two days ago, I tried again to see if ChatGPT’s human handlers had trained it to give the correct answer. Nope. I asked the question three times and got three different long-winded, incorrect answers: 1, 2/3, and 1/2. The answer of 1 (that is, 100%) is representative:

ChatGPT:   To determine the probability that the other side of the card is green given that you randomly picked a card, looked at a side, and observed it to be red, we can use conditional probability.

Let’s analyze the three possible scenarios:

Red-Red Card: There is only one card with red on both sides. If you picked this card, the other side must also be red. This scenario is not applicable since you observed a red side.

Blue-Blue Card: There is only one card with blue on both sides. If you picked this card, the other side would be blue. This scenario is not applicable since you observed a red side.

Red-Green Card: There is one card with red on one side and green on the other side. If you picked this card, there are two possibilities: you either saw the red side (which occurred) or the green side (which did not occur). Since you observed a red side, the other side must be green.

Considering only the applicable scenario, the probability that the other side of the card is green, given that you observed a red side, is 1 (or 100%).

Therefore, the probability that the other side of the card is green is 1 or 100% in this particular situation.

At some point, ChatGPT will be tweaked to give the correct answer to this question, but its immutable inadequacies will persist and can be exposed easily by asking novel questions that require an understanding of what words mean. Here, change the colors of the cards or the number of cards. In the well-known Monty Hall problem, change the number of doors or the number of doors with desirable prizes behind them.
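For the Monty Hall variations, the chance of winning by switching can be computed exactly. The sketch below is illustrative only and assumes the standard rules: the host knows where the prizes are, always opens prize-free doors from among those the contestant did not pick, and the contestant then switches to a uniformly chosen unopened door. The function name and its parameters are hypothetical.

```python
from fractions import Fraction

def switch_win_probability(doors, prizes, opened=1):
    """Exact chance of winning by switching in a generalized Monty Hall game.

    Assumes the host opens `opened` prize-free doors among the doors the
    contestant did not pick, and the contestant switches at random.
    """
    # The host must be able to open `opened` empty doors even when the
    # contestant's first pick is itself empty.
    if prizes < 1 or opened < 0 or opened > doors - prizes - 1:
        raise ValueError("invalid combination of doors, prizes, and opened")
    remaining = doors - 1 - opened  # unopened doors available to switch to
    p_first_prize = Fraction(prizes, doors)
    # If the first pick hid a prize, one fewer prize remains elsewhere.
    win_if_first_prize = Fraction(prizes - 1, remaining)
    win_if_first_empty = Fraction(prizes, remaining)
    return (p_first_prize * win_if_first_prize
            + (1 - p_first_prize) * win_if_first_empty)

print(switch_win_probability(3, 1))  # 2/3, the classic game
print(switch_win_probability(4, 1))  # 3/8
print(switch_win_probability(4, 2))  # 3/4
```

Under the same assumptions, staying with the first pick wins with probability prizes/doors, so these values can be compared with 1/3, 1/4, and 1/2 respectively.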

These limitations are hardly unique to LLMs. All so-called deep learning AI systems are just pattern finders, doing what Turing Award winner Judea Pearl called “just curve fitting.” They can find statistical patterns in large databases far faster and more efficiently than humans can, but they lack the capabilities humans have to assess whether the discovered patterns are causal or coincidental.

The limitations of ChatGPT, other LLMs, and AI systems generally are hardly a secret. On May 24, Yann LeCun, Meta’s chief AI scientist, gave a talk at Northeastern University where he said that “machine learning sucks” in that current AI systems are specialized and brittle, make “stupid” mistakes, and do not reason and plan, in contrast to animals and humans, who can learn new tasks very quickly, understand how the world works, and can reason and plan. Nor will these problems be fixed by training LLMs on ever larger databases. On April 14, Sam Altman, the CEO of OpenAI, which created ChatGPT, said that scaling up was likely to yield diminishing returns: “I think we’re at the end of the era where it’s gonna be these giant models.”

The enduring problem is that computers do not know how words and other data relate to the real world. We are not close to a solution.

Don’t Trust AI to Make Your Decisions

Even so, on March 22, the Future of Life Institute released a petition signed by thousands of tech people that began, “AI systems with human-competitive intelligence can pose profound risks to society and humanity, as shown by extensive research and acknowledged by top AI labs.” It went on to call on “all AI labs to immediately pause for at least 6 months the training of AI systems more powerful than GPT-4.”

On May 30, thousands of tech people, including Sam Altman and Bill Gates, signed a one-sentence statement prepared by the Center for AI Safety warning that, “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”

There are three compelling reasons for the enormous disconnect between such dire warnings and the actual capabilities of LLMs and other AI systems. As the popular saying goes, “Follow the money.”

First, one thing LLMs do really well is generate disinformation. The firehose of falsehoods unleashed by Russia and others with ill intent is about to become a tsunami of deceit. Social media platforms profit from advertisement-related bots and from the engagement created by malignant content, and they are happy to have government regulators distracted by fears of extinction from super-intelligent AI.

Second, companies want money to finance their research and they want money from selling their products. By making AI seem far more powerful than it really is, they can get both. Startups are using AI hype to raise money. Established companies are using AI hype to win contracts and sell products. The more powerful AI is perceived to be, the more profitable companies will be.

Third, as I have been saying for some time, the real danger today is not that computers are smarter than us but that we think computers are smarter than us and consequently trust them to make decisions they should not be trusted to make. Fretting about hypothetical long-run existential threats posed by super-intelligent computers diverts attention away from the very real and immediate risks of relying on unreliable and discriminatory black-box algorithms that are being used to make hiring decisions, approve loans, price insurance, determine prison sentences, and so much more. Companies that are fattening their bottom lines by using such algorithms in place of intelligent humans welcome the distraction.

Gary N. Smith

Senior Fellow, Walter Bradley Center for Natural and Artificial Intelligence
Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. His research on financial markets, statistical reasoning, and artificial intelligence, which often involves stock market anomalies, statistical fallacies, and the misuse of data, has been widely cited. He is the author of dozens of research articles and 16 books, most recently The Power of Modern Value Investing: Beyond Indexing, Algos, and Alpha, co-authored with Margaret Smith (Palgrave Macmillan, 2023).
