
Why LLMs (chatbots) Won’t Lead to Artificial General Intelligence

The biggest obstacle is seldom discussed: Most consequential real-world decisions involve uncertainty

The shameless exaggerations surrounding large language models (LLMs) would be laughable if LLMs were not diverting so much energy, money, and talent into a seemingly bottomless pit.

In October 2024, Elon Musk declared:

I certainly feel comfortable saying that it’s getting 10 times better per year… I think it will be able to do anything that any human can do possibly within the next year or two.

In November 2024, OpenAI’s Sam Altman predicted the arrival of AGI in 2025. A year earlier, in October 2023, Blaise Agüera y Arcas and Peter Norvig wrote a piece titled “Artificial General Intelligence Is Already Here.”

LLM cheerleaders claim that the effects will be immense. Wharton Professor Ethan Mollick asserted that the productivity gains from LLMs might be larger than the gains from steam power. Sundar Pichai, CEO of Alphabet and Google, proclaimed that LLMs are “more profound than fire,” and Turing Award winner Geoffrey Hinton declared, “I think it’s comparable in scale with the Industrial Revolution or electricity — or maybe the wheel.”

It is now clear that this is self-serving hyperbole. Plain-vanilla LLMs will not lead to AGI because they do not understand the text they input and output or how this text relates to the real world. They cannot distinguish between fact and fiction or between correlation and causation, let alone engage in critical thinking. They are consequently prone to hallucination and flub simple questions. Pre-training on larger and larger databases won’t solve this problem.

A new focus on post-training

This realization has led many researchers to focus on post-training to clean up the mistakes LLMs make. This post-training is reminiscent of the expert-system models that were once thought to be the road to AGI. Expert systems contain domain-specific facts and rules crafted by human experts. Such models can be very useful when applied to well-defined domains with concrete conclusions that can be reached via flowchart branches. For example, TurboTax and other tax-preparation software will usually (but not always) generate correct answers and do so far more efficiently than relying on human tax preparers to complete tax returns.
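
To make the flowchart flavor of such systems concrete, here is a minimal sketch of a rule-based calculation in the spirit of tax software. The thresholds, rates, and function names are hypothetical simplifications added for illustration; they are not actual tax law or any vendor’s code.

```python
# Minimal sketch of an expert-system-style rule chain, in the spirit of
# tax-preparation software. The thresholds and rates below are hypothetical
# simplifications for illustration, not actual tax law.

def must_file(income: float, status: str) -> bool:
    """Hand-coded rule: does this taxpayer need to file a return?"""
    thresholds = {"single": 14_000, "married_joint": 28_000}  # illustrative only
    return income >= thresholds.get(status, 0)

def tax_owed(taxable_income: float) -> float:
    """Flowchart-style branches: apply whichever bracket the income falls into."""
    if taxable_income <= 10_000:
        return taxable_income * 0.10
    if taxable_income <= 40_000:
        return 1_000 + (taxable_income - 10_000) * 0.12
    return 4_600 + (taxable_income - 40_000) * 0.22

if must_file(60_000, "single"):
    print(f"Estimated tax: ${tax_owed(60_000):,.2f}")  # $9,000.00
```

Within its encoded branches, such a system is fast and reliable. But every threshold, rate, and exception must be anticipated and kept current by human experts, which is exactly where the limitations listed below come in.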

However, expert systems have well-known limitations that also constrain the ability of post-training to lead LLMs to AGI:

  • The knowledge base must be correct and, in some domains, updated frequently.
  • The programs can be extremely large and expensive to set up and maintain.
  • Expert systems do not yield reliable answers to questions that were not anticipated.

The biggest obstacle, however, is quite different and seldom discussed.

Most consequential real-world decisions involve uncertainty


Should a presidential campaign spend more money in Ohio or Pennsylvania? Should I buy Apple, Bank of America, or Coca-Cola stock? Should a company hire Andrea, Barry, Chantal, or Dexter? Should I settle this legal case or go to trial? When should I begin collecting Social Security?

Rational decisions require a consideration of the likelihood of various outcomes. But, as we can see from these examples, there are often no objectively correct probabilities that humans or LLMs might calculate from mathematical formulas or estimate from historical data. Instead, we have to rely on subjective probabilities based on our interpretation and weighing of a wide array of information, and try to make decisions that are consistent with these personal beliefs. There is no conceivable way that an LLM, with or without an expert system grafted on, can generate subjective probabilities consistent with user beliefs.

An LLM might ask users to specify probabilities, but that wouldn’t be AGI, would it? In addition, we may find it difficult to specify precise probabilities. Our brains are so amazing and enigmatic that we often cannot articulate precisely how we reached a specific conclusion or made a particular decision.

To take economics as an example, the great British economist John Maynard Keynes (1883–1946) wrote that

the master-economist must possess a rare combination of gifts. He must be mathematician, historian, statesman, philosopher—in some degree. He must understand symbols and speak in words. He must contemplate the particular in terms of the general, and touch abstract and concrete in the same flight of thought. He must study the present in the light of the past for the purposes of the future. No part of man’s nature or his institutions must lie entirely outside his regard.

With or without expert post-training, LLMs cannot blend relevant information and insights from such diverse sources.

Paul Samuelson (1915–2009) wrote about one master economist:

When Robert Adams wrote an MIT thesis on the accuracy of different forecasting methods, he found that ‘being Sumner Slichter’ was apparently one of the best methods known at that time. This was a scientific fact, but a sad scientific fact. For Slichter could not and did not pass on his art to an assistant or to a new generation of economists.

Sumner Slichter’s reasoning could not be boiled down to formulas and flowcharts.

Currently, Ed Yardeni is one of the very best economic forecasters. He puts probabilities on various plausible scenarios, most recently, a 55% chance of a Roaring 2020s, a 25% chance of a Meltup, and a 20% chance of Stagflation. These probabilities were not derived from mathematical equations or estimated from historical data but came from Yardeni’s consideration and assessment of a vast amount of current information, taking into account his long experience in predicting financial markets. When conditions change, he changes his probabilities.
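
To see how such subjective scenario probabilities feed into a decision, consider a minimal sketch. Only the 55/25/20 scenario probabilities come from Yardeni; the stock returns attached to each scenario are hypothetical assumptions added here for illustration.

```python
# Sketch of a decision that weights outcomes by subjective scenario probabilities.
# The 55/25/20 probabilities are Yardeni's published scenario odds; the stock
# returns attached to each scenario are hypothetical, purely for illustration.

scenarios = {
    "Roaring 2020s": {"probability": 0.55, "stock_return": 0.12},
    "Meltup":        {"probability": 0.25, "stock_return": 0.20},
    "Stagflation":   {"probability": 0.20, "stock_return": -0.10},
}

# Expected return is the probability-weighted average across scenarios.
expected_return = sum(s["probability"] * s["stock_return"] for s in scenarios.values())
print(f"Expected stock return: {expected_return:.1%}")  # 9.6%
```

The arithmetic is trivial. What cannot be automated is where the probabilities come from: Yardeni forms them from long experience and a vast amount of current information, and revises them when conditions change.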

We shouldn’t expect LLMs to match Keynes, Slichter, or Yardeni. But we should insist that anything labeled AGI make predictions and recommendations that take into account the wide variety of currently relevant information and that assess the importance and likelihood of plausible events and outcomes the way humans do. LLMs that rely on nothing more than their pre-training obviously cannot do this. Nor is it conceivable that an expert system could anticipate the timely information that should be used to make decisions, let alone weigh uncertainties subjectively and make choices and recommendations that take those uncertainties into account.

Increasing the scale of the pre-training or the extent of the post-training will not give LLMs the cognitive abilities of humans — not this year, not next year, perhaps never.


Gary N. Smith

Senior Fellow, Walter Bradley Center for Natural and Artificial Intelligence
Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. His research on stock market anomalies, statistical fallacies, the misuse of data, and the limitations of AI has been widely cited. He is the author of more than 100 research papers and 18 books, most recently Standard Deviations: The truth about flawed statistics, AI and big data (Duckworth, 2024).
