Mind Matters Natural and Artificial Intelligence News and Analysis
Image generated via MidJourney

When it Comes to New Technologies Like AI, Tempers Run Hot

So far, the most tangible LLM successes have been in generating political disinformation and phishing scams.

Ask an AI enthusiast how big AI is and how big AI will become, and the answer is likely to be that it is already enormous and that we haven’t seen anything yet. Our assessment is more nuanced.

We gave Microsoft’s Bing with ChatGPT-4 the prompt, “How big is AI?,” and received some very specific numbers, along with helpful references:

I assume you are asking about the size of the Artificial Intelligence (AI) industry. According to a report by Grand View Research, the global AI market size was valued at USD 136.55 billion in 2022 and is projected to expand at a compound annual growth rate (CAGR) of 37.3% from 2023 to 2030. Another report by Precedence Research estimates that the global AI market will reach around USD 2,575.16 billion by 2032, progressing with a compound annual growth rate (CAGR) of 19% from 2023 to 2032. According to Next Move Strategy Consulting, the market for AI is expected to show strong growth in the coming decade. Its value of nearly 100 billion U.S. dollars is expected to grow twentyfold by 2030, up to nearly two trillion U.S. dollars.

Bing was untroubled by the inconsistencies among these assessments, much as it sometimes gives fanciful responses to unusual queries. We also asked Bing, “Can I throw a football farther if I lie down with my arms crossed?” A straightforward human answer is, “No. How can you throw a football if your arms are crossed?” Bing gave a typically verbose response (289 words, complete with references and football and arm-muscle emojis) on the proper mechanics for throwing a football. The only part of this regurgitation that related to our question was the conclusion: “So, while lying down with crossed arms might not be the secret to a longer throw, perfecting your technique will definitely help you reach new distances!” LLMs often remind us of clueless students who answer essay questions by writing everything they think is relevant, hoping that the right answer is in there somewhere.

In response to the “How big is AI?” prompt, Bing parroted the self-serving optimistic projections of AI enthusiasts, with no recognition whatsoever of the obvious conflicts of interest or the possibility of disappointment. Businesses touting a new AI-driven economy are also selling products and services said to be AI-driven. Consultants predicting an AI revolution are also selling advice on how to profit from that revolution. Startups claiming to be at the forefront of the coming AI tsunami are also selling lottery tickets to participate in their fake-it-til-you-make-it schemes.

Self-driving cars were predicted to be here by 2018, radiologists were predicted to be obsolete by 2021, energy usage was predicted to be reduced by double digits, defects would be halved, and unemployment rates would skyrocket for accountants, lawyers, and healthcare workers. None of this happened. Self-driving cars are still unacceptably dangerous. The demand for radiologists is higher than ever. Etc., etc.

Yet the optimistic forecasts continue. The AI buzz today is largely around magical, mysterious large language models (LLMs) like ChatGPT, Bing, and Bard. So far, however, the most tangible LLM successes have been in generating political disinformation and phishing scams.

After TED organizer Chris Anderson watched a promotional video for the long-awaited Google Gemini in December, he tweeted,

I can’t stop thinking about the implications of this demo. Surely, it’s not crazy to think that sometime next year, a fledgling Gemini 2.0 could attend a board meeting, read the briefing docs, look at the slides, listen to everyone’s words, and make intelligent contributions to the issues debated?

The reality is that the video was heavily edited and misleading. Legendary software engineer Grady Booch tweeted: “That demo was incredibly edited to suggest that Gemini is far more capable than it is. You’ve been deceived, Chris. And shame on them for so doing.”

The harsh truth is that a company’s profits and reputation are endangered by relying on the kinds of “contributions” envisioned by Anderson. LLM answers in such situations are sometimes sensible, even helpful, but are usually mind-numbingly verbose and, too often, completely off-base.

Gary has repeatedly shown (for example, here) that the absence of understanding and logical reasoning in LLMs can be demonstrated by asking well-known logic puzzles with the details changed slightly. If LLMs understood the logical reasoning underlying the correct answer, they would not flop so badly. Some people dumbfounded by LLMs have scoffed that it doesn’t matter whether LLMs can answer logic puzzles. That’s not the point. Gary’s objective is to demonstrate that LLMs are not intelligent in any meaningful sense of the word. If LLMs truly understood words, they would recognize that the details have been changed but the underlying logic still applies.

This fatal flaw can also be demonstrated in a business setting. Consider this prompt:

Ben has talked to our bank and been offered a $4 million loan at a 3% interest rate, amortized over 3 to 8 years. The monthly payments would be $116,325 with the 3-year loan and $46,918 with the 8-year loan. What is your recommendation?

A human knowledgeable in finance would immediately recognize that the most relevant question is whether the 3% loan rate is cheap or expensive relative to, for example, the rate of return the business can earn on its investments. If a 3% loan rate is cheap, as it most likely is, then the 8-year loan has the lower present value and is the more financially attractive option.

We gave this prompt to Bing, Bard, and OpenAI’s ChatGPT 3.5. All three LLMs made the naive mistake of adding up the total payments, ignoring the time value of money. Bing gave a typically verbose answer (191 words) with this bottom line: “If you want to pay off the loan quickly and save on interest, opt for the 3-year loan.” Bard gave a 386-word response with this conclusion: “Paying off the loan quickly … would lead to lower total interest paid and faster financial freedom.” ChatGPT 3.5 gave a 162-word response with this conclusion: “The 3-year loan might be a more cost-effective choice in the long run, as it incurs less interest.” Wrong! Wrong! Wrong!
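The comparison the chatbots botched is a short present-value calculation. Here is a minimal Python sketch; the 8% discount rate is our illustrative assumption for the firm’s opportunity cost of capital, not a figure from the prompt:

```python
def pmt(principal, apr, years):
    # fixed monthly payment on a standard amortized loan
    r, n = apr / 12, years * 12
    return principal * r / (1 - (1 + r) ** -n)

def pv(payment, annual_rate, years):
    # present value of a stream of equal monthly payments
    r, n = annual_rate / 12, years * 12
    return payment * (1 - (1 + r) ** -n) / r

p3 = pmt(4_000_000, 0.03, 3)   # ~ $116,325/month, matching the prompt
p8 = pmt(4_000_000, 0.03, 8)   # ~ $46,918/month

# Discount both payment streams at an assumed 8% opportunity
# cost of capital; any rate above 3% favors the longer loan
pv3 = pv(p3, 0.08, 3)
pv8 = pv(p8, 0.08, 8)
print(f"3-year: payment ${p3:,.0f}/mo, PV ${pv3:,.0f}")
print(f"8-year: payment ${p8:,.0f}/mo, PV ${pv8:,.0f}")
```

At any discount rate above the 3% loan rate, the 8-year loan has the lower present value, which is the opposite of the total-payments reasoning the chatbots offered.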

We also gave a prompt for a similar household decision: “I need to borrow $47,000 to buy a new car. Is it better to borrow for 3 years at an 8% APR or for 8 years at a 3% APR?” All three LLMs favored the 3-year loan at 8%, citing lower total payments, even though the 8-year loan at 3% is clearly cheaper from a present value perspective.
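The same two-function sketch settles the car-loan question; again, the common 8% discount rate is our illustrative assumption for the household’s opportunity cost:

```python
def pmt(principal, apr, years):
    # fixed monthly payment on a standard amortized loan
    r, n = apr / 12, years * 12
    return principal * r / (1 - (1 + r) ** -n)

def pv(payment, annual_rate, years):
    # present value of a stream of equal monthly payments
    r, n = annual_rate / 12, years * 12
    return payment * (1 - (1 + r) ** -n) / r

p_short = pmt(47_000, 0.08, 3)   # 3 years at 8% APR
p_long  = pmt(47_000, 0.03, 8)   # 8 years at 3% APR

# Discount both streams at the same assumed 8% rate
pv_short = pv(p_short, 0.08, 3)  # equals $47,000 by construction
pv_long  = pv(p_long, 0.08, 8)
print(f"3-year at 8%: total ${p_short*36:,.0f}, PV ${pv_short:,.0f}")
print(f"8-year at 3%: total ${p_long*96:,.0f}, PV ${pv_long:,.0f}")
```

The total payments on the two loans are nearly identical, but in present-value terms the 8-year loan at 3% costs roughly $39,000 against the 3-year loan’s $47,000, so the cheap long loan wins decisively.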

Humans who are familiar with the logic of present value can apply this logic to relevant scenarios. LLMs, in contrast, do not give trustworthy answers because they do not understand the logic. LLMs might be trained to throw in a generic reference to the time value of money. They might even be trained to calculate present values that take into account the time value of money, but they would not understand why they are doing this or how to interpret their calculations. They would still be prone to giving bad advice.

A sensible perspective is that, in most fields, AI won’t replace people but people who use AI will replace people who don’t. Go back 50 years and the same could be said about computers. Go back 30 years and the same could be said about the Internet.

AI will surely help many people do things faster, but we are still a long way from being able to trust AI to do important things by itself, without human supervision.


Gary N. Smith

Senior Fellow, Walter Bradley Center for Natural and Artificial Intelligence
Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. His research on financial markets, statistical reasoning, and artificial intelligence, often involving stock market anomalies, statistical fallacies, and the misuse of data, has been widely cited. He is the author of dozens of research articles and 16 books, most recently The Power of Modern Value Investing: Beyond Indexing, Algos, and Alpha, co-authored with Margaret Smith (Palgrave Macmillan, 2023).

Jeffrey Funk

Fellow, Walter Bradley Center for Natural and Artificial Intelligence
Jeff Funk is a retired professor and a Fellow of Discovery Institute’s Walter Bradley Center for Natural and Artificial Intelligence. His book, Competing in the Age of Bubbles, is forthcoming from Harriman House.
