On March 22, nearly 2,000 people signed an open letter drafted by the Future of Life Institute (FLI) calling for a pause of at least 6 months in the development of large language models (LLMs):
Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization?
FLI is a nonprofit organization concerned with the existential risks posed by artificial intelligence. Its president is Max Tegmark, an MIT professor who is no stranger to hype. In December 2020, Tegmark made headlines with reports that he and an MIT student had created a neural-network algorithm, dubbed AI Feynman, that rediscovered 100 physics equations from the legendary textbook, The Feynman Lectures on Physics. Tegmark gushed that, “We’re hoping to discover all kinds of new laws of physics. We’ve already shown that it can rediscover laws of physics.” The reality was quite different. AI Feynman didn’t actually rediscover any laws of physics. All it did was curve-fit data generated by each of the 100 equations.
The fears expressed by the recent open letter are also grossly exaggerated. Indeed, it seems very much like a fake-it-til-you-make-it publicity stunt that is intended to persuade customers, businesses, and investors that AI is far more powerful than it really is.
We agree with the letter’s first enumerated fear: ChatGPT and other LLMs will flood the internet with a firehose of falsehoods. The only possible good that can come of this tsunami of disinformation is that people will stop believing what they see on the internet. Our concern here is with the second fear: jobs are about to be automated away.
The idea that machines will soon be capable of doing the work of humans has been with us for more than 50 years, including Nobel laureate Herbert Simon’s 1965 prediction that “machines will be capable, within 20 years, of doing any work a man can do.” More recently we have Daniel Susskind’s 2020 award-winning book, A World Without Work. We have written elsewhere about how fears of job-killing robots are overblown.
One study we analyzed was a 2016 Oxford University report that used the U.S. Department of Labor’s O*NET database to assess the importance of various skill competencies for hundreds of occupations. For example, on a scale of 0 to 100, the importance of finger dexterity was considered to be 81 for dentists, 72 for locksmiths, and 60 for barbers. The Oxford researchers then coded each of 70 occupations as either automatable or not and correlated these yes/no assessments with O*NET’s scores for nine skill categories. Using these statistical correlations, the researchers then estimated the probability of computerization for 702 occupations. The whole exercise was fanciful, and we concluded that, “you may be in for a surprise if you trust a robot to cut your hair simply because it can open and close scissors.”
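The Oxford pipeline amounts to a straightforward classification exercise: fit a model on a small hand-labeled set, then extrapolate probabilities to everything else. The sketch below is ours, not the report’s — the skill scores and labels are invented, and we use a plain logistic regression where the report itself used a more elaborate classifier:

```python
import numpy as np

# Hypothetical illustration of the Oxford-style approach: 70 hand-labeled
# occupations (1 = automatable, 0 = not), each with nine O*NET-style
# skill-importance scores on a 0-100 scale. All numbers are invented.
rng = np.random.default_rng(0)
X_labeled = rng.uniform(0, 100, size=(70, 9))
# Invented labeling rule: low "finger dexterity" (column 0) -> automatable.
y_labeled = (X_labeled[:, 0] < 50).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit a logistic regression by plain gradient descent.
Xb = np.hstack([np.ones((70, 1)), X_labeled / 100.0])  # intercept + rescaling
w = np.zeros(Xb.shape[1])
for _ in range(5000):
    p = sigmoid(Xb @ w)
    w -= 0.5 * Xb.T @ (p - y_labeled) / len(y_labeled)

# Extrapolate "probability of computerization" to all 702 occupations.
X_all = rng.uniform(0, 100, size=(702, 9))
Xb_all = np.hstack([np.ones((702, 1)), X_all / 100.0])
prob_computerization = sigmoid(Xb_all @ w)
print(prob_computerization.shape)  # (702,)
```

The sketch makes the weakness visible: the fitted probabilities can only reflect whatever the nine measured skill scores capture, so any skill the database omits — or measures badly — is invisible to the extrapolation.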
The recent hullabaloo over ChatGPT has revived fears of a computer takeover of human jobs. A recent study by researchers at OpenAI, OpenResearch, and the University of Pennsylvania concluded that “around 80% of the U.S. workforce could have at least 10% of their work tasks affected by the introduction of LLMs, while approximately 19% of workers may see at least 50% of their tasks impacted.”
Even though their focus is on LLMs, which are mere text generators, they use the same inadequate O*NET database that was used by the Oxford researchers in 2016. As the quip often attributed to Albert Einstein goes, insanity is doing the same thing over and over again and expecting different results.
The new study uses the O*NET database of information on 1,016 occupations, involving 19,265 tasks, and 2,087 Detailed Work Activities (DWAs) that are associated with tasks. Whew! Humans and LLMs(!) were used to assess whether an LLM or LLM-powered software can “reduce the time required for a human to perform a specific DWA or complete a task by at least 50 percent.” The linkages between occupations and tasks and DWAs were then used to assess the potential impact of LLMs on occupations.
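The roll-up from task labels to headline percentages is simple arithmetic. The sketch below is our own toy version — the occupations, tasks, and exposure labels are invented, and the real study scores 19,265 tasks across 1,016 occupations:

```python
# Hypothetical illustration: each occupation maps to tasks labeled
# 1 if an LLM could cut the time to do that task by at least 50%,
# 0 otherwise. All occupations and labels below are invented.
occupation_tasks = {
    "tax preparer":  [1, 1, 1, 0],
    "barber":        [0, 0, 1, 0],
    "mathematician": [1, 1, 0, 1],
    "locksmith":     [0, 0, 0, 0],
}

# Occupation-level exposure = share of its tasks labeled exposed.
exposure = {
    occ: sum(labels) / len(labels)
    for occ, labels in occupation_tasks.items()
}

# Headline figures: share of occupations with >= 10% and >= 50%
# of tasks exposed (the study weights these by employment).
share_10 = sum(e >= 0.10 for e in exposure.values()) / len(exposure)
share_50 = sum(e >= 0.50 for e in exposure.values()) / len(exposure)
print(exposure)          # {'tax preparer': 0.75, 'barber': 0.25, ...}
print(share_10, share_50)  # 0.75 0.5
```

Note that the headline numbers depend entirely on the subjective 0/1 labels feeding the arithmetic — the aggregation itself adds no information.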
To their credit, they admit that,
A fundamental limitation of our approach lies in the subjectivity of the labeling. In our study, we employ annotators who are familiar with LLM capabilities. However, this group is not occupationally diverse, potentially leading to biased judgments regarding LLMs’ reliability and effectiveness in performing tasks within unfamiliar occupations.
That is certainly a limitation!
Another fundamental limitation, which we noted in our earlier critique of the Oxford study, is that, “Some important skills are difficult to measure; others may be overlooked. For example, a robot with excellent finger dexterity won’t be a good dentist if its image-recognition software can’t recognize cavities infallibly.”
This new study’s conclusions confirm our fears. As with the Oxford study, barbers are considered at risk—but now from LLMs, finger dexterity be damned! This is hardly the only surprising result. The authors assert that occupations requiring critical thinking skills “are less likely to be impacted by current LLMs” but nonetheless conclude that, grouping occupations by required educational level, jobs that require master’s degrees or higher have the highest risk of being downsized. We doubt that LLMs are about to replace doctors and lawyers. We hope they are not about to replace professors—at least those professors who help students develop critical thinking abilities.
The report specifically identifies doctors, lawyers, mathematicians, financial quantitative analysts, central bank monetary authorities, managers of companies and enterprises, and tax preparers as among the occupations most at risk. What a wonderful example of “biased judgments regarding LLMs’ reliability and effectiveness in performing tasks within unfamiliar occupations.” It may be news to these researchers but each of these occupations does require substantial critical thinking skills. It was recently reported, for example, that when ChatGPT was asked tax questions posted on TaxBuzz’s tax practitioner technical support forum, the LLM got every single question wrong.
The Problem With “Jobs-at-Risk” Studies
There are many problems with jobs-at-risk studies. One is that specific skills are parts of jobs, not the jobs themselves. Workers do a variety of tasks, often in conjunction with other workers, and identifiable skills may be just a small part of the overall job. Another problem is that these studies neglect the cost-benefit calculations made by employers. Technology is more likely to replace workers when the benefits are clearly much larger than the costs. LLMs may seem relatively cheap to operate, but the hidden costs are the consequences of the mistakes that LLMs are prone to make. Making a bad movie recommendation is one thing. Getting in trouble with the IRS, losing a winnable legal case, making an incorrect medical diagnosis, or causing an avoidable recession with a mistaken monetary policy is something else. Sam Altman, CEO of OpenAI, the creator of ChatGPT, recently tweeted that “ChatGPT is incredibly limited, but good enough at some things to create a misleading impression of greatness. It’s a mistake to be relying on it for anything important right now.”
Critical thinking is the key. LLMs are text generators—nothing more. They are astonishingly good at that but they are not designed to assess the veracity of the text they input and output. Nor are they intended to use any sort of critical thinking, which is why jobs that require critical thinking are not about to be replaced by LLMs.