

Gary Smith


A Modest Proposal for the MLB
Major League Baseball got greedy and needs to reform.
The MLB Coin-Flipping Contest
What are the chances that wild-card teams will make it to the World Series and win?
Blue Zone BS: The Longevity Cluster Myth
We need to be reminded how much real science has done for us and how real science is done.
Confusing Correlation with Causation
Computers are amazing. But they can't distinguish between correlation and causation.
Artificial intelligence (AI) algorithms are terrific at discovering statistical correlations but terrible at distinguishing between correlation and causation. A computer algorithm might find a correlation between how often a person has been in an automobile accident and the words they post on Facebook, being a good software engineer and visiting certain websites, and making loan payments on time and keeping one’s phone fully charged. However, computer algorithms do not know what any of these things are and consequently have no way of determining whether these are causal relationships (and therefore useful predictors) or fleeting coincidences (that are useless predictors). If the program is a black box, then humans cannot intervene and declare that these are almost certainly irrelevant coincidences. Even if …

The LK-99 BS Further Undermines the Credibility of Science
The rejection or distortion of genuine science can have tragic consequences
Sabrina Ionescu’s Hot Hand
When basketball players hit a "streak," does that elevate the probability of success?
Using Data Like a Drunk Uses a Lamppost
Startup companies can be tempted to use statistics for support instead of real illumination
The LLM Deep Fake—Follow the Money
Hype sells. Just look at what's happening with AI
Large language models (LLMs) are far more impressive now than they were just a few months or a year ago. Compare these exchanges I had with GPT-3 in June 2022 and with ChatGPT two days ago:
Gary: Can I improve my test scores if I study after taking the test?
GPT-3: That’s a good question. I’m not sure if you can improve your test scores by studying after taking the test, but it couldn’t hurt to try!
ChatGPT: Studying after taking a test is unlikely to directly improve your scores on that particular test since the test has already been completed. Once a test is finished, the opportunity to answer questions and provide responses has passed. However, studying after taking …

The Death of Peer Review?
Science is built on useful research and thoroughly vetted peer review
Two years ago, I wrote about how peer review has become an example of Goodhart’s law: “When a measure becomes a target, it ceases to be a good measure.” Once scientific accomplishments came to be gauged by the publication of peer-reviewed research papers, peer review ceased to be a good measure of scientific accomplishments. The situation has not improved. One consequence of the pressure to publish is the temptation researchers have to p-hack or HARK. P-hacking occurs when a researcher tortures the data in order to support a desired conclusion. For example, a researcher might look at subsets of the data, discard inconvenient data, or try different model specifications until the desired results are obtained and deemed statistically significant, and therefore …

A World Without Work? Here We Go Again
Large language models still can't replace critical thinking
On March 22, nearly 2,000 people signed an open letter drafted by the Future of Life Institute (FLI) calling for a pause of at least 6 months in the development of large language models (LLMs): “Contemporary AI systems are now becoming human-competitive at general tasks, and we must ask ourselves: Should we let machines flood our information channels with propaganda and untruth? Should we automate away all the jobs, including the fulfilling ones? Should we develop nonhuman minds that might eventually outnumber, outsmart, obsolete and replace us? Should we risk loss of control of our civilization?” FLI is a nonprofit organization concerned with the existential risks posed by artificial intelligence. Its president is Max Tegmark, an MIT professor who is no stranger to hype. …

An Illusion of Emergence, Part 2
A figure can tell a story but, intentionally or unintentionally, the story that is told may be fiction
I recently wrote about how graphs that use logarithms on the horizontal axis can create a misleading impression of the relationship between two variables. The specific example I used was the claim made in a recent paper (with 16 coauthors from Google, Stanford, UNC Chapel Hill, and DeepMind) that scaling up the number of parameters in large language models (LLMs) like ChatGPT can cause “emergence,” which they define as qualitative changes in abilities that are not present in smaller-scale models but are present in large-scale models; thus they cannot be predicted by simply extrapolating the performance improvements on smaller-scale models. They present several graphs similar to this one that seem to show emergence: However, their graphs have the logarithms of …

A Graph Can Tell a Story—Sometimes It’s an Illusion
Mistakes, chicanery, and "chartjunk" can undermine the usefulness of graphs
A picture is said to be worth a thousand words. A graph can be worth a thousand numbers. Graphs are, as Edward Tufte titled his wonderful book, the “visual display of quantitative information.” Graphs should assist our understanding of the data we are using. Graphs can help us identify tendencies, patterns, trends, and relationships. They should display data accurately and encourage viewers to think about the data rather than admire the artwork. Unfortunately, graphs are sometimes marred (intentionally or unintentionally) by a variety of misleading techniques or by what Tufte calls “chartjunk” that obscures rather than illuminates. I have described elsewhere many ways in which mistakes, chicanery, and chartjunk can undermine the usefulness of graphs. I recently saw a novel …

Learning to Communicate
Why writing skills are so important, especially in today's artificial world
Educators have been shaken by fears that students will use ChatGPT and other large language models (LLMs) to answer questions and write essays. LLMs are indeed astonishingly good at finding facts and generating coherent essays, although the alleged facts are sometimes false and the essays are sometimes tedious BS supported by fake references. I am more optimistic than most. I am hopeful that LLMs will be a catalyst for a widespread discussion of our educational goals. What might students learn in schools that will be useful long after they graduate? There are many worthy goals, but critical thinking and communication skills should be high on any list. I’ve written elsewhere about how critical thinking abilities are important for students …

Text Generators, Education, and Critical Thinking: an Update
The fundamental problem remains that, not knowing what words mean, AI has no critical thinking abilities
This past October, I wrote that educational testing was being shaken by the astonishing ability of GPT-3 and other large language models (LLMs) to answer test questions and write articulate essays. I argued that, while LLMs might mimic human conversation, they do not know what words mean. They consequently excel at rote memorization and BS conversation but struggle mightily with assignments that are intended to help students develop their critical thinking abilities, such as Lacking any understanding of semantics, LLMs can do none of this. To illustrate, I asked GPT-3 two questions from a midterm examination I had recently given in an introductory statistics class. Both questions tested students’ critical thinking skills and GPT-3 bombed both questions. I was hopeful …

Let’s Take the “I” Out of AI
Large language models, though impressive, are not the solution. They may well be the catalyst for calamity.
When OpenAI’s text generator, ChatGPT, was released to the public this past November, the initial reaction was widespread astonishment. Marc Andreessen described it as, “Pure, absolute, indescribable magic.” Bill Gates said that the creation of ChatGPT was as important as the creation of the internet. Nvidia’s CEO, Jensen Huang, said that, “ChatGPT is one of the greatest things ever created in the computing industry.” Conversations with ChatGPT are, indeed, very much like conversations with a super-intelligent human. For many, it seems that the 70-year search for a computer program that could rival or surpass human intelligence has finally paid off. Perhaps we are close to the long-anticipated singularity where computers improve rapidly and autonomously, leaving humans far behind, …

Does New A.I. Live Up to the Hype?
Experts are finding ChatGPT and other LLMs unimpressive, but investors aren't getting the memo
Original article was featured at Salon on February 21st, 2023. On November 30, 2022, OpenAI announced the public release of ChatGPT-3, a large language model (LLM) that can engage in astonishingly human-like conversations and answer an incredible variety of questions. Three weeks later, Google’s management, wary that they had been publicly eclipsed by a competitor in the artificial intelligence technology space, issued a “Code Red” to staff. Google’s core business is its search engine, which currently accounts for 84% of the global search market. Their search engine is so dominant that searching the internet is generically called “googling.” When a user poses a search request, Google’s search engine returns dozens of helpful links along with targeted advertisements based on its knowledge of the …

Goodhart’s Law and Scientific Innovation in Academia
Many university researchers are leaving academia so they can actually get things done
British economist Charles Goodhart was a financial advisor to the Bank of England from 1968 to 1985, a period during which many economists (“monetarists”) believed that central banks should ignore unemployment and interest rates. Instead, they believed that central banks should focus on maintaining a steady rate of growth of the money supply. The core idea was that central banks could ignore economic booms and busts because they are short-lived and self-correcting (Ha! Ha!) and should, instead, keep some measure of the money supply growing at a constant rate in order to keep the rate of inflation low and constant. The choice of which money supply to target was based on how closely it was statistically correlated with GDP. The …

Large Language Models Can Entertain but Are They Useful?
Humans who value correct responses will need to fact-check everything LLMs generate
In 1987, economics Nobel Laureate Robert Solow said that the computer age was everywhere except in productivity data. A similar thing could be said about AI today: It dominates tech news but does not seem to have boosted productivity a whit. In fact, productivity growth has been declining since Solow’s observation. Productivity increased by an average of 2.7% a year from 1948 to 1986, but by less than 2% a year from 1987 to 2022. Labor productivity is the amount of goods and services we produce in a given amount of time: output per hour. More productive workers can build more cars, construct more houses, and educate more children. More productive workers can also enjoy more free time. If workers can do in four …

Chatbots: Still Dumb After All These Years
Intelligence is more than statistically appropriate responses
This story, by Pomona College business and investment prof Gary Smith, was #6 in 2022 at Mind Matters News in terms of reader numbers. As we approach the New Year, we are rerunning the top ten Mind Matters News stories of 2022, based on reader interest. At any rate: “Chatbots: Still dumb after all these years.” (January 3, 2022) In 1970, Marvin Minsky, recipient of the Turing Award (“the Nobel Prize of Computing”), predicted that within “three to eight years we will have a machine with the general intelligence of an average human being.” Fifty-two years later, we’re still waiting. The fundamental roadblock is that, although computer algorithms are really, really good at identifying statistical patterns, they have no way of …