Mind Matters Natural and Artificial Intelligence News and Analysis

Female Hurricanes: How A Mass of Hot Air Became a Zombie Study

When a reporter first asked me about a study claiming that “Female Hurricanes are Deadlier than Male Hurricanes,” I was skeptical

It is once again hurricane season on the East Coast and social media are reliably abuzz with reports of a debunked study that claimed that hurricanes given feminine-sounding names are deadlier than those given masculine-sounding names. (Full disclosure: I was one of the debunkers.)

When a reporter initially asked me about this study, titled “Female Hurricanes are Deadlier than Male Hurricanes,” I was skeptical. Hurricane names alternate between female and male on a schedule set before the hurricane season begins, so any relationship between name and death toll was surely coincidental.

I looked at the study and discovered that the authors were not arguing that female hurricanes are inherently deadlier but that sexist humans die because they don’t take female hurricanes seriously. That argument is not completely nuts, and the open-access paper was published in the Proceedings of the National Academy of Sciences (PNAS). Still, not completely nuts is not the same thing as true, and prestigious journals sometimes publish rubbish. Consider, for example, the British Medical Journal’s publication of a nutty paper claiming that Asian-Americans are so terrified of the number 4 that they have heart attacks on the fourth day of the month, and the Lancet’s publication of a paper claiming that Jews can postpone death until after Passover.

The hurricane paper is almost as bad. In 1950, 1951, and 1952, hurricanes were named using the military phonetic alphabet (Able, Baker, Charlie… ). For surely sexist reasons, female names began to be used in 1953. Many protested, including Roxcy Bolton: “Women are not disasters, destroying life and communities and leaving a lasting and devastating effect.” The current system of alternating male and female names began in 1979 when President Jimmy Carter’s Secretary of Commerce, Juanita Kreps, ordered the change.

The biggest problem with the PNAS study is that it includes the 1953–1978 data, when all hurricanes had female names and hurricanes were deadlier. The average number of deaths per hurricane was 29 during the all-female era and 16 afterward. Perhaps there were more fatalities in earlier years because hurricanes tended to be stronger (the average hurricane category was 2.26 during the all-female era and 1.96 afterward). The infrastructure may also have been weaker, or perhaps there was less advance warning. A valid study would include only data from 1979 onward, when male and female names alternated.
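A back-of-the-envelope sketch shows how pooling the two eras can manufacture a gap even if names have no effect at all. The per-era averages (29 and 16 deaths per hurricane) are the ones quoted above; the storm counts are hypothetical round numbers chosen only for illustration, not figures from the PNAS data set.

```python
def pooled_average(groups):
    """Average deaths per storm across (storm_count, average_deaths) groups."""
    total_storms = sum(count for count, _ in groups)
    total_deaths = sum(count * avg for count, avg in groups)
    return total_deaths / total_storms


# Era averages quoted in the article.
EARLY_AVG = 29  # all-female era, 1953-1978
LATE_AVG = 16   # alternating-name era, 1979 onward

# Hypothetical storm counts (assumptions for illustration only).
early_female = 40
late_female = 30
late_male = 30

# Assume names have NO effect: within the later era, female-named and
# male-named storms share the same average.
female_pooled = pooled_average([(early_female, EARLY_AVG), (late_female, LATE_AVG)])
male_pooled = pooled_average([(late_male, LATE_AVG)])
female_late_only = pooled_average([(late_female, LATE_AVG)])

print(f"Pooled eras:  female {female_pooled:.1f} vs male {male_pooled:.1f}")
print(f"1979 onward:  female {female_late_only:.1f} vs male {male_pooled:.1f}")
```

With these made-up counts, the pooled female average comes out near 23.4 deaths per storm against 16 for male names, a gap created entirely by the deadlier early era. Restricting the comparison to 1979 onward makes the gap vanish, which is why the choice of sample matters so much.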

There were many other problems with the PNAS study, including the omission of several deadly hurricanes because they did not make landfall. Consider this National Hurricane Center description of Hurricane Bill in 2009:

Large swells, high surf, and rip currents generated by Bill caused two deaths in the United States. Although warnings about the dangerous waves had been posted along the coast, over 10 000 people gathered along the shore in Acadia National Park, Maine, on 24 August to witness the event. One wave swept more than 20 people into the ocean; 11 people were sent to the hospital, and a 7-yr-old girl died. Elsewhere, a 54-yr-old swimmer died after he was washed ashore by large waves and found unconscious in New Smyrna Beach, Florida.

R.J. Berg, L.A. Avila, “Annual summary: Atlantic hurricane season of 2009,” Mon. Weather Rev., 139 (2011), p. 1054

Hurricane Bill was omitted from the PNAS study’s data because the hurricane didn’t quite make it to shore. But, surely, this episode is evidence that people didn’t take Hurricane Bill seriously.

The PNAS study also concluded that female names “cause” more deaths during major storms but that there is “no effect of masculinity-femininity of name for less severe storms.” This doesn’t make sense. If there is any difference in reaction to male and female storms, it should be clearest for minor storms. The biggest storms should blow away any sexism.

Consider Hurricane Sandy, the deadliest post-1978 hurricane in their data set. First of all, Sandy is generally considered a unisex name, but the PNAS authors considered it to be strongly feminine, with a score of 9.0 on a scale of 1 to 11, which is more feminine than Edith (8.5), Carol (8.1), or Beulah (7.3). Something is surely amiss. I know far more men named Sandy than men named Edith, Carol, or Beulah.

During the week before it reached New York, Hurricane Sandy made landfall in Jamaica, Cuba, the Dominican Republic, Haiti, and Puerto Rico, killing 80 people and causing widespread property damage. As Sandy approached the United States, nine U.S. governors declared a state of emergency, and New York City Mayor Michael Bloomberg suspended all city mass transit services, closed public schools, and ordered mandatory evacuations of many parts of the city.

Nonetheless, there were 48 New York City fatalities and another 109 fatalities in other parts of the United States. Is it really credible that, despite the dozens of people killed before Sandy hit the United States, the nonstop catastrophic warnings broadcast by news media, and the extraordinary actions taken by elected officials, people did not take Hurricane Sandy seriously because they considered Sandy to be a female name? Had New Yorkers forgotten Sandy Koufax?

Also, Sandy was downgraded from a hurricane to a post-tropical cyclone when it made landfall in New Jersey. Should the PNAS authors have included deadly hurricanes (like Bill) that did not make landfall? Should they have excluded storms (like Sandy) that were no longer hurricanes when they made landfall? Should they have included storms that were deadly but didn’t reach hurricane status? Who knows? And that’s the point.

Yet another gaping hole in the study is that the authors reported trying a large number of models with various combinations of variables and functional forms (and they no doubt chose to report the cherry-picked model that most supported their questionable theory). Ransacking data for a significant result is the kind of thing that mindless artificial intelligence algorithms do, and it often ends badly. Humans should know better.
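The danger of trying many models can be illustrated with a small simulation on pure noise. This is a simplified sketch, not a reconstruction of the PNAS models: each "model" is represented by one randomly generated predictor that is unrelated to the outcome by construction, and the sample size, number of models, number of trials, and cutoff of 2.0 are all assumptions chosen for illustration.

```python
import math
import random


def correlation_t_stat(x, y):
    """t-statistic for the Pearson correlation between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1 - r * r))


def specification_search(n_storms=90, n_models=20, n_trials=300,
                         cutoff=2.0, seed=1):
    """Return (share of trials where the first model looks significant,
    share of trials where at least one of n_models looks significant)."""
    rng = random.Random(seed)
    first_hits = 0
    any_hits = 0
    for _ in range(n_trials):
        # Outcome is pure noise: nothing truly predicts it.
        deaths = [rng.gauss(0, 1) for _ in range(n_storms)]
        t_stats = []
        for _ in range(n_models):
            predictor = [rng.gauss(0, 1) for _ in range(n_storms)]
            t_stats.append(abs(correlation_t_stat(predictor, deaths)))
        first_hits += t_stats[0] > cutoff      # one pre-chosen model
        any_hits += max(t_stats) > cutoff      # best of many models
    return first_hits / n_trials, any_hits / n_trials


single_rate, search_rate = specification_search()
print(f"One pre-specified model 'significant': {single_rate:.0%} of trials")
print(f"Best of 20 models 'significant':       {search_rate:.0%} of trials")
```

A single model chosen in advance crosses the cutoff only about one time in twenty, as it should. Reporting the best of twenty tries crosses it far more often, even though there is nothing to find. Real specification searches use overlapping variables, so the inflation would differ in detail, but the direction is the same.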

I tested the robustness of the reported results using a more inclusive set of post-1978 data. I considered tropical storms as well as hurricanes, Pacific storms, storms that did not make landfall or made landfall in other countries, and non-U.S. fatalities.

Tables 1 and 2 compare male-named and female-named storms directly. The first two data columns give the number of storms overall and the number that caused any fatalities, 1 to 99 fatalities, or more than 99 fatalities; they show no consistent pattern. The third and fourth data columns give the average number of fatalities; male-named storms generally had the higher average, though none of the differences are statistically significant.

| Storms by fatalities | Number of storms, female names | Number of storms, male names | Average fatalities, female names | Average fatalities, male names |
|---|---|---|---|---|
| All storms | 210 | 210 | 36 | 131 |
| Storms with fatalities | 111 | 103 | 68 | 267 |
| Storms with 1 to 99 fatalities | 103 | 88 | 12 | 14 |
| Storms with more than 99 fatalities | 8 | 15 | 792 | 1752 |

Table 1. Fatalities From Atlantic Hurricanes and Tropical Storms

| Storms by fatalities | Number of storms, female names | Number of storms, male names | Average fatalities, female names | Average fatalities, male names |
|---|---|---|---|---|
| All storms | 293 | 286 | 4 | 8 |
| Storms with fatalities | 42 | 46 | 28 | 51 |
| Storms with 1 to 99 fatalities | 39 | 42 | 9 | 7 |
| Storms with more than 99 fatalities | 3 | 4 | 271 | 519 |

Table 2. Fatalities From Pacific Hurricanes and Tropical Storms
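
The rows of Tables 1 and 2 can be cross-checked with a little arithmetic. The sketch below uses only the counts and rounded averages printed in the tables, so its results are approximate. It recomputes the all-storms average from the two fatal-storm bins and estimates how much of each total comes from the few storms with more than 99 fatalities. The function and variable names are illustrative.

```python
# (storm count, average fatalities) copied from Tables 1 and 2.
tables = {
    ("Atlantic", "female"): {"all": (210, 36), "1-99": (103, 12), "100+": (8, 792)},
    ("Atlantic", "male"): {"all": (210, 131), "1-99": (88, 14), "100+": (15, 1752)},
    ("Pacific", "female"): {"all": (293, 4), "1-99": (39, 9), "100+": (3, 271)},
    ("Pacific", "male"): {"all": (286, 8), "1-99": (42, 7), "100+": (4, 519)},
}


def bin_deaths(rows, key):
    """Approximate total deaths in one bin (count times rounded average)."""
    count, avg = rows[key]
    return count * avg


def implied_overall_average(rows):
    """Fatalities per storm implied by the two fatal-storm bins."""
    total_deaths = bin_deaths(rows, "1-99") + bin_deaths(rows, "100+")
    return total_deaths / rows["all"][0]


def share_from_extreme_storms(rows):
    """Approximate share of all deaths due to storms with 100+ fatalities."""
    extreme = bin_deaths(rows, "100+")
    return extreme / (extreme + bin_deaths(rows, "1-99"))


for (basin, names), rows in tables.items():
    print(f"{basin:8s} {names:6s} "
          f"published avg {rows['all'][1]:4d}, "
          f"implied avg {implied_overall_average(rows):6.1f}, "
          f"share from 100+ storms {share_from_extreme_storms(rows):.0%}")
```

The implied averages land within half a death per storm of the published ones, so the tables are internally consistent. The check also shows that roughly 70% to 96% of the deaths in each column come from a handful of storms with more than 99 fatalities, which helps explain why large gaps in the averages can still fail to be statistically significant.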

The conclusion is inescapable: The PNAS study was a mass of hot air based on a questionable statistical analysis of inappropriate data and it doesn’t hold up with fresh data.

Why do good journals publish bad studies? Some journals don’t have the resources for careful scrutiny. Some like the publicity that comes with provocative articles. Some welcome confirmation of their prejudices.

Unfortunately, flawed research studies with provocative conclusions may live in popular culture long after they are exposed. Like this hurricane paper, they become zombie studies.

Gary N. Smith

Senior Fellow, Walter Bradley Center for Natural and Artificial Intelligence
Gary N. Smith is the Fletcher Jones Professor of Economics at Pomona College. His research on financial markets, statistical reasoning, and artificial intelligence, which often involves stock market anomalies, statistical fallacies, and the misuse of data, has been widely cited. He is the author of The AI Delusion (Oxford, 2018) and co-author (with Jay Cordes) of The Phantom Pattern (Oxford, 2020) and The 9 Pitfalls of Data Science (Oxford, 2019). Pitfalls won the Association of American Publishers 2020 Prose Award for “Popular Science & Popular Mathematics”.
