Making Sense of the Numbers Behind COVID-19

Media and politicians put statistics before us to sway our opinions. But what do they really mean?

Language matters, and it is sometimes used in a confusing way. For example, we hear talk of “killing” viruses, even at sites sponsored by medics. But, unlike bacteria, viruses are not alive. The virus that causes COVID-19 is RNA encased in a lipid membrane. So the sentence “Wash your hands to kill the virus” is technically wrong. Washing your hands can destroy the virus and prevent its spread, but we can’t kill what was never alive.

A bigger problem is the misleading way media and politicians quote and compare information about COVID-19 cases and deaths:

The risk is that politicians, business owners and ordinary Americans who are making decisions about lockdowns, reopenings and other day-to-day matters could be left with the impression that the virus is under more control than it actually is.

In Virginia, Texas and Vermont, for example, officials said they have been combining the results of viral tests, which show an active infection, with antibody tests, which show a past infection. Public health experts say that can make for impressive-looking testing totals but does not give a true picture of how the virus is spreading.

Michelle R. Smith, Colleen Long and Jeff Amy, “States accused of fudging or bungling COVID-19 testing data” at Associated Press (May 19, 2020)

So how can we find out what we need to know? First, we should look at how the numbers of deaths from COVID-19 are reported. In doing so, we might seem insensitive to the fact that these numbers represent actual people who have died, oftentimes alone and intubated. But we do need to look at those numbers so we can understand how the disease works (epidemiology). That, in turn, can help us minimize the number of vulnerable people exposed to it:

Deaths per capita is a more significant figure than deaths overall

When comparing the number of COVID-19 cases and deaths between two different locations, it’s important to consider the number per capita (that is, per 1,000 or 1,000,000 people).

Think back to algebra; it’s like finding a common denominator. Would you buy a product if 9 out of 10 people recommend it? What if 9 out of 100 people recommend it? It makes a difference, when assessing an impact, how many people we’re talking about overall.

For example, here are two commonly cited claims:

“The U.S. has the most deaths from COVID-19.”

It is true that (as of this writing) the U.S. has 96,610 deaths due to COVID-19, which is higher in absolute terms than any other country. However, if you look at the number of deaths per 100,000 people, the U.S. ranks 12th in the world.

Similarly, the U.S. has more than 1.6 million cases of COVID-19, a higher number than any other country. Case numbers are likely underreported in the U.S. because of gaps in testing. But when you look at cases per 100,000 people, the U.S. ranks 11th or 12th. Italy has about one-third as many deaths as the U.S. (32,735), but its death rate is 54 per 100,000 people compared with 30 for the U.S., or 1.8 times the U.S. rate.
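To see how the per capita adjustment works, here is a minimal Python sketch. The death counts are the ones quoted above; the population figures are rough 2019 estimates that I am assuming for illustration, so the results only land close to the rates cited, not exactly on them.

```python
# Death counts are the figures quoted in the article.
DEATHS = {"U.S.": 96_610, "Italy": 32_735}

# ASSUMPTION: approximate 2019 populations, not taken from the article.
POPULATION = {"U.S.": 328_000_000, "Italy": 60_400_000}


def per_100k(count, population):
    """Scale a raw count to a rate per 100,000 people."""
    return count / population * 100_000


rates = {country: per_100k(DEATHS[country], POPULATION[country])
         for country in DEATHS}

for country, rate in rates.items():
    print(f"{country}: {DEATHS[country]:,} deaths, {rate:.1f} per 100,000")

# How many times higher Italy's rate is than the U.S. rate.
ratio = rates["Italy"] / rates["U.S."]
print(f"Italy's rate is about {ratio:.1f} times the U.S. rate")
```

The raw counts favor Italy by a factor of three, yet dividing by population reverses the comparison.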

“The U.S. has conducted the largest number of COVID-19 tests.”

It’s true the U.S. has conducted the largest number of COVID-19 tests — over 9 million of them. But if we look at the number of tests per 1,000 people, the U.S. is somewhere around 10th in the world at 42.7 tests per 1,000 people.

Thus, the bare statements above are not disingenuous because they are both technically true. The U.S. does have more deaths than any other country and it has conducted more PCR tests than any other country. But the U.S. also has a larger population than most countries. When we compare two countries, we must make sure that the data takes into account differences in population numbers.

Per Diem: Not every day is the same

Another misunderstanding in reporting COVID-19 numbers is the significance of the number of new cases per day and deaths per day.

Perhaps you, like me, check the number of new cases and deaths in your state and county every day. And perhaps you’ve noticed that the numbers are lower on the weekends than on the weekdays. Viruses do not take a break on the weekends, but people do. In reality the daily death counts are not necessarily the deaths that happened that day. A weekly total is probably a better way to look at the data than a daily total. Here’s why:

Suppose someone dies of COVID-19 on a Sunday. Fewer people work on the weekend than during the week, even in healthcare, so the Sunday death may not be verified until Monday and then perhaps reported to the county on Tuesday. In Texas, for example, if a death from COVID-19 is reported by 1:00 pm, it goes on that calendar day’s count. If it is reported after 1:00 pm, it goes on the next day’s count. So the final figures for a day may depend on when counties checked in with their counts.
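The Texas cutoff rule described above can be sketched in a few lines of Python. The function name and the sample dates are hypothetical, and I am assuming that a report filed at exactly 1:00 pm still counts for the same day.

```python
from datetime import date, datetime, time, timedelta

# The 1:00 pm cutoff described in the text.
CUTOFF = time(13, 0)


def count_date(reported_at):
    """Return the calendar day whose count a report lands on.

    ASSUMPTION: a report at exactly 1:00 pm counts for the same day.
    """
    if reported_at.time() <= CUTOFF:
        return reported_at.date()
    return reported_at.date() + timedelta(days=1)


# Hypothetical case: a death on Sunday, May 17, 2020, that reaches the
# state on Tuesday, May 19, at 2:00 pm.
died_on = date(2020, 5, 17)
counted_on = count_date(datetime(2020, 5, 19, 14, 0))
print(f"Died {died_on}, counted {counted_on}")
```

In this made-up case the death shows up in Wednesday's count, three days after it happened.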

So when you see higher numbers on Tuesday and Wednesday, that is more likely due to the reporting lag than to anything about the day of the week. This lag isn’t necessarily a bad thing. We certainly want people to verify case counts and cause of death before reporting them. But things get confusing when reporters use language such as, “Tuesday had the largest number of deaths since March.”

Furthermore, sometimes numbers from the previous day are corrected the next day. The CDC outlines why there might be differences in local and national data as well as why there is a lag in counts.

The main thing to see is that these daily spikes or declines reflect human activity patterns, not the virus’s activity patterns.
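A short Python sketch shows why a weekly total is steadier than a daily one. The daily figures below are made up purely for illustration (they are not real data); they dip on the weekend and rebound early in the week, the way reported counts often do.

```python
# ILLUSTRATIVE, made-up daily reported deaths for two weeks,
# Monday through Sunday. These are not real figures.
daily_reported = [
    40, 62, 58, 50, 47, 25, 18,  # week 1
    41, 60, 59, 49, 48, 24, 19,  # week 2
]


def weekly_totals(daily, days_per_week=7):
    """Sum daily counts into consecutive seven-day blocks."""
    return [sum(daily[i:i + days_per_week])
            for i in range(0, len(daily), days_per_week)]


totals = weekly_totals(daily_reported)
daily_swing = max(daily_reported) - min(daily_reported)

print("Weekly totals:", totals)
print("Gap between highest and lowest day:", daily_swing)
```

The daily numbers swing widely, but the two weekly totals come out the same, which suggests the swings in this example come from the reporting schedule rather than from any change in the underlying trend.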

One statistic that might be helpful, especially in the long term, is “excess deaths.” The figure compares the number of deaths a country or state typically sees at this time of year with the number of deaths in 2020. This number can be reported in a couple of different ways. The CDC uses the average number of deaths dating back to 2013. The New Atlantis offers a graph of excess deaths for the U.S. in 2020 compared to deaths in 2018. Excess deaths, as a figure, shows the effect of COVID-19 on society, whether from a strained medical system or from co-morbidities (the patient was already weakened by other conditions). It can also account for uncounted COVID-19 deaths.
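In its simplest form, the excess deaths calculation is a subtraction: deaths observed this year minus the average for the same period in earlier years. Here is a rough Python sketch using hypothetical numbers. It is not the CDC's actual method, which is more involved; the function name and all the figures are my own assumptions.

```python
from statistics import mean


def excess_deaths(observed, baseline_years):
    """Observed deaths minus the average for the same period in prior years."""
    return observed - mean(baseline_years)


# HYPOTHETICAL deaths from all causes for the same week in five earlier
# years, and for that week in 2020. These are not real figures.
baseline = [1_000, 1_020, 980, 1_010, 990]
observed_2020 = 1_300

excess = excess_deaths(observed_2020, baseline)
print(f"Expected about {mean(baseline):.0f}, observed {observed_2020}, "
      f"excess {excess:.0f}")
```

Because the calculation uses deaths from all causes, it does not depend on how any single death was classified.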

Where do the numbers we hear about come from?

Google “COVID-19 number of deaths per capita” or “COVID-19 statistics” and Worldometer will probably be among the top results. This site is a data aggregator quoted by various politicians and institutions. But CNN looked into Worldometer and found that it is not the most reliable source of information. The site aggregates live stats, but their source is unclear. Edouard Mathieu, data manager of Oxford University’s Our World in Data, shared his misgivings with CNN:

“Their main focus seems to be having the latest number wherever it comes from, whether it’s reliable or not, whether it’s well-sourced or not,” he said. “We think people should be wary, especially media, policy-makers and decision-makers. This data is not as accurate as they think it is.”

Scott McLean, Laura Perez Maestro, Sergio Hernandez, Gianluca Mezzofiore and Katie Polglase, “The Covid-19 pandemic has catapulted one mysterious data website to prominence, sowing confusion in international rankings” at CNN

A better source of data is the COVID Tracking Project, which clearly outlines who is involved in acquiring data, analysis, and reporting, as well as where they get their numbers. Still, the New Atlantis points out that even the COVID Tracking Project’s data are debatable because the site counts the number of people who died from COVID-19 as well as those who died with COVID-19.

Additionally, how data is displayed influences how the viewer interprets it. For example, many people understand a linear graph of new cases of COVID-19 per day better than a logarithmic graph, even though logarithmic graphs can better show differences in the data. These differences may be small but important when it comes to the ways viruses spread in a population. 91-DIVOC, a website designed by computer scientist Wade Fagen-Ulmschneider of the University of Illinois using data from Johns Hopkins University, allows the user to toggle between logarithmic and linear graphs comparing new cases in various countries and states. He says the logarithmic scale is good for looking at the magnitude of growth while the linear scale is good for looking at real human impact.
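The difference between the two scales can be seen without drawing a graph at all. The Python sketch below uses made-up case counts that double at each step, an assumption chosen only to make the pattern obvious, and compares the step sizes a reader would see on each kind of axis.

```python
import math

# ILLUSTRATIVE, made-up case counts that double at each step.
cases = [100, 200, 400, 800, 1600]

# Step sizes on a linear axis: the raw increase in cases.
linear_steps = [later - earlier
                for earlier, later in zip(cases, cases[1:])]

# Step sizes on a logarithmic (base 10) axis.
log_steps = [math.log10(later) - math.log10(earlier)
             for earlier, later in zip(cases, cases[1:])]

print("Linear steps:", linear_steps)
print("Log steps:   ", [round(step, 3) for step in log_steps])
```

On the linear axis each step is bigger than the last, which conveys how many more people are affected. On the logarithmic axis every step is the same size, so steady doubling appears as a straight line and any change in the growth rate stands out as a bend.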

The COVID-19 pandemic leaves many of us hungry for any morsel of information that we can glean from the media, politicians, and social media. But good media discernment means stepping back and asking questions, such as “What does this mean?” or “Where do these numbers come from?” As we have seen, the Chinese government is working hard to take advantage of people’s tendency to consume information without asking questions. Let’s make that hard for them or anyone else to do.

Heather Zeiger

Heather Zeiger is a freelance science writer in Dallas, TX. She has advanced degrees in chemistry and bioethics and writes on the intersection of science, technology, and society. She also serves as a research analyst with The Center for Bioethics & Human Dignity. Heather writes for Salvo Magazine, and her work has appeared in Relevant, MercatorNet, Quartz, and The New Atlantis.

