The Extra Mile: Statistically speaking...
Note: This is a special post supplementing The 12th Block
All statistics have outliers. ― Nenia Campbell
I was not great at mathematics – although I did get A1’s for both my papers in secondary school, I opted for computer science subjects in lieu of university-level maths because I knew my limits.
Recent months have shown the importance of advanced mathematics. Whether it is COVID-19 or police brutality data, there have been plenty of cases of people misrepresenting and misunderstanding the statistics. There has also been plenty of good data journalism, and plenty of well-informed influencers correcting the mistakes, and I consolidate some of those corrections here. This post is a supplement to The 12th Block.
Facts are stubborn things, but statistics are pliable. ― Mark Twain
DIY Hand sanitisers
The data: Alcohol-based sanitiser must contain at least 60 per cent alcohol to be effective against COVID-19 (CDC).
The mistake: In the early days of the pandemic, panic buying cleared store shelves of hand sanitiser, so makers on social media started sharing creative ideas for homemade versions. Many of them failed to make effective ones. An example recipe used 2/3 of a cup of 70 per cent isopropyl alcohol and 1/3 of a cup of aloe vera gel.
The correction: While the maker was trying to do the right thing by starting with 70 per cent alcohol, the mixture ratio dilutes the alcohol to about 47 per cent (2/3 of 70 per cent), a level well below the CDC’s recommendation.
As explained by Ann Reardon in her video, Debunking Viral Covid-19 Videos:
[The alcohol] has to be [at least] 60 per cent of the end result – not of the thing you are starting with. That’s how it works.
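The dilution arithmetic can be checked with a few lines of Python. This is only a rough sketch: the function name is made up for illustration, and it assumes the volumes simply add up when mixed, which is a simplification.

```python
def final_alcohol_fraction(alcohol_volume, alcohol_strength, other_volume):
    """Alcohol fraction of the finished mixture (assumes volumes are additive)."""
    pure_alcohol = alcohol_volume * alcohol_strength
    total_volume = alcohol_volume + other_volume
    return pure_alcohol / total_volume


# Recipe from the post: 2/3 cup of 70% isopropyl alcohol + 1/3 cup of aloe vera gel.
recipe = final_alcohol_fraction(2 / 3, 0.70, 1 / 3)

CDC_MINIMUM = 0.60  # minimum alcohol fraction cited from the CDC

print(f"Final concentration: {recipe:.1%}")
print("Meets the CDC minimum:", recipe >= CDC_MINIMUM)
```

The recipe comes out at a little under 47 per cent, so it falls short of the 60 per cent minimum even though the starting alcohol was 70 per cent.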
Most people use statistics like a drunk man uses a lamppost; more for support than illumination. ― Andrew Lang
The importance of proportional representation
In June, teen TikToker Hayley Clark shared a series of videos of her parents questioning statistics about race-based police brutality.
The data (according to the parent): Denying that black people were more likely to be killed by the police, the parent said that, “in 2017, 457 white people were shot to death by the police in the US; 223 were black.”
The mistake: Raw data, in some cases (see NOTE 1), should be adjusted for proportional representation.
The correction: Hayley responded:
76 per cent of the population is white; 13 per cent is black. If they were being killed the exact same rate by police officers, the rate that black people would be 8.9. But it’s not, it’s 24 per cent [...] so they’re being killed at a higher rate. There is more white people, meaning the amount of people killed by cops who are white would be higher.
Hayley and her parent may well have been quoting different reports; I do not have the original sources to confirm. The point stands, though: the data has to be interpreted relative to the size of the population group(s) it describes, and not taken without that context.
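To make the idea of adjusting for population concrete, here is an illustrative calculation in Python. It combines the parent's counts with Hayley's population shares, which is my own assumption for the sake of the example; as noted above, the two may not come from the same source, so the result shows the method and is not a verified statistic.

```python
# Figures quoted in the post; they may come from different reports.
deaths = {"white": 457, "black": 223}
population_share = {"white": 0.76, "black": 0.13}

# Share of deaths within these two groups only.
total_deaths = sum(deaths.values())
death_share = {group: n / total_deaths for group, n in deaths.items()}

# Deaths relative to each group's share of the population.
relative_rate = {group: deaths[group] / population_share[group] for group in deaths}
rate_ratio = relative_rate["black"] / relative_rate["white"]

for group in deaths:
    print(f"{group}: {death_share[group]:.1%} of deaths, "
          f"{population_share[group]:.0%} of population")
print(f"Rate ratio (black vs white): {rate_ratio:.2f}")
```

With these particular inputs the raw count is higher for white people, but the rate relative to population is higher for black people, which is the distinction Hayley was drawing.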
[NOTE 1: I say, “in some cases,” because it does not work for all. For instance, with COVID-19 data, a country with a very small population will almost certainly report far fewer infections than a very populous one. A country with 1,000 people may report 900 cases, and a country with 1,000,000 people may report a hundred times more cases (900×100=90,000), but the former has 90 per cent of its population infected, while the latter only has 9 per cent. Granted, 90,000 cases is still a lot. Additionally, country-based data on COVID-19 cases are often presented per capita, or per million people, yet that alone does not tell you whether any of the countries are actually struggling to contain the outbreak. Other information has to supplement it to arrive at an informed interpretation of the data.]
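The two hypothetical countries in NOTE 1 can be put side by side like this. The helper name is made up for illustration, and the numbers are the invented ones from the note, not real case counts.

```python
def infected_fraction(cases, population):
    """Fraction of the population reported as infected."""
    return cases / population


# Hypothetical countries from NOTE 1.
small_country = infected_fraction(900, 1_000)
large_country = infected_fraction(90_000, 1_000_000)

print(f"Small country: {small_country:.0%} infected")
print(f"Large country: {large_country:.0%} infected")

# The same figures expressed per million people, as they are often reported.
print(f"Small country: {small_country * 1_000_000:,.0f} cases per million")
print(f"Large country: {large_country * 1_000_000:,.0f} cases per million")
```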
Every line is the perfect length if you don’t measure it. ― Marty Rubin
To mask or not to mask?
Remember when some countries flip-flopped on the advisory about face-mask usage, initially saying that masks do not help contain the spread of COVID-19, and then changing their minds about it, first by making masks voluntary and now, at least in some places, making them mandatory?
The data: It is potentially based on a now-retracted study, using a very small sample of COVID-19 patients (n=4). They made the patients (again, n=4) cough into a petri dish without any mouth barrier, then repeated the process with a surgical mask, and repeated it again with a cotton mask, and one final time without any mask. This was the result, as presented, again by the brilliant Ann Reardon, from the same video mentioned above:
The mistake: The data were presented as log values but were picked up and reported as though they were plain numbers. A log scale is normally used to compress large values (see NOTE 2) so that a larger range can be covered without losing sight of the smaller values, or without needing to tape together several sheets of graph paper to capture all your data in a single graph. So something like this:
What it also means is that if you zoom out on the linear scale to capture 0-10 and zoom in on the logarithmic scale to capture the same values, the nodes are spaced differently:
The correction: While it is important to note that masks do not completely stop the spread of the virus, the linear scale shows exactly how much more effective wearing one is than going without. Here is Ann Reardon’s linear graph to visualise just that:
[NOTE 2: As you can see, you need to zoom all the way out to capture all the values in a linear graph for this set of data. It is also interesting to note that cotton masks seem to be more effective than surgical masks in this study, although again, I repeat, n=4, and also, the study was retracted. Another paper published in the Journal of Hospital Infection showed different results. Spoiler: Vacuum filters are more effective against COVID-19 than cotton mix.]
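To see why reading log values as plain numbers is misleading, here is a small Python sketch. The readings are hypothetical numbers I picked for illustration, NOT the values from the retracted study, and I am assuming a base-10 log scale.

```python
# Hypothetical log10 readings, NOT the values from the retracted study.
log10_readings = {"no mask": 3.0, "mask": 2.0}

# Convert each log10 reading back to the plain (linear) value it represents.
linear_values = {label: 10 ** value for label, value in log10_readings.items()}

# Naive reading: treat the log values as if they were plain counts.
naive_reduction = 1 - log10_readings["mask"] / log10_readings["no mask"]

# Correct reading: compare the linear values.
actual_reduction = 1 - linear_values["mask"] / linear_values["no mask"]

print(f"Naive reduction:  {naive_reduction:.0%}")
print(f"Actual reduction: {actual_reduction:.0%}")
```

In this made-up case, a drop from 3 to 2 on the log scale looks like a one-third reduction, but it actually means going from 1,000 to 100, a 90 per cent reduction.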
There are three types of lies – lies, damn lies, and statistics. ― Benjamin Disraeli
5G and COVID-19 do have something in common…
…But it is not what you think. One of the biggest conspiracy theories regarding COVID-19 is that it is caused by 5G. You should know by now that that argument does not hold water.
The ‘data’: Two maps placed side by side, one showing where 5G technology has been installed, the other showing coronavirus hotspots. The side-by-side image claimed to show evidence that there is a link between COVID-19 and 5G technology.
The mistake: The only thing the image proves is that both maps correlate with population density.
The correction: New technology is often rolled out in more populous regions first because more people are there to try it out. At the same time, a more populous area also means more social interactions between individuals, which increases the spread of the virus. As many critical thinkers have already pointed out, including Doug Collins, by the same logic, Subway, McDonald’s and the KKK are just as likely to have caused the pandemic as 5G towers:
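The effect of a shared cause can be simulated in Python. Everything here is synthetic: the variable names, the noise levels and the 2,000 made-up regions are my own assumptions, chosen only to show how two things that both follow population density end up correlated with each other.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic regions: towers and cases each depend on density plus
# their own independent noise. Neither depends on the other.
density = [rng.uniform(0, 100) for _ in range(2000)]
towers = [d + rng.gauss(0, 10) for d in density]
cases = [d + rng.gauss(0, 10) for d in density]

raw_corr = pearson(towers, cases)

# Subtract density (possible here because the simulation adds it directly)
# and correlate what is left over.
tower_leftover = [t - d for t, d in zip(towers, density)]
case_leftover = [c - d for c, d in zip(cases, density)]
adjusted_corr = pearson(tower_leftover, case_leftover)

print(f"Correlation between towers and cases: {raw_corr:.2f}")
print(f"After removing density:               {adjusted_corr:.2f}")
```

The raw correlation comes out strong even though the simulation never links towers to cases, and it all but disappears once density is taken out.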
Nothing in life is to be feared, it is only to be understood. Now is the time to understand more, so that we may fear less. ― Marie Curie
As a conclusion, I repost a quote from The Logic of Science:
Over the course of the COVID crisis, we have repeatedly seen leading scientists and scientific organisations change their recommendations, and we have seen multiple scientific studies retracted or at least highly debated. Many view this as proof that science doesn’t work and/or scientists don’t know what they are doing. In reality, this is exactly what we expect to see when science works. Science is a method, not a body of facts, and the method is often messy. Peer-review does not end with publication. Rather, studies are subjected to the scrutiny of the entire scientific community, and the fact that high-profile papers sometimes get retracted is evidence of science correcting itself. Similarly, the fact that scientists change their views as new evidence about a novel virus comes to light is a good thing! It means that scientists are learning and adjusting their views rather than clinging to biases and preconceptions. That’s how science works.
And please watch Ann Reardon’s video to completion. The hyperlinks provided in the parts of this text that reference her video take you to the exact timestamps at which she discusses the issues I quoted her for. However, the full video dives a lot deeper than that, and it is worth the entire 21 minutes and 11 seconds. If you cannot tell, I am a big fan, but I did not ask for her permission to rehash her content, and I hope she does not hate me. So if you watch her video(s) and subscribe to her channel because I recommended it to you, I hope that will make up for it.