Learning the Wrong Lessons: Lies, Damned Lies and Statistics

Every day brings more news about new coronavirus cases and a rising death toll. The figures are put under intense scrutiny. The questions being asked include:

  • Is the rate of infection increasing or slowing?
  • Is the figure accurate? Does it confuse those who die from the illness with those who die with it? Does it include all coronavirus deaths or just those in hospital?
  • How do we compare with other countries?

That last question is a particular cause for interest, concern and controversy because some news outlets have suggested that the UK is doing worse than other countries and is likely to have one of the highest death rates in Europe, and possibly the world. If you compare Britain with Germany, a country with a comparably sized population, then we have seen 16,509 deaths to date whilst Germany has seen 4,865. It seems like a large gap.

We have discussed before the problem of comparing like for like, given variations in where each country is on the timeline as well as differences in population density, cultural response to social distancing measures, age and migration/mobility.

However, comparing like for like is further hindered because what seems like a large difference at first might not be that large. Let me illustrate. Remember the EU referendum result: it was about 52% to 48% (in fact, before rounding, it was even closer: 51.89% to 48.11%). How often have we been told that the result was close, so that the country is practically split down the middle? Yet that “close” result was actually 17.4 million to 16.1 million. There were over 1 million votes in it. The raw figure looks massive, as though the leave campaign romped home, but we know that’s not how it works. We know that it was close because we need to look at things proportionately.

Now let’s talk COVID-19. If you look at UK mortality as a percentage of the population, it works out at about 0.025%. That’s a tiny fraction. Now, that’s not a reason to ignore the numbers: each death is distressing, the figures could have been far higher, and treating people has a wider impact on care services. However, it should make us stop and think before making judgemental comparisons with others. For comparison, German fatalities are at roughly 0.006% of the population, about 0.02 percentage points lower than the UK. Statistically, you would not usually consider that a significant variation. Indeed, it is the sort of minuscule variation that could be due to a whole host of factors beyond either country’s control.[1]
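For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The death counts are the ones quoted above; the population figures are my own rough assumptions (about 66.8 million for the UK and 83.1 million for Germany) and are not taken from any official release, so the rounded results may shift slightly with a different estimate.

```python
# Death counts quoted in this post (figures at the time of writing).
UK_DEATHS = 16_509
GERMANY_DEATHS = 4_865

# ASSUMPTION: rough population estimates, not taken from the post itself.
UK_POPULATION = 66_800_000
GERMANY_POPULATION = 83_100_000


def deaths_as_percent_of_population(deaths: int, population: int) -> float:
    """Return deaths as a percentage of the whole population."""
    return 100 * deaths / population


uk_pct = deaths_as_percent_of_population(UK_DEATHS, UK_POPULATION)
de_pct = deaths_as_percent_of_population(GERMANY_DEATHS, GERMANY_POPULATION)

# A difference between two percentages is measured in percentage points.
gap_points = uk_pct - de_pct

print(f"UK:      {uk_pct:.4f}% of population")
print(f"Germany: {de_pct:.4f}% of population")
print(f"Gap:     {gap_points:.4f} percentage points")
```

The point of the sketch is simply that the same two death counts can be read as a gap of thousands or as a gap of a few hundredths of a percentage point, depending on whether you look at them raw or proportionately.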

Now, I also wanted to pick up on one of the other issues. Can we be sure that all the deaths have been correctly included in the data? If the data leaves out some deaths (those outside of hospitals) but maybe includes others (deaths with rather than from the virus), then how accurate is it? I have seen some people respond by suggesting that this means the data is meaningless.

Well, if we cannot compare with others and if the data is not completely clean, then maybe it is meaningless. However, that is only so if the aim at this stage is to use the data to create league tables and/or to publish it as results showing success or failure. We are a long way from attempting to decide ultimate success or failure, and creating league tables is certainly not the game we are in.

The purpose of the data is not to measure results compared to others. Rather, it is used for what in industry we would have called Statistical Process Control. The data is used to monitor how the virus is developing and whether or not the government is on course with its action plan. It helps answer questions about the pace and spread of the virus. It tells us whether the numbers are accelerating exponentially or beginning to plateau or decline. It gives a feel for how close we are to the best, possible and worst forecasts. For those purposes, the data is doing exactly what it is meant to do.
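To give a flavour of what that kind of monitoring looks like, here is a rough sketch in Python. It is not a full control chart and not how any government actually analyses the figures; it just compares the average of the most recent days with the average of the days before and says whether the numbers are rising, falling or roughly flat. The function name, the window size, the 5% tolerance and the daily counts are all my own made-up assumptions for illustration.

```python
from statistics import mean


def classify_trend(daily_counts, window=7, tolerance=0.05):
    """Compare the mean of the latest window with the window before it.

    Returns "rising", "falling" or "plateau". The tolerance is the
    relative change we are willing to treat as noise (an assumption).
    """
    if len(daily_counts) < 2 * window:
        raise ValueError("need at least two full windows of data")

    recent = mean(daily_counts[-window:])
    previous = mean(daily_counts[-2 * window:-window])

    if previous == 0:
        # Avoid dividing by zero when the earlier window is empty of cases.
        return "rising" if recent > 0 else "plateau"

    ratio = recent / previous
    if ratio > 1 + tolerance:
        return "rising"
    if ratio < 1 - tolerance:
        return "falling"
    return "plateau"


# HYPOTHETICAL daily figures, invented purely for illustration.
made_up_counts = [10, 14, 20, 28, 40, 56, 80, 84, 82, 79, 81, 83, 80, 78]

print(classify_trend(made_up_counts, window=7))
print(classify_trend(made_up_counts, window=3))
```

Notice that this works even if the counting is imperfect, so long as it is imperfect in the same way each day. That is why slightly messy data can still be useful for tracking a trend when it would be useless for a league table.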

This has wider application to life. So often the problem with data and statistics is not the data itself. Rather it is that we don’t know how to interpret it and use it. That’s why statistics gets a bad name as equivalent to lying.  Yet used properly, data is important to our understanding of progress and helps provide early warnings of problems.

In church life, there is a healthy level of suspicion about counting. If we count numbers of attendees, baptisms, etc., then there is a real risk that it becomes a means to compare and boast. However, data and analysis can be useful. I am interested in things like:

  • Are we seeing people coming and settling, or coming and going?
  • Where are people coming from (other churches or new converts)?
  • Are we seeing conversions and baptisms?
  • Do we reflect the culture of our community (race, class, gender, age)?
  • Have we got the capacity (space, leadership, etc.) to care for the congregation?

Data can play a helpful part in getting us to stop and think about those questions. It is important that we then use it properly and wisely: not to boast, and not to become legalistically obsessed by the stats, but to help us think carefully and make wise decisions.

[1] In the political vote count context you are well into recount territory.
