For the past 18 months, hunkered down in his Tel Aviv apartment, Ariel Karlinsky has scoured the web for data that could help him calculate the true death toll of Covid-19.
The 31-year-old economics student at the Hebrew University of Jerusalem had never worked on health matters before, but he was troubled by rumours early in the pandemic that Israel was not experiencing a rise above expected death rates, and therefore Covid was not serious.
“This was, of course, not true,” he said. “Excess mortality was definitely there and it was definitely very visible.” He pulled up the numbers to prove it, which was easy enough to do in Israel with its sophisticated vital registration system.
But other rumours followed. One was that countries that had put in place no or minimal containment measures, such as Russia, were not experiencing significant excess mortality either. Again it was not true – but getting hold of the data to prove that was trickier.
Karlinsky realised this was the case for most countries. Even those that routinely gathered excess-mortality data often did not publish it until at least a year later, meaning they were unaware of a sensitive indicator of the pandemic’s scale and progress – one that could inform their response.
Gathering that data for as many countries as possible, and in as close to real time as possible, became his challenge.
Through Twitter he encountered another researcher, the data scientist Dmitry Kobak of the University of Tübingen in Germany, who was attempting the same thing, and they agreed to collaborate. While Karlinsky searched for the numbers, Kobak took on the analysis.
The result is the World Mortality Dataset, which forms the basis of estimates of Covid mortality as published by the Economist, the Financial Times and others, and which gives the lie to the official global death toll of 4.8 million. The Economist, for example, puts the real number closer to 16 million.
Those who measure the impact of public health disasters have applauded Karlinsky and Kobak’s effort. “This is a data revolution that parallels that seen in vaccine development and pathogen sequencing,” the epidemiologists Lone Simonsen, of Roskilde University in Denmark, and Cécile Viboud, of the US National Institutes of Health, wrote.
A pandemic’s death toll can be measured in various ways, all of which have advantages and disadvantages. The official number is derived from national reports of Covid deaths but these depend on testing rates and are almost always underestimates.
“The official Covid death tolls are just not credible at all for a large group of countries,” said the data journalist Sondre Ulvund Solstad, who leads the Economist’s pandemic tracking effort.
Excess mortality, defined as the increase in deaths from all causes over the level expected based on historical trends, does not depend on testing rates. It is an old tool, having been used to estimate the death tolls of historical pandemics – especially where there was no diagnostic test for the disease in question – but until now it has always been calculated retrospectively.
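As a rough illustration of that definition, the sketch below fits a straight-line trend to a few years of all-cause deaths, projects it forward, and subtracts the projection from the observed count. It is a minimal sketch, not Karlinsky and Kobak's actual model: the function names, the choice of a yearly linear trend and the death counts are all assumptions made up for illustration.

```python
def expected_deaths(history, target_year):
    """Extrapolate a least-squares straight line through (year, deaths) pairs.

    `history` maps year -> all-cause deaths. A plain linear trend is an
    assumption chosen for illustration, not the dataset's published method.
    """
    years = list(history)
    n = len(years)
    mean_year = sum(years) / n
    mean_deaths = sum(history.values()) / n
    # Ordinary least-squares slope: covariance(year, deaths) / variance(year).
    covariance = sum((y - mean_year) * (history[y] - mean_deaths) for y in years)
    variance = sum((y - mean_year) ** 2 for y in years)
    slope = covariance / variance
    return mean_deaths + slope * (target_year - mean_year)


def excess_deaths(history, target_year, observed):
    """Deaths from all causes above the level the historical trend predicts."""
    return observed - expected_deaths(history, target_year)


# Hypothetical all-cause death counts, invented for this example only.
history = {2015: 100_000, 2016: 101_000, 2017: 102_000, 2018: 103_000, 2019: 104_000}
print(excess_deaths(history, 2020, observed=120_000))
```

Because the calculation uses only all-cause deaths, nothing in it depends on how many people were tested for the disease. A negative result would mean fewer deaths than the trend predicted.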
Karlinsky and Kobak’s innovation is to collect and publish the data during a pandemic, for a swathe of the world, using established statistical techniques to fill in the gaps.
One disadvantage of excess mortality is that it is a composite. It captures not only Covid deaths but also deaths indirectly linked to the pandemic, such as those of cancer patients who could not get timely treatment or the victims of domestic abuse during lockdowns, without telling you much about the relative contributions of each.
By comparing the timing of excess-mortality peaks and lockdowns, however, Karlinsky and Kobak have shown that, in the case of Covid, excess mortality mainly reflects deaths from the disease.
Calculating excess mortality can also generate some strange results. In June, for example, they reported in the journal eLife that excess mortality had been negative in countries including Finland, South Korea and Australia – meaning fewer people had died there than in previous years – because those countries’ pandemic control had been excellent and they had also all but eliminated flu in 2020. In such cases, according to Simonsen and Viboud, official Covid deaths are a more accurate indicator of the pandemic’s toll.
The World Mortality Dataset contains information on more than 100 countries. Among those missing are most African and many Asian countries, including some of the world’s most populous and – judging by news reports and other sources – worst-affected. India, for example, does not routinely release national vital data, yet some researchers estimate its Covid death toll could be as high as 4 million.
Karlinsky and Kobak have scraped subnational data sources from these data-poor countries – or been supplied them by journalists, academics and dissidents living there – and applied various techniques of extrapolation to produce national estimates.
Or they have projected from neighbouring countries where data is available, adjusting for such factors as population density, Covid testing strategy and press freedom.
Uncertainty in the data is why Karlinsky and Kobak have avoided estimating the global death toll, but they say that, nationally, excess deaths are on average 1.4 times reported Covid deaths, which would give a rough global tally of 6.7 million.
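The arithmetic behind that rough tally can be checked in a few lines, assuming it is simply the official global toll of 4.8 million quoted earlier multiplied by the average national ratio. The variable names are illustrative, and applying a national average to a global total is itself a simplification.

```python
# Official global Covid death toll quoted earlier in this article.
reported_covid_deaths = 4.8e6
# Average national ratio of excess deaths to reported Covid deaths.
excess_to_reported_ratio = 1.4

# Assumption: the rough global tally is the product of the two figures.
rough_global_excess = reported_covid_deaths * excess_to_reported_ratio
print(f"{rough_global_excess / 1e6:.1f} million")
```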
Solstad’s modelling put the number between 9.9 million and 18.5 million, a range that Simonsen found reasonable.
To put these numbers in a historical perspective, she and Viboud took excess-mortality estimates for previous pandemics and adjusted them for the world’s population in 2020.
This gave death tolls for the previous four flu pandemics, if they had happened now, of 75 million (1918), 3.1 million (1957), 2.2 million (1968) and 0.4 million (2009).
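The simplest reading of such an adjustment is proportional scaling: multiply a historical death toll by the ratio of today's population to the population at the time. The sketch below assumes that reading. It is not Simonsen and Viboud's published procedure, and the figures in the example are round numbers invented for illustration, not their inputs.

```python
def scale_to_population(deaths_then, population_then, population_now):
    """Scale a historical death toll to a different population size.

    Assumes the same share of the population would die, so the toll
    grows in proportion to the population.
    """
    return deaths_then * population_now / population_then


# Hypothetical round numbers, chosen only to make the arithmetic easy to follow.
scaled = scale_to_population(
    deaths_then=1_000_000,
    population_then=2_000_000_000,
    population_now=8_000_000_000,
)
print(f"{scaled:,.0f}")
```

In this made-up case the population has quadrupled, so the scaled toll is four times the original.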
Covid is the deadliest pandemic in a century, they conclude, “but has nowhere near the death toll of the pandemic of 1918”.
The new dataset shows that countries that attracted international headlines for having severe outbreaks, such as Italy, Spain and the UK, have not actually been the worst affected.
The worst include Mexico and Bolivia – but also some countries in eastern Europe, which have experienced an increase in mortality of more than 50%. The worst affected, Peru, has recorded a 150% increase.
The dataset becomes more precise over time because some data trickles in with a time lag. Some countries asked their national statistics offices to accelerate the collection and publication of vital data early in 2020, but others either could not or would not release it. Turkey was expected to release monthly vital data for 2020 early this summer. It has not.
“Turkey is a prime example of a place where they have the numbers but they are not releasing them because they do not want to explain the discrepancies,” Karlinsky said.
In fact, he said, excess mortality could open a revealing sidelight on government transparency. If official Covid deaths were lower than excess deaths but followed roughly the same trajectory, it was likely the country simply lacked testing or vital registration capacity.
Solstad thinks excess mortality should be tracked continuously in future, because it would provide better insights into all kinds of crises, including wars and famines. “It’s a pretty objective measure of things going wrong,” he said. Karlinsky agrees. When a heatwave struck Egypt in 2015, for example, state media reported 61 deaths; his estimate was closer to 20,000.
Some countries may not wish to track it. In February, the World Health Organization took the first step towards harnessing excess mortality as a surveillance tool, when it set up an expert committee to assess Covid mortality.
Governments could act more quickly and proportionately if they knew a crisis was imminent. They would also be better equipped to convince the public of the need to act. “Some people truly believe that if we hadn’t done anything to stop this virus not much would have happened,” Simonsen said. What the World Mortality Dataset showed, she added, was that in many countries “a lot happened”.