The virus spreads disregarding the barriers of gender and age. But the symptoms are age and gender dependent. How does this enter into the data?
People can say what they want to say, but there is simply no getting around dealing with the actual numbers. Numbers such as the number of cases, number of deaths, needed hospital beds, needed ventilators and witnessing the effects of one response compared with another on this set of variables.
Suzanne Loftus, theglobalist.com
Let us look at the Num63r5.
Definitions
As we basically all know, the coronavirus we are talking about is named SARS-CoV-2. The pulmonary disease that it provokes is called COVID-19.
When the virus has infected a person's body and the infection can be detected by an approved test (e.g. a swab), that person is currently infected with SARS-CoV-2. A person who was infected and no longer is counts as recovered. An infected person who died is counted among the deaths.
Therefore, the total number of infected is the sum of the currently infected, the recovered, and those who died of COVID-19.
We know – with a good approximation – the number of deaths caused by the sickness COVID-19, as a consequence of the infection by the virus SARS-CoV-2.
For the groups of currently infected and recovered we know much less. The numbers we have are the total number of swab tests performed and, in greater detail, the number of people who tested positive, namely people who were swabbed and in whom SARS-CoV-2 was detected. The greater detail consists of their regional distribution and their age distribution.
The big issue is then the following: we don’t really know the number of currently infected people and we don’t know the number of those who have recovered. The reason is that not everyone has been or could be tested due to the limited capacity of the testing facilities. The tests based on the detection of antibodies may eventually remedy some of these problems.
The number of positively tested people is based on a selection criterion: in most countries, only people presenting symptoms and/or at risk have been tested.
Since the symptoms are age-dependent, the demography of the positively tested is obviously very different from the demography of the infected with the SARS-CoV-2 virus.
Demography: positively tested vs population age structure
We can compare the age distribution of the positively tested with the age structure of the Italian population. We focus on men; a similar calculation can be done for the female population.
Age class | Positives | Percent of positives | Demography | Deaths | Percent of deaths |
---|---|---|---|---|---|
0-9 | 596 | 1% | 9% | 0 | <0.01% |
10-19 | 901 | 1% | 10% | 0 | <0.01% |
20-29 | 3350 | 4% | 11% | 5 | 0.04% |
30-39 | 5344 | 7% | 12% | 28 | 0.2% |
40-49 | 9009 | 11% | 16% | 133 | 1% |
50-59 | 14779 | 19% | 16% | 606 | 5% |
60-69 | 14963 | 19% | 12% | 1776 | 14% |
70-79 | 15577 | 20% | 9% | 4532 | 35% |
80-89 | 12332 | 16% | 5% | 4992 | 38% |
90- | 2470 | 3% | 1% | 970 | 7% |
Total | 79321 | | | 13042 | |
From the percentages, we see that the age classes up to age 49 are under-represented in the statistics of the positives, whereas the older age classes are over-represented.
My claim is that the age distribution of the infected should follow the demography rather than the percentages of the positively tested. The reasons are:
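The comparison can be made explicit with a small sketch. It divides each age class's share of the positives by its share of the population, using the numbers from the table. The function name `representation_ratios` is my own choice for illustration, and the demographic shares are the rounded percentages shown in the table, so the ratios are only approximate.

```python
# Male positives per age class and rounded demographic shares (percent),
# copied from the table above.
AGE_CLASSES = ["0-9", "10-19", "20-29", "30-39", "40-49",
               "50-59", "60-69", "70-79", "80-89", "90-"]
POSITIVES = [596, 901, 3350, 5344, 9009, 14779, 14963, 15577, 12332, 2470]
DEMOGRAPHY_PCT = [9, 10, 11, 12, 16, 16, 12, 9, 5, 1]


def representation_ratios(positives, demography_pct):
    """Share of positives divided by share of population, per age class."""
    total = sum(positives)
    return [(p / total) / (d / 100) for p, d in zip(positives, demography_pct)]


ratios = representation_ratios(POSITIVES, DEMOGRAPHY_PCT)
for age, ratio in zip(AGE_CLASSES, ratios):
    # A ratio below 1 means the class is under-represented among the positives.
    label = "over" if ratio > 1 else "under"
    print(f"{age:>5}: {ratio:.2f} ({label}-represented)")
```

A ratio of 1 would mean that a class appears among the positives exactly as often as in the population.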
- 1) The virus does not know the age of the person before infecting the person. Its spread should be uniform in the entire population.
- 2) There is so far no evidence that the positivity of a test depends on the age.
- 3) There is evidence that the symptoms are age-dependent.
- 4) People with symptoms are more likely to be tested than people without symptoms.
The percentages of positives in the various age classes (column three in the above table) reflect a mixture of demography and the probability of experiencing COVID-19 symptoms; they do not reflect the true age distribution of the infected.
Number of infected and age-dependent death rate
There is some evidence that the overall death rate is possibly below 1%. A few studies even indicate that the infection fatality rate may be as low as 0.4% (as a reference: for influenza it is 0.01%). Let us nevertheless assume a rather large overall death rate, i.e. 1%. What we know with some accuracy is that the number of deaths in the Italian male population is 13042 (as of April 16, after excluding the cases of unknown age).
With a 1% overall infection fatality rate this gives a total of 1304200 infected men as of April 16 (some correction may be needed because the process is not at steady state, in which case this number refers to those infected 10 days earlier; this makes no difference for the calculation that follows).
This number should be distributed across the various age classes according to the demography. Thus we have a new table with the number of infected per age class, based on an infection fatality rate of 1%:
Age class | Infected | Deaths | Infection Fatality Rate |
---|---|---|---|
0-9 | 111200 | 0 | <0.01% |
10-19 | 123556 | 0 | <0.01% |
20-29 | 143462 | 5 | <0.01% |
30-39 | 156504 | 28 | 0.02% |
40-49 | 208672 | 133 | 0.06% |
50-59 | 208672 | 606 | 0.3% |
60-69 | 156504 | 1776 | 1% |
70-79 | 117378 | 4532 | 4% |
80-89 | 65210 | 4992 | 8% |
90- | 13042 | 970 | 7% |
Total | 1304200 | 13042 | – |
We see that the age-dependent infection fatality rate changes a lot across age classes but it takes values that probably reflect more realistically the risk of death in each one of the classes.
Some details of the calculation (skip it!)
In the above table, the number of infected in any age class is computed as follows:
infected (age_class = i) = Total_infected x demographic_proportion (age_class = i)
with some rounding to the nearest integer. Here, we also have
Total_infected = Total_deaths/infection_fatality_rate.
The age-dependent fatality rate is then computed as
fatality_rate (age_class = i) = deaths (age_class = i)/infected (age_class = i).
Based on this calculation, the age-dependent fatality rate grows linearly with the overall infection fatality rate, with a coefficient that is itself age-dependent:
fatality_rate (age_class = i) = [death_proportion (age_class = i)/demographic_proportion (age_class = i)] x infection_fatality_rate .
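For those who prefer code to formulas, here is a minimal sketch of the same calculation. The function name `age_dependent_ifr` is hypothetical. I assume the rounded demographic percentages from the first table; they add up to 101%, so the two youngest classes come out slightly larger here than in the table of infected, which presumably rests on unrounded shares.

```python
# Male deaths per age class and rounded demographic shares (percent),
# copied from the tables above.
AGE_CLASSES = ["0-9", "10-19", "20-29", "30-39", "40-49",
               "50-59", "60-69", "70-79", "80-89", "90-"]
DEATHS = [0, 0, 5, 28, 133, 606, 1776, 4532, 4992, 970]
DEMOGRAPHY_PCT = [9, 10, 11, 12, 16, 16, 12, 9, 5, 1]


def age_dependent_ifr(deaths, demography_pct, overall_ifr):
    """Return (infected per class, fatality rate per class)."""
    # Total_infected = Total_deaths / infection_fatality_rate
    total_infected = sum(deaths) / overall_ifr
    # infected(i) = Total_infected x demographic_proportion(i), rounded
    infected = [round(total_infected * d / 100) for d in demography_pct]
    # fatality_rate(i) = deaths(i) / infected(i)
    rates = [k / n for k, n in zip(deaths, infected)]
    return infected, rates


infected, rates = age_dependent_ifr(DEATHS, DEMOGRAPHY_PCT, overall_ifr=0.01)
for age, n, rate in zip(AGE_CLASSES, infected, rates):
    print(f"{age:>5}: {n:>7} infected, fatality rate {rate:.2%}")
```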
Upper limit to the overall death rate
The age-dependent death rates would obviously change as the overall death rate changes. So for instance with an overall death rate of 0.4%, the death rate for the age class above 80 is 3%. With an overall death rate of 2% instead, the death rate of the older age classes becomes 15%.
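This scaling can be checked with a few lines, shown here for the 80-89 class as an example. The helper `class_ifr` is a name I made up for this sketch, and the 5% demographic share is the rounded value from the first table.

```python
def class_ifr(deaths_in_class, total_deaths, demographic_share, overall_ifr):
    """Fatality rate of one age class for a given overall fatality rate."""
    # [death_proportion / demographic_proportion] x infection_fatality_rate
    death_share = deaths_in_class / total_deaths
    return death_share / demographic_share * overall_ifr


# 80-89 class: 4992 of 13042 male deaths, about 5% of the male population.
for overall in (0.004, 0.01, 0.02):
    rate = class_ifr(4992, 13042, 0.05, overall)
    print(f"overall {overall:.1%} -> age class 80-89: {rate:.1%}")
```

Because the coefficient in brackets does not depend on the overall rate, doubling the overall rate simply doubles the rate of every age class.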
The above calculation indicates that knowing the overall infection fatality rate is crucial for assessing the real risk of each age class, especially the most vulnerable ones.
However, the calculation done above also implies that the overall infection fatality rate cannot be arbitrarily large. The reason is that the larger the overall infection fatality rate, the smaller the number of infected in each age class. When the overall infection fatality rate is too large, the number of infected as computed here would become smaller than the number of positively tested, which is obviously impossible.
Based on this model, we come to the conclusion that the overall infection fatality rate cannot be larger than 5%. At this overall death rate, the death rate of the older age classes would be almost 40%, which reflects the naive calculation proposed in the newspapers.
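The limit follows from requiring that, in every age class, the computed number of infected is at least the number of positively tested. Rearranging gives overall rate ≤ Total_deaths x demographic_proportion / positives for each class, and the smallest of these values is the upper limit. Below is a rough sketch; the function name `max_overall_ifr` is hypothetical, and with the rounded demographic percentages of the first table the limit comes out a little above 5%.

```python
# Male positives, deaths and rounded demographic shares (percent),
# copied from the tables above.
AGE_CLASSES = ["0-9", "10-19", "20-29", "30-39", "40-49",
               "50-59", "60-69", "70-79", "80-89", "90-"]
POSITIVES = [596, 901, 3350, 5344, 9009, 14779, 14963, 15577, 12332, 2470]
DEATHS = [0, 0, 5, 28, 133, 606, 1776, 4532, 4992, 970]
DEMOGRAPHY_PCT = [9, 10, 11, 12, 16, 16, 12, 9, 5, 1]


def max_overall_ifr(positives, deaths, demography_pct):
    """Largest overall rate keeping infected >= positives in every class."""
    total_deaths = sum(deaths)
    # Per-class limit: Total_deaths x demographic_proportion / positives
    bounds = [total_deaths * (d / 100) / p
              for p, d in zip(positives, demography_pct)]
    # The class with the smallest limit constrains the overall rate.
    tightest = min(range(len(bounds)), key=bounds.__getitem__)
    return bounds[tightest], tightest


bound, index = max_overall_ifr(POSITIVES, DEATHS, DEMOGRAPHY_PCT)
print(f"upper limit {bound:.1%}, set by age class {AGE_CLASSES[index]}")
```

The oldest classes set the limit because they have the most positives relative to their share of the population.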
Regional effects
It would be interesting, if one had the time and the data, to perform this estimate region by region and to compare it with a random survey (based, for instance, on large-scale random testing). It could perhaps show two effects that are not considered here.
A first effect concerns the social structure of the population. My assumption is that the virus spreads uniformly within the population so that the distribution of the infected across the various age classes reflects the age distribution of the population. Maybe this is not true or it is not true everywhere. There may be regions where the older age classes, for instance, are rather disconnected and less socially active than the younger ones.
A second element concerns the capacity of the healthcare system. When the capacity has reached its maximum, a selection process based on an estimated survival probability may select against the older age classes thus pushing their death rate to higher values than under normal conditions.
All of it could be found out by looking more carefully at the num63r5.
Maybe this was interesting. Maybe not.
In any case, stay healthy and take care.