
SARS-CoV-2 vs COVID-19: Demography in Italy

The virus spreads regardless of gender and age, but the symptoms depend on both. How does this show up in the data?

People can say what they want to say, but there is simply no getting around dealing with the actual numbers. Numbers such as the number of cases, number of deaths, needed hospital beds, needed ventilators and witnessing the effects of one response compared with another on this set of variables.

Suzanne Loftus,

Let us look at the Num63r5.


As most of us know by now, the coronavirus we are talking about is named SARS-CoV-2. The pulmonary disease that it causes is called COVID-19.

When the virus has infected a person's body and the infection can be detected by an approved test (e.g. a swab), that person is currently infected with SARS-CoV-2. A person who was infected and is no longer infected is a recovered person. An infected person who died is counted among the deaths.

Therefore, the total number of infected is made up of the currently infected, the recovered, and those who died of COVID-19.

We know – with a good approximation – the number of deaths caused by the sickness COVID-19, as a consequence of the infection by the virus SARS-CoV-2.

For the currently infected and the recovered we know much less. The numbers we have concern the total number of swab tests performed and, in greater detail, the number of positively tested people, namely people who were swabbed and in whom the SARS-CoV-2 virus was detected. The greater detail concerns their regional distribution and their age distribution.

The big issue is then the following: we don’t really know the number of currently infected people and we don’t know the number of those who have recovered. The reason is that not everyone has been or could be tested due to the limited capacity of the testing facilities. The tests based on the detection of antibodies may eventually remedy some of these problems.

The number of positively tested people is based on a selection criterion: in most countries, only people presenting symptoms and/or at risk have been tested.

Since the symptoms are age-dependent, the demography of the positively tested is very different from the demography of those infected with the SARS-CoV-2 virus.

Demography: positively tested vs population age structure

We can compare the numbers and compute the demography of the positively tested vs the demography of the Italian population. We focus on men. A similar calculation can be done for women.

Age class | Positives | Percent of positives | Demography | Deaths | Percent of deaths
Age distribution as of April 16 in Italy for men. The first column lists the age classes (the cases of unknown age are excluded). The second column contains all positively tested cases in each age class so far (I assume that double testing of the same person is excluded). The third column contains the percent of positively tested out of the total of positively tested. The fourth column contains the demographic age distribution of the male population in Italy from the 2019 census. The fifth column contains the number of COVID-19 deaths in each age class, and the last column the percent of deaths in each age class with respect to the total number of deaths in men. Data are from ISS and ISTAT.

From the percentages, we see that the age classes up to the age of 49 are under-represented among the positives, whereas the older age classes are over-represented.
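The comparison can be sketched as a simple ratio between the share of positives and the demographic share in each age class. This is only an illustration: the function name, the coarse age classes, and the percentages are placeholders I made up, not the ISS/ISTAT figures.

```python
def representation_ratio(percent_of_positives, percent_of_population):
    """Ratio > 1: over-represented among positives; ratio < 1: under-represented."""
    return {
        age: percent_of_positives[age] / percent_of_population[age]
        for age in percent_of_positives
    }

# Placeholder percentages for illustration only (NOT the real data).
positives = {"0-49": 25.0, "50+": 75.0}
population = {"0-49": 55.0, "50+": 45.0}

ratios = representation_ratio(positives, population)
for age, ratio in ratios.items():
    label = "over" if ratio > 1 else "under"
    print(f"{age}: ratio {ratio:.2f} ({label}-represented)")
```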

My claim is that the age distribution of the infected should follow the demography rather than the percentages of the positively tested. The reasons are:

  • 1) The virus does not know the age of the person before infecting the person. Its spread should be uniform in the entire population.
  • 2) There is so far no evidence that the positivity of a test depends on the age.
  • 3) There is evidence that the symptoms are age-dependent.
  • 4) People with symptoms are more likely to be tested than people without symptoms.

The percentages of positives in the various age classes (column three in the above table) reflect a mixture of demography and the probability of experiencing COVID-19 symptoms. They do not reflect the true age distribution of the infected.

Number of infected and age-dependent death rate

There is some evidence that the overall death rate is possibly below 1%. A few studies even indicate that the infection fatality rate may be as small as 0.4% (as a reference: for influenza it is 0.01%). Let us still assume that the overall death rate is rather large, i.e. 1%. What we know with a certain accuracy is that the number of deaths in the Italian male population is 13042 (as of April 16, after excluding the cases of unknown age).

With a 1% overall infection fatality rate, this gives a total number of infected equal to 1304200 men as of April 16. Some corrections may be necessary because the process is not at steady state, in which case we should say that this number refers to the infected 10 days earlier. This makes no difference for the calculation that follows.
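As a quick check of this arithmetic, here is a minimal sketch. The function name is my own; the death count and the two fatality rates are the ones quoted in the text.

```python
def estimate_total_infected(deaths, infection_fatality_rate):
    """Total infected implied by a death count and an assumed overall fatality rate."""
    return round(deaths / infection_fatality_rate)

deaths_men = 13042  # deaths among men as of April 16, from the text

print(estimate_total_infected(deaths_men, 0.01))   # assumed overall rate of 1%
print(estimate_total_infected(deaths_men, 0.004))  # lower rate of 0.4% mentioned above
```

The smaller the assumed fatality rate, the larger the implied number of infected.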

This number should be distributed in the various age classes based on the demography. Thus we have a new table with the number of infected per age based on the infection fatality rate of 1%:

Age class | Infected | Deaths | Infection Fatality Rate
Age-dependent death rate in Italy for men. The first column lists the age classes (the cases of unknown age are excluded). The second column contains the estimated number of infected based on an overall death rate of 1% and on the demographic age distribution. The third column contains the number of COVID-19 deaths in each age class. The fourth column is the ratio (expressed as a percentage) of the number of deaths (third column) to the corresponding estimated number of infected (second column). Data are from ISS and ISTAT. Elaboration is original.

We see that the age-dependent infection fatality rate changes a lot across age classes but it takes values that probably reflect more realistically the risk of death in each one of the classes.

Some details of the calculation (skip it!)

In the above table, the number of infected in any age class is computed as follows:

infected (age_class = i) = Total_infected x demographic_proportion (age_class = i)

with some rounding to the next integer. Here, we also have

Total_infected = Total_deaths/infection_fatality_rate.

The age-dependent fatality rate is then computed as

fatality_rate (age_class = i) = deaths (age_class = i)/infected (age_class = i).

Based on this calculation, the age-dependent fatality rate grows linearly with the overall infection fatality rate, with a coefficient that is itself age-dependent:

fatality_rate (age_class = i) = [death_proportion (age_class = i)/demographic_proportion (age_class = i)] x infection_fatality_rate .
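The steps above can be sketched in a few lines of Python. The function name, the coarse age classes, and all numbers are placeholders chosen for illustration, not the ISS/ISTAT data. Rounding uses Python's `round`, which may differ slightly from the rounding I used in the table.

```python
def age_fatality_rates(deaths_by_age, demographic_share, overall_ifr):
    """Age-dependent fatality rate, assuming the infected follow the demography."""
    total_deaths = sum(deaths_by_age.values())
    total_infected = total_deaths / overall_ifr
    rates = {}
    for age, deaths in deaths_by_age.items():
        # infected(i) = Total_infected x demographic_proportion(i)
        infected = round(total_infected * demographic_share[age])
        # fatality_rate(i) = deaths(i) / infected(i)
        rates[age] = deaths / infected
    return rates

# Placeholder inputs for illustration only (NOT the real data).
deaths_by_age = {"0-49": 100, "50-79": 5000, "80+": 4900}
demographic_share = {"0-49": 0.55, "50-79": 0.38, "80+": 0.07}

rates = age_fatality_rates(deaths_by_age, demographic_share, overall_ifr=0.01)
for age, rate in rates.items():
    print(f"{age}: {100 * rate:.2f}%")
```

With these placeholders the oldest class holds 49% of the deaths but 7% of the population, so its rate is 0.49/0.07 = 7 times the overall rate, as the closed-form expression predicts.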

Upper limit to the overall death rate

The age-dependent death rates would obviously change as the overall death rate changes. So for instance with an overall death rate of 0.4%, the death rate for the age class above 80 is 3%. With an overall death rate of 2% instead, the death rate of the older age classes becomes 15%.
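Because the relation is linear, the two figures quoted above imply a fixed multiplier for the oldest age class: 3% / 0.4% = 7.5. A small sketch, with names of my own choosing and the multiplier inferred from these quoted figures rather than recomputed from the table:

```python
# Multiplier for the oldest age class, inferred from the text: 3% / 0.4% = 7.5.
OLDEST_CLASS_COEFFICIENT = 7.5

def oldest_class_rate(overall_rate_percent):
    """Death rate (in percent) of the oldest class for a given overall rate (in percent)."""
    return OLDEST_CLASS_COEFFICIENT * overall_rate_percent

for overall in (0.4, 1.0, 2.0):
    print(f"overall {overall}% -> oldest class {oldest_class_rate(overall):.1f}%")
```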

The above calculation indicates that assessing the overall infection fatality rate is crucial to assess the real risk of each age class especially for the weakest of these classes.

However, the calculation done above also implies that the overall infection fatality rate cannot be arbitrarily large. The reason is that the larger the overall infection fatality rate, the smaller the number of infected in each age class. When the overall infection fatality rate is too large, the number of infected as computed here would become smaller than the number of positively tested, which is obviously impossible.

Based on this model, we come to the conclusion that the overall infection fatality rate cannot be larger than 5%. At this overall death rate, the death rate of the older age classes would be almost 40%, which reflects the naive calculation proposed in the newspapers.
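The constraint can be written down explicitly. In each age class the estimated infected, Total_deaths / infection_fatality_rate x demographic_proportion, must be at least the number of positives, so the overall rate is bounded by the smallest value of Total_deaths x demographic_proportion / positives across the classes. The sketch below uses a function name and placeholder numbers of my own, not the real data, so it does not reproduce the 5% figure.

```python
def max_overall_ifr(total_deaths, demographic_share, positives_by_age):
    """Largest overall rate for which estimated infected >= positives in every class."""
    return min(
        total_deaths * demographic_share[age] / positives
        for age, positives in positives_by_age.items()
        if positives > 0
    )

# Placeholder inputs for illustration only (NOT the real data).
total_deaths = 10000
demographic_share = {"0-49": 0.55, "50-79": 0.38, "80+": 0.07}
positives_by_age = {"0-49": 20000, "50-79": 50000, "80+": 10000}

bound = max_overall_ifr(total_deaths, demographic_share, positives_by_age)
print(f"upper limit on the overall rate: {100 * bound:.1f}%")
```

The binding class is the one with the most positives relative to its share of the population, which in practice is the oldest one.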

Regional effects

It would be interesting, if one had time and data available, to perform this estimate region by region and to compare it with a random survey (based for instance on large-scale random testing). It could perhaps show two effects that are not considered here.

A first effect concerns the social structure of the population. My assumption is that the virus spreads uniformly within the population so that the distribution of the infected across the various age classes reflects the age distribution of the population. Maybe this is not true or it is not true everywhere. There may be regions where the older age classes, for instance, are rather disconnected and less socially active than the younger ones.

A second element concerns the capacity of the healthcare system. When the capacity has reached its maximum, a selection process based on an estimated survival probability may select against the older age classes thus pushing their death rate to higher values than under normal conditions.

All of this could be found out by looking more carefully at the num63r5.

Maybe this was interesting. Maybe not.

In any case, stay healthy and take care.
