virophysics

# COVID-19: A closer look

Hi there!

Curious about the math behind infection spread? I've posted a quick intro below.

## Will we get out of this? If so, how?

Unless a reasonably effective vaccine and/or antiviral therapy for SARS-CoV-2 can be found, our options are:
• Fully open and let the infection spread freely. This would result in at least 2 million dead Canadians, and possibly more: to do this quickly, we would have to let our healthcare system be overwhelmed, so non-COVID-19 deaths (car accidents, cancer, major cardiovascular events, etc.) would also rise sharply.

• Close/Open/Close/Open until we achieve herd immunity, i.e. until most people have had the virus, so that immunity gives us some protection against re-infection and keeps us from getting very sick if we are infected again, and virus transmission and mortality drop to more flu-like levels. From March to July 2020, about 117,000 people were infected with SARS-CoV-2 Canada-wide: that's about 0.3% of the population infected over 5 months. At this rate, it would take about 135 years to infect the entire Canadian population. Since immunity probably doesn't last forever, at this rate we could never achieve herd immunity.

• Strictly keep case counts at/near zero with intensive case tracing. In this scenario, we have a strict, complete shutdown until we bring the case count to zero and have 2 full weeks with no new cases. Then we re-open fully (normal life resumes, no stages), but with comprehensive case tracing. As soon as there is any indication that case tracing has failed (new infections cannot be traced to a specific cluster or infected person), we close fully and immediately. Acting at once keeps the closure short (the longer we wait, the longer we would need to stay closed) and fairly localized (just wide enough to catch the entire possible transmission chain for all cases of unknown source).
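The 135-year figure in the second option can be checked with a few lines of arithmetic. This is only a rough sketch: the population value below (about 37.6 million) is an assumption of mine, not a number stated above, and the pace of infection is taken as constant.

```python
# Back-of-the-envelope check of the "135 years" estimate.
infected = 117_000        # infections from March to July 2020 (from the text)
months_elapsed = 5        # March through July
population = 37_600_000   # ASSUMPTION: approximate Canadian population in 2020

# Fraction of the population infected over the 5-month window.
fraction_infected = infected / population

# Time needed to reach 100% infected if the same pace simply continued.
years_to_infect_all = (months_elapsed / 12) / fraction_infected

print(f"Fraction infected so far: {fraction_infected:.2%}")
print(f"Years to infect everyone at this pace: {years_to_infect_all:.0f}")
```

With this assumed population the result lands a little under 135 years; a slightly different population figure shifts it by a few years either way.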

### Counts of new cases and deaths each day

C stands for the daily new Cases (red), and D for the daily Deaths (black). The dots are the data; the (not always visible) red line is the smoothed data (Gaussian kernel, $$\sigma = 2$$ to 4 days, applied to the log of the daily case count). The blue line is a set of linear segments identified via the segmented-regression method described by V. Muggeo. Each segment corresponds to a period with a given rate of increase/decrease of daily counts, and the segments are separated by vertical red dashed lines, which likely correspond to the start/end of certain public health measures. The black and blue lines for the daily death counts are the red and blue lines for the daily case counts, shifted by eye by a number of days and multiplied by a percentage, both indicated in the graph title. These numbers suggest Japan (12 days from Case identified to Death, 5% of Cases result in Death) is possibly doing better than France (7 days from Case to Death, 18% of Cases result in Death): France seemingly takes 5 days longer to report infected individuals as cases, and identifies 3-4 times fewer infected individuals as cases. This assumes a similar likelihood of death from infection in both countries, which might not be correct.
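To make the smoothing and the shift-and-scale step concrete, here is a minimal sketch in plain Python. It is not the code used to make the graphs: the function names, the edge handling, the flooring of zero counts, and the synthetic case counts are all assumptions for illustration.

```python
import math

def smooth_log_counts(daily_counts, sigma_days=3.0):
    """Gaussian-kernel smoothing applied to log10 of the daily counts."""
    # ASSUMPTION: counts below 1 are floored at 1 so the log is defined.
    logs = [math.log10(max(c, 1)) for c in daily_counts]
    n = len(logs)
    smoothed = []
    for i in range(n):
        # Kernel weights centred on day i; dividing by their sum
        # renormalizes the kernel where it is truncated at the edges.
        weights = [math.exp(-0.5 * ((i - j) / sigma_days) ** 2) for j in range(n)]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, logs)) / total)
    # Return to linear scale.
    return [10 ** y for y in smoothed]

def deaths_from_cases(case_curve, lag_days, fatality_fraction):
    """Shift the case curve by lag_days and scale it by fatality_fraction."""
    return [
        fatality_fraction * case_curve[t - lag_days] if t >= lag_days else None
        for t in range(len(case_curve))
    ]

# Synthetic daily case counts (made up for illustration, not real data).
cases = [10, 14, 11, 18, 25, 22, 31, 40, 37, 52, 60, 71, 66, 90]
smoothed_cases = smooth_log_counts(cases, sigma_days=2.0)

# Japan-like parameters quoted above: 12-day lag, 5% of cases result in death.
predicted_deaths = deaths_from_cases(smoothed_cases, lag_days=12, fatality_fraction=0.05)
```

Smoothing the log rather than the raw count keeps exponential growth looking like a straight line, which is what the linear segments are then fitted to.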

### World scale

Data from:
World: https://covid.ourworldindata.org/data/owid-covid-data.csv
Japanese prefectures: https://github.com/reustle/covid19japan-data

### The math behind epidemiological infection spread

Simplistically, one can imagine that the (unconstrained) growth early in the epidemic, when everyone is susceptible, follows

$N(t) = N(0) \cdot R_0^{t/t_R}$

where $$N(t)$$ is the number of people infected at time $$t$$, given that $$N(0)$$ people were infected at time $$t=0$$, that each person infects $$R_0$$ other people, and that the time between when you are infected and when you are no longer infectious (e.g. because you have died or recovered) is $$t_R$$. Taking the $$\log_{10}$$ of both sides you get:

\begin{align*} \log_{10}[ N(t) ] &= \log_{10}[N(0)] + \frac{\log_{10}[R_0]}{t_R}\ t \\ y &= b + m x \end{align*}

You can perform a linear regression of $$y=\log_{10}[N(t)]$$ versus $$x=t$$, and you'll find that $$b=\log_{10}[N(0)]$$ is your $$y$$-intercept, and $$m=\log_{10}[R_0]/t_R$$ is your slope, as in $$y = b+mx$$. This means that $$10^{m} = R_0^{(1\,\mathrm{day})/t_R}$$ where $$t_R$$ is in units of days, and then $$10^m$$ is such that

$N(\mathrm{one\ day\ later}) = N(\mathrm{now}) \cdot 10^m$
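The regression described above can be sketched as follows. The data are synthetic, generated from the growth formula itself, and the values of $$N(0)$$, $$R_0$$ and $$t_R$$ are illustrative assumptions, not estimates for any real outbreak. The helper name `fit_log10_line` is also mine.

```python
import math

def fit_log10_line(days, counts):
    """Ordinary least-squares fit of log10(counts) = b + m * days."""
    ys = [math.log10(c) for c in counts]
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, ys))
    m = sxy / sxx             # slope: log10(R0) / t_R
    b = mean_y - m * mean_x   # intercept: log10(N(0))
    return b, m

# Synthetic data generated from N(t) = N(0) * R0**(t / t_R).
# ASSUMPTION: N(0) = 50, R0 = 2.5 and t_R = 10 days are illustrative values only.
n0, r0, t_r = 50.0, 2.5, 10.0
days = list(range(15))
counts = [n0 * r0 ** (t / t_r) for t in days]

b, m = fit_log10_line(days, counts)
print(f"N(0) recovered from intercept: {10 ** b:.1f}")
print(f"Daily multiplier 10^m: {10 ** m:.4f}")
print(f"R0 recovered from slope (given t_R): {10 ** (m * t_r):.2f}")
```

Note that the slope only pins down the ratio $$\log_{10}[R_0]/t_R$$: to turn it into a value of $$R_0$$ you need an independent estimate of $$t_R$$.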

So if $$10^m = 1.2$$, it corresponds to a growth rate of 20%/day, which is $$(10^m - 1)\cdot100$$%. You can use similar reasoning to compute the doubling time, the time $$t_\mathrm{doubling}$$ (in days) such that $$R_0^{[t_\mathrm{doubling}/t_R]} = 2$$. Taking the $$\log_{10}$$ of both sides gives $$t_\mathrm{doubling} = t_R \log_{10}(2) / \log_{10}(R_0) = \log_{10}(2)/m$$.
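As a quick numerical check, both quantities can be computed directly from the slope $$m$$. Solving $$R_0^{[t_\mathrm{doubling}/t_R]} = 2$$ is the same as solving $$10^{m\,t_\mathrm{doubling}} = 2$$, so only $$m$$ is needed. The function names below are my own, and the sketch assumes $$m > 0$$ (growing counts).

```python
import math

def daily_growth_percent(m):
    """Percent increase in daily counts from one day to the next."""
    return (10 ** m - 1) * 100

def doubling_time_days(m):
    """Days needed for the counts to double, from 10**(m * t) = 2."""
    return math.log10(2) / m

# Slope corresponding to 10^m = 1.2, the example used above.
m = math.log10(1.2)
growth = daily_growth_percent(m)
t_doubling = doubling_time_days(m)

print(f"Growth rate: {growth:.1f} %/day")
print(f"Doubling time: {t_doubling:.1f} days")
```

For 20%/day growth this gives a doubling time of about 3.8 days. A negative slope would return a negative number, whose magnitude is the halving time.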

Now what is this $$R_0$$ all about? It is called the basic reproductive number (see the movie Contagion or Wikipedia). It is the number of people one infected person will infect over the period of time for which they were infectious, i.e. from the time they became infectious to the time they no longer are, if everyone they encounter is susceptible, assuming some average rate of encounter. But as we impose physical distancing measures, an infected person will encounter fewer people per day, and thus will not infect as many people during their infectious period. Say physical distancing measures mean you encounter half as many people as you used to, then you get something like:

\begin{align*} N(t) &= N(0) \left(\frac{R_0}{2}\right)^{t/t_R} \\ m &= \frac{ \log_{10}\left[R_0/2 \right] }{t_R} \end{align*}

so your slope suddenly changes when the effect of the physical distancing measures appears in the case counts, approximately a time $$t_R$$ after they are introduced. So you can fit the data as a set of linear segments, with one linear regression for each segment corresponding to a new set of physical distancing measures. You can also estimate how effective the measures were by comparing the growth rate before and after the bend: that's what flattening the curve means.
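Comparing the slopes on either side of a bend can be sketched like this. The values of $$R_0$$ and $$t_R$$ are illustrative assumptions, and `contact_factor` is a name I introduce for the ratio of encounters after versus before the measures.

```python
import math

def slope(r_eff, t_r):
    """Slope m of log10(N) versus time for a given reproductive number."""
    return math.log10(r_eff) / t_r

# ASSUMPTION: R0 = 2.5 and t_R = 10 days are illustrative values only.
r0, t_r = 2.5, 10.0

m_before = slope(r0, t_r)      # before physical distancing
m_after = slope(r0 / 2, t_r)   # after halving the number of encounters

# The change in slope, scaled by t_R, gives back the factor by which
# encounters were reduced: 10**((m_after - m_before) * t_R) = 1/2.
contact_factor = 10 ** ((m_after - m_before) * t_r)

print(f"Daily multiplier before: {10 ** m_before:.3f}")
print(f"Daily multiplier after:  {10 ** m_after:.3f}")
print(f"Estimated contact factor: {contact_factor:.2f}")
```

In this example halving encounters flattens the curve but does not turn it around, because $$R_0/2 = 1.25$$ is still above 1. As with $$R_0$$ itself, recovering the contact factor from real data requires an independent estimate of $$t_R$$.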

Similarly, as the pandemic progresses, more and more of the people you encounter have already been infected, i.e. they have recovered, and they are presumably no longer susceptible because they have acquired (temporary?) immunity. This has an effect similar to physical distancing in that it reduces how many people you can infect over your infectious period, something like

$N(t) = N(0) \left[ R_0 \left(1 - \frac{C(t)}{\mathrm{Total}} \right) \right]^{t/t_R}$

where Total is the total population (e.g. of Ontario), so $$[1 - C(t)/\mathrm{Total}]$$ is the fraction of the population that is still susceptible (has not yet been infected), and $$C(t)$$ is the cumulative number of all people infected thus far. So if $$2/3$$ of the population has already been infected, you can only infect $$1/3$$ as many people as you would have at the start of the pandemic when everyone was susceptible, i.e. $$R_0/3$$ instead of $$R_0$$.
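The effect of a shrinking susceptible pool can be sketched as below. The function names and the numbers are illustrative assumptions. The second function follows from setting $$R_0 [1 - C(t)/\mathrm{Total}] = 1$$, the point at which each infected person infects, on average, just one other and growth stops under this simple picture.

```python
def effective_r(r0, cumulative_infected, total_population):
    """R0 scaled by the fraction of the population still susceptible."""
    susceptible_fraction = 1 - cumulative_infected / total_population
    return r0 * susceptible_fraction

def infected_fraction_for_no_growth(r0):
    """Infected fraction at which the effective R equals 1."""
    return 1 - 1 / r0

# ASSUMPTION: R0 = 3 and a population of 3 million are illustrative values only.
r0 = 3.0
total = 3_000_000

r_start = effective_r(r0, 0, total)            # everyone still susceptible
r_later = effective_r(r0, 2_000_000, total)    # 2/3 already infected

print(f"Effective R at the start: {r_start:.2f}")
print(f"Effective R with 2/3 infected: {r_later:.2f}")
print(f"Infected fraction where growth stops: {infected_fraction_for_no_growth(r0):.2f}")
```

This treats recovered people as fully and permanently immune, which, as noted above, may not hold.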