# Weekly Incidence Including Delay

A few days ago I wrote about some math behind a scenario where you’re trying to identify a new epidemic based on signals proportional to incidence, and ended up deriving:

$$\frac{i(t)}{c(t)} = k = \frac{\ln(2)}{T_d}$$

Where:

- $i(t)$ is incidence ("how many people are getting sick now")
- $c(t)$ is cumulative infections ("how many people have gotten sick so far")
- $k$ is the exponential growth rate
- $T_d$ is the doubling time (redundant with $k$)
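As a quick numeric check of this relationship, here is a minimal Python sketch. It assumes pure exponential growth with cumulative infections normalized to $c(t) = e^{kt}$, so that incidence is the derivative $i(t) = k e^{kt}$. The function names are illustrative and not from the earlier post.

```python
import math

def growth_rate(doubling_time):
    """Exponential growth rate k for a doubling time T_d (same time units)."""
    return math.log(2) / doubling_time

def cumulative(t, k):
    """Cumulative infections under the assumed model c(t) = e^(k t)."""
    return math.exp(k * t)

def incidence(t, k):
    """Incidence as the time derivative of c(t): i(t) = k e^(k t)."""
    return k * math.exp(k * t)

k = growth_rate(1.0)  # doubling weekly, with time measured in weeks
ratio = incidence(5.0, k) / cumulative(5.0, k)
print(round(ratio, 4))  # 0.6931, i.e. ln(2), whatever t we pick
```

The ratio is the same at every $t$, which is why a single threshold on incidence can stand in for a threshold on cumulative infections in this model.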

One big problem with this model, however, is that any conclusions you make today aren’t driven by current incidence, but instead some kind of delayed incidence. There is, unavoidably, time from infection until you’re making your decision, during which the disease is spreading further:

If your signal is "people arrive at the hospital and the doctors notice a weird cluster", then you need to wait for each infection to progress far enough to result in hospitalization.

How do the conclusions of the previous post change if we extend our model to account for a delay, but keep the goal of flagging an epidemic before 1% of people have been infected?

Recall that last time we estimated that, for something doubling weekly, when 1% of people have ever been infected then 0.69% of people became infected in the last seven days. With a delay of a week, however, by the time we learn that incidence has hit 0.69% many more people will have been infected and we’d have missed our 1% goal by a lot. The effect of delay is that during this time the epidemic will make further progress, which will depend on the growth rate: with a shorter doubling period there will be more progress. Can we get an equation relating cumulative infections to delayed incidence?
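To put a number on "missed by a lot": under the same pure-exponential assumption, cumulative infections grow by a factor of $2^{d/T_d}$ during a delay $d$. A minimal sketch, where the helper name is hypothetical:

```python
def growth_during_delay(delay, doubling_time):
    """Factor by which cumulative infections grow over `delay`
    (same time units as `doubling_time`), assuming exponential growth."""
    return 2 ** (delay / doubling_time)

target = 0.01  # the 1% cumulative-infection goal

# Weekly doubling with a one-week delay: by the time the signal arrives,
# cumulative infections have doubled past the target.
actual = target * growth_during_delay(delay=7, doubling_time=7)
print(f"{actual:.1%}")  # 2.0%
```

With a three-day doubling time the same one-week delay gives a factor of `2 ** (7 / 3)`, about 5, which is the sense in which faster growth means more progress during the delay.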

Let’s call the delay $d$. Instead of $i(t)/c(t)$ we now want $i(t-d)/c(t)$. That is, what’s the relationship between cumulative infections and what incidence was when the information we’re now getting was derived?

We can do a bit of math:

$$\frac{i(t-d)}{c(t)} = \frac{k e^{k(t-d)}}{e^{kt}} = k e^{-kd} = \frac{\ln(2)}{T_d}\, e^{-\ln(2)\, d / T_d}$$
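The closed form can be checked against a direct computation. This is a sketch under the same assumptions as the derivation ($c(t) = e^{kt}$, $i(t) = k e^{kt}$), with time in days and illustrative function names:

```python
import math

def delayed_ratio(delay, doubling_time):
    """Closed form: i(t - d) / c(t) = k * e^(-k d), with k = ln(2) / T_d."""
    k = math.log(2) / doubling_time
    return k * math.exp(-k * delay)

def delayed_ratio_direct(t, delay, doubling_time):
    """Same ratio computed directly from i(t) = k e^(k t) and c(t) = e^(k t)."""
    k = math.log(2) / doubling_time
    return (k * math.exp(k * (t - delay))) / math.exp(k * t)

# The ratio should not depend on t.
for t in (10.0, 30.0, 50.0):
    assert math.isclose(delayed_ratio(7, 7), delayed_ratio_direct(t, 7, 7))

# Weekly doubling, one-week delay, expressed per week rather than per day.
print(round(delayed_ratio(7, 7) * 7, 4))  # 0.3466, half of ln(2)
```

When the delay equals the doubling time, the delayed ratio is exactly half the undelayed one, since the epidemic has doubled in the meantime.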

What does this look like, for a few different potential delay values?
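One way to explore this is to tabulate the weekly incidence, as a share of the population, that the system would have to detect in order to flag the epidemic by the time cumulative infections reach 1%. The delays and doubling times below are arbitrary illustrative choices, and "weekly incidence" is approximated as seven times the daily rate $k e^{-kd}$, matching the 0.69% figure above.

```python
import math

def weekly_incidence_at_threshold(delay_days, doubling_days, threshold=0.01):
    """Delayed weekly incidence (as a population share) observed at the moment
    cumulative infections hit `threshold`, assuming exponential growth."""
    k = math.log(2) / doubling_days  # growth rate per day
    return threshold * 7 * k * math.exp(-k * delay_days)

delays = [0, 7, 14]               # days of delay (illustrative)
doubling_times = [3, 7, 14, 28]   # days per doubling (illustrative)

print("T_d (days)" + "".join(f"{'d=' + str(d):>10}" for d in delays))
for td in doubling_times:
    row = "".join(f"{weekly_incidence_at_threshold(d, td):>10.3%}" for d in delays)
    print(f"{td:<10}{row}")
```

The weekly-doubling, zero-delay cell comes out at 0.693%, and each additional doubling time of delay halves the value in that row.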

My main takeaway is that, under these assumptions, delay matters less than I would initially have guessed: the sensitivity you need to design for is driven by the need to catch slow-growing epidemics. For example, even with up to about eleven days of delay, a system sensitive enough to flag an epidemic that doubles every four weeks is sensitive enough to detect one that doubles every three days.
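The four-week versus three-day comparison can be checked by solving for the delay at which both epidemics demand the same sensitivity. Setting $k_1 e^{-k_1 d} = k_2 e^{-k_2 d}$ gives $d = \ln(k_2/k_1) / (k_2 - k_1)$. A minimal sketch under the same pure-exponential assumptions, with hypothetical helper names:

```python
import math

def required_ratio(delay, doubling_time):
    """i(t - d) / c(t) = k * e^(-k d); a lower value means more sensitivity is needed."""
    k = math.log(2) / doubling_time
    return k * math.exp(-k * delay)

def crossover_delay(slow_doubling, fast_doubling):
    """Delay at which the slow and fast epidemics require equal sensitivity."""
    k_slow = math.log(2) / slow_doubling
    k_fast = math.log(2) / fast_doubling
    return math.log(k_fast / k_slow) / (k_fast - k_slow)

d_star = crossover_delay(slow_doubling=28, fast_doubling=3)  # days
print(round(d_star, 1))  # 10.8
```

Below this delay the slow epidemic is the binding constraint. Above it, the fast one becomes harder to catch in time.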

This isn’t the whole story, though, because many of the actions you would want to take post-discovery are more urgent with higher growth rates. This means you need to count the delay of your core response (e.g., implementing non-pharmaceutical interventions, or NPIs) in the total delay: you can’t allocate the entire delay budget to the detection system.

Another thing to note is that this is all under the assumption that detection depends on incidence exceeding a threshold, and it’s unclear whether this is actually the case for several potential systems. For example, with wastewater sequencing my current best guess is that detection would primarily rely on the total number of observations of the pathogen, which would be proportional to (delayed) cumulative infections and not (delayed) incidence. With detection based on cumulative infections, minimizing delay matters a lot more.
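To see why, compare the two kinds of signal under the same exponential model. This is only a sketch: it assumes a cumulative-based signal is proportional to $c(t-d)$, so its ratio to current cumulative infections is $c(t-d)/c(t) = e^{-kd} = 2^{-d/T_d}$. The function names are illustrative.

```python
import math

def delayed_incidence_signal(delay, doubling_time):
    """i(t - d) / c(t) = k * e^(-k d): the incidence-based ratio from the text."""
    k = math.log(2) / doubling_time
    return k * math.exp(-k * delay)

def delayed_cumulative_signal(delay, doubling_time):
    """c(t - d) / c(t) = 2^(-d / T_d): assumed cumulative-based ratio."""
    return 2 ** (-delay / doubling_time)

delay = 7  # days
for doubling_time in (28, 3):  # days
    inc = delayed_incidence_signal(delay, doubling_time)
    cum = delayed_cumulative_signal(delay, doubling_time)
    print(f"T_d={doubling_time:>2}d  incidence/day={inc:.4f}  cumulative={cum:.3f}")
```

With a one-week delay, the cumulative-based ratio falls from about 0.84 at a four-week doubling time to about 0.20 at a three-day doubling time, with no factor of $k$ to offset it, while the incidence-based ratio actually rises over the same range.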