
Example to estimate incubation period

Flavio Finger

2023-01-13

Description

This package contains two functions useful to compute the incubation period distribution from outbreak data. The inputs needed for each patient are given as a data.frame or linelist object and must contain the date of symptom onset and one or more possible dates of exposure.

The function empirical_incubation_dist() computes the discrete probability distribution by giving equal weight to each patient. Thus, with N patients in total, each of the n possible exposure dates of a given patient gets the weight 1/(n*N). The function returns a data frame with two columns: incubation_period, which lists the incubation periods in steps of one day, and relative_frequency, which gives the weight of each period.
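
The weighting rule can be illustrated with a few lines of base R. This is only a sketch under stated assumptions, not the package's implementation: the three patients and their incubation periods are hypothetical toy values, and the names incubation_days, weights and relative_frequency are chosen here for illustration.

```r
# Hypothetical toy data: incubation periods (in days) implied by each
# patient's possible exposure dates, i.e. onset date minus exposure date.
incubation_days <- list(
  patient_1 = c(2, 3),        # n = 2 possible exposure dates
  patient_2 = 4,              # n = 1
  patient_3 = c(3, 4, 5, 6)   # n = 4
)

N <- length(incubation_days)

# Each patient contributes a total weight of 1/N, split evenly over
# that patient's n candidate dates: 1 / (n * N) per date.
weights <- unlist(lapply(incubation_days, function(d) {
  rep(1 / (length(d) * N), length(d))
}))
days <- unlist(incubation_days)

# Sum the weights for each distinct incubation period.
relative_frequency <- tapply(weights, days, sum)
print(relative_frequency)
```

Because every patient's weights add up to 1/N, the relative frequencies sum to one regardless of how many exposure dates each patient has.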

The function fit_gamma_incubation_dist() takes the same inputs, but samples directly from the empirical distribution and fits a discrete gamma distribution to that sample by means of fit_disc_gamma().

Example

Load environment:

library(magrittr)
library(epitrix)
library(distcrete)
library(ggplot2)

Make a linelist object containing toy data with several possible exposure dates for each case:

ll <- sim_linelist(15)

x <- 0:15
y <- distcrete("gamma", 1, shape = 12, rate = 3, w = 0)$d(x)
mkexposures <- function(i) {
  i - sample(x, size = sample.int(5, size = 1), replace = FALSE, prob = y)
}
exposures <- sapply(ll$date_of_onset, mkexposures)
ll$dates_exposure <- exposures

print(ll)
#>    id date_of_onset date_of_report gender  outcome
#> 1   1    2020-01-23     2020-02-01   male recovery
#> 2   2    2020-02-14     2020-02-18   male    death
#> 3   3    2020-01-25     2020-01-29 female recovery
#> 4   4    2020-01-16     2020-01-30   male recovery
#> 5   5    2020-01-22     2020-01-28   male    death
#> 6   6    2020-01-26     2020-01-31   male recovery
#> 7   7    2020-02-09     2020-02-16 female recovery
#> 8   8    2020-02-17     2020-02-24 female recovery
#> 9   9    2020-01-14     2020-01-20   male recovery
#> 10 10    2020-02-22     2020-03-12   male recovery
#> 11 11    2020-02-26     2020-03-04   male recovery
#> 12 12    2020-01-06     2020-01-10   male recovery
#> 13 13    2020-02-23     2020-02-29 female recovery
#> 14 14    2020-01-08     2020-01-16 female recovery
#> 15 15    2020-01-21     2020-01-26   male recovery
#>                       dates_exposure
#> 1                       18281, 18280
#> 2                       18303, 18305
#> 3                              18282
#> 4  18274, 18273, 18275, 18272, 18271
#> 5                              18279
#> 6                              18281
#> 7                       18297, 18298
#> 8         18306, 18304, 18305, 18307
#> 9                       18270, 18272
#> 10 18308, 18311, 18310, 18313, 18312
#> 11 18315, 18316, 18314, 18317, 18313
#> 12        18264, 18263, 18265, 18262
#> 13        18313, 18312, 18310, 18309
#> 14               18264, 18265, 18266
#> 15        18279, 18277, 18280, 18278
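
The dates_exposure column is printed as plain numbers, which correspond to R's internal day count since 1970-01-01. A minimal base R sketch for reading them as calendar dates, using the values printed for patient 1 (the names exposure_numeric, exposure_dates and onset are introduced here for illustration):

```r
# Values copied from the printed dates_exposure column for patient 1.
exposure_numeric <- c(18281, 18280)

# Interpret the numbers as days since 1970-01-01.
exposure_dates <- as.Date(exposure_numeric, origin = "1970-01-01")
print(exposure_dates)

# Incubation periods implied by patient 1's onset date (2020-01-23).
onset <- as.Date("2020-01-23")
print(as.numeric(onset - exposure_dates))
```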

Empirical distribution:

incubation_period_dist <- empirical_incubation_dist(ll, date_of_onset, dates_exposure)
print(incubation_period_dist)
#> # A tibble: 7 × 2
#>   incubation_period relative_frequency
#>               <dbl>              <dbl>
#> 1                 0              0    
#> 2                 1              0.06 
#> 3                 2              0.107
#> 4                 3              0.262
#> 5                 4              0.312
#> 6                 5              0.149
#> 7                 6              0.11

ggplot(incubation_period_dist, aes(incubation_period, relative_frequency)) +
  geom_col()
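
As a quick sanity check, the relative frequencies should sum to one, and their weighted mean gives the mean of the empirical distribution. The sketch below re-enters the rounded values from the printed table by hand, so the results are only accurate to about three decimal places; the data frame name printed_dist is an assumption made for this example.

```r
# Rounded values copied from the printed tibble above.
printed_dist <- data.frame(
  incubation_period  = 0:6,
  relative_frequency = c(0, 0.06, 0.107, 0.262, 0.312, 0.149, 0.11)
)

# The weights of a probability distribution should add up to 1.
total_weight <- sum(printed_dist$relative_frequency)

# Mean incubation period of the empirical distribution, in days.
empirical_mean <- sum(printed_dist$incubation_period *
                        printed_dist$relative_frequency)

print(c(total_weight = total_weight, empirical_mean = empirical_mean))
```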

Fit discrete gamma:

fit <- fit_gamma_incubation_dist(ll, date_of_onset, dates_exposure)
print(fit)
#> $mu
#> [1] 4.229868
#> 
#> $cv
#> [1] 0.32265
#> 
#> $sd
#> [1] 1.364767
#> 
#> $ll
#> [1] -1729.577
#> 
#> $converged
#> [1] TRUE
#> 
#> $distribution
#> A discrete distribution
#>   name: gamma
#>   parameters:
#>     shape: 9.60586714704713
#>     scale: 0.440342153837883
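
The reported mu, cv and sd are tied to the shape and scale parameters by the standard gamma identities mean = shape * scale and cv = 1 / sqrt(shape). A small base R sketch that recomputes the printed parameters from mu and cv, with the values copied from the output above (the variable names are chosen here for illustration):

```r
# Values copied from the printed fit above.
mu <- 4.229868
cv <- 0.32265

# Gamma identities: shape = 1 / cv^2, scale = mu * cv^2, sd = mu * cv.
shape    <- 1 / cv^2
scale    <- mu * cv^2
sd_value <- mu * cv

print(c(shape = shape, scale = scale, sd = sd_value))
```

The results agree with the printed shape, scale and sd up to the rounding of mu and cv.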

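# Evaluate the fitted probability mass function on days 0 to 10
x <- 0:10
y <- fit$distribution$d(x)

# Overlay the fitted distribution (red) on the empirical one (bars)
ggplot(data.frame(x = x, y = y), aes(x, y)) +
  geom_col(data = incubation_period_dist, aes(incubation_period, relative_frequency)) +
  geom_point(stat = "identity", col = "red", size = 3) +
  geom_line(stat = "identity", col = "red")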

Note that if the possible exposure dates are consecutive for all patients, then empirical_incubation_dist() and fit_gamma_incubation_dist() can take date ranges as inputs instead of lists of individual exposure dates (see the functions' help pages for details).
