Imperfect serological test

The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

Imperfect test

Function correct_prevalence() is used for estimating the true prevalence if the serological test used is imperfect

Arguments:

data the input data frame, must either have:
- age, pos, tot columns (for aggregated data)
- OR age, status columns for (linelisting data)
bayesian whether to adjust sero-prevalence using the Bayesian or frequentist approach. If set to TRUE, true sero-prevalence is estimated using MCMC.
init_se sensitivity of the serological test (default value 0.95)
init_sp specificity of the serological test (default value 0.8)
study_size_se (applicable when bayesian=TRUE) sample size for sensitivity validation study (default value 1000)
study_size_sp (applicable when bayesian=TRUE) sample size for specificity validation study (default value 1000)
chains (applicable when bayesian=TRUE) number of Markov chains (default to 1)
warmup (applicable when bayesian=TRUE) number of warm up runs (default value 1000)
iter (applicable when bayesian=TRUE) number of iterations (default value 2000)

The function will return a list of 2 items:

info
- if bayesian = TRUE contains estimated values for se, sp and corrected seroprevalence
- else return the formula for computing corrected seroprevalence
corrected_sero return a data.frame with age, sero (corrected sero) and pos, tot (adjusted based on corrected prevalence)

# ---- estimate real prevalence using Bayesian approach ----
data <- rubella_uk_1986_1987
output <- correct_prevalence(data, warmup = 1000, iter = 4000, init_se=0.9, init_sp = 0.8, study_size_se=1000, study_size_sp=3000)
#> 
#> SAMPLING FOR MODEL 'prevalence_correction' NOW (CHAIN 1).
#> Chain 1: 
#> Chain 1: Gradient evaluation took 3.7e-05 seconds
#> Chain 1: 1000 transitions using 10 leapfrog steps per transition would take 0.37 seconds.
#> Chain 1: Adjust your expectations accordingly!
#> Chain 1: 
#> Chain 1: 
#> Chain 1: Iteration:    1 / 4000 [  0%]  (Warmup)
#> Chain 1: Iteration:  400 / 4000 [ 10%]  (Warmup)
#> Chain 1: Iteration:  800 / 4000 [ 20%]  (Warmup)
#> Chain 1: Iteration: 1001 / 4000 [ 25%]  (Sampling)
#> Chain 1: Iteration: 1400 / 4000 [ 35%]  (Sampling)
#> Chain 1: Iteration: 1800 / 4000 [ 45%]  (Sampling)
#> Chain 1: Iteration: 2200 / 4000 [ 55%]  (Sampling)
#> Chain 1: Iteration: 2600 / 4000 [ 65%]  (Sampling)
#> Chain 1: Iteration: 3000 / 4000 [ 75%]  (Sampling)
#> Chain 1: Iteration: 3400 / 4000 [ 85%]  (Sampling)
#> Chain 1: Iteration: 3800 / 4000 [ 95%]  (Sampling)
#> Chain 1: Iteration: 4000 / 4000 [100%]  (Sampling)
#> Chain 1: 
#> Chain 1:  Elapsed Time: 0.174 seconds (Warm-up)
#> Chain 1:                0.346 seconds (Sampling)
#> Chain 1:                0.52 seconds (Total)
#> Chain 1:

# check fitted value 
output$info[1:2, ]
#>             mean      se_mean          sd      2.5%       25%       50%
#> est_se 0.9276088 0.0001003224 0.005998519 0.9157822 0.9236570 0.9276923
#> est_sp 0.8027202 0.0001009184 0.006876397 0.7887838 0.7981318 0.8028880
#>              75%     97.5%    n_eff      Rhat
#> est_se 0.9316999 0.9391490 3575.135 0.9997911
#> est_sp 0.8074489 0.8156825 4642.808 0.9997088

# ---- estimate real prevalence using frequentist approach ----
freq_output <- correct_prevalence(data, bayesian = FALSE, init_se=0.9, init_sp = 0.8)

# check info
freq_output$info
#> [1] "Formula: real_sero = (apparent_sero + sp - 1) / (se + sp -1)"

# compare original prevalence and corrected prevalence
ggplot()+
  geom_point(aes(x = data$age, y = data$pos/data$tot, color="apparent prevalence")) + 
  geom_point(aes(x = output$corrected_se$age, y = output$corrected_se$sero, color="estimated prevalence (bayesian)" )) +
  geom_point(aes(x = freq_output$corrected_se$age, y = freq_output$corrected_se$sero, color="estimated prevalence (frequentist)" )) +
  scale_color_manual(
    values = c(
      "apparent prevalence" = "red", 
      "estimated prevalence (bayesian)" = "blueviolet",
      "estimated prevalence (frequentist)" = "royalblue")
  )+ 
  labs(x = "Age", y = "Prevalence")

Fitting corrected data

Data after seroprevalence correction

Bayesian approach

suppressWarnings(
  corrected_data <- farrington_model(
  output$corrected_se,
  start=list(alpha=0.07,beta=0.1,gamma=0.03))
)

plot(corrected_data)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.

Frequentist approach

suppressWarnings(
  corrected_data <- farrington_model(
  freq_output$corrected_se,
  start=list(alpha=0.07,beta=0.1,gamma=0.03))
)

plot(corrected_data)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.

Original data

suppressWarnings(
  original_data <- farrington_model(
  data,
  start=list(alpha=0.07,beta=0.1,gamma=0.03))
)
plot(original_data)
#> Warning: No shared levels found between `names(values)` of the manual scale and the
#> data's fill values.

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.