The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
This vignette demonstrates the AMISforInfectiousDiseases
package through a case study considering Ascariasis in Ethiopia in a
single time point.
Ascariasis is one of the most common helminth infections in people worldwide. This disease is caused by Ascaris lumbricoides, also known as roundworms, which are an intestinal nematode. Ascaris surveys for Ethiopia are available upon request from the Global Atlas of Helminth Infection developed by the London Applied & Spatial Epidemiology Research Group (LASER) (Rachel L et al. 2014). Retkute et al. (2021) used these surveys to obtain national level prevalence maps and made them available as a supplementary material. We will use these maps, which consist of \(M=100\) samples of a prevalence map for Ascaris in Ethiopia with 11,369 locations at 10km \(\times\) 10km resolution.
In this example, we will use a subset of 1,000 locations and convert the percentage prevalence values to their decimals:
ascariasis_url <- "https://projecteuclid.org/journals/supplementalcontent/10.1214/21-AOAS1486/aoas1486suppb.zip"
tempFile <- tempfile()
download.file(ascariasis_url, tempFile)
tempFolder <- tempdir()
unzip(tempFile, exdir=tempFolder)
unlink(tempFile)
load(paste0(tempFolder, "/AMISEpi-main/data/prev.rda"))
L <- 1000 # taking a subset of L locations
wh <- sample(1:nrow(prev), L)
ascaris_map <- prev[wh, ]/100
Following Anderson and May (1991), we assume that the number of worms per host, \(X\), has negative binomial distribution NB(\(W\),\(k\)), where \(W\) is the mean number of worms and \(k\) the dispersion parameter describing the degree of clumping of parasites. The probability mass function is given by \[\begin{equation} \text{Pr}(X=x) = \frac{\Gamma(x+k)}{\Gamma(k)x!} \left(\frac{k}{k+W}\right)^k \left(\frac{W}{k+W}\right)^x, \quad x=0,1,2,\dots. \end{equation}\]
Therefore, the prevalence is given by the probability of observing worms:
\[\begin{align} P &= 1 - \text{Pr}\big(X=0 \ \vert \ W, k \big)\\ &= 1 - \left( 1 + \frac{W}{k} \right)^{-k}. \end{align}\]
By sampling \(W\) in log-scale, the
disease model passed to function amis()
can be written as
follows:
The priors from Retkute et al. (2021) for the model parameters are borrowed in this example. These priors reflect what is currently known about the interaction between the worm burden and clumping coefficient based on the literature. In summary, a uniform prior is placed on the log worm burden and the relationship between the clumping coefficient and worm burden is estimated through a linear regression. Formally this is \[\begin{align} \log(W) &\sim U\big(\log(0.01), \log(60)\big), \\ k | W &\sim N\left(\hat{\alpha}+ \hat{\beta} W, \sigma(W)^2\right), \\ \end{align}\] where \(\hat{\alpha}\) and \(\hat{\beta}\) are the intercept and slope coefficients respectively for the regression line of worm burden against the clumping coefficient, and \(\sigma(W)\) is five times the standard error from this analysis. Refer to Retkute et al. (2021) for further details on the prior selection.
The prior object passed to amis()
is given by
fit.v <- c(0.33371009, 0.01719462)
fit.inflation.factor <- 5
fit.hess<-matrix(0, nrow = 2, ncol = 2)
fit.hess[[1,1]] <- 5138.97
fit.hess[[1,2]] <- 49499.4
fit.hess[[2,1]] <- 49499.4
fit.hess[[2,2]] <- 677831
rprior <- function(n) {
params <- matrix(NA,n,2)
colnames(params) <- c("logW","k")
params[,1] <- runif(n,log(0.01),log(60))
params[,2] <- sapply(1:n, function(i) {
rnorm(1, mean = fit.v[1] + fit.v[2]*exp(params[i,1]),
sd = fit.inflation.factor*sqrt(c(1,exp(params[i,1]))%*%solve(fit.hess,c(1,exp(params[i,1])))))
})
return(params)
}
dprior <- function(x, log=FALSE) {
if (log) {
dlogW <- dunif(x[1],log(0.01),log(60),log=TRUE)
dk <- dnorm(x[2], mean = fit.v[1] + fit.v[2]*exp(x[1]),
sd = fit.inflation.factor*sqrt(c(1,exp(x[1]))%*%solve(fit.hess,c(1,exp(x[1])))),log=TRUE)
return(sum(dlogW,dk))
} else {
dlogW <- dunif(x[1],log(0.01),log(60))
dk <- dnorm(x[2], mean = fit.v[1] + fit.v[2]*exp(x[1]),
sd = fit.inflation.factor*sqrt(c(1,exp(x[1]))%*%solve(fit.hess,c(1,exp(x[1])))))
return(prod(dlogW,dk))
}
}
ascaris_prior <- list(rprior=rprior, dprior=dprior)
The list of default control parameters can be seen in
default_amis_params()
. This example uses the Gaussian
kernel in the nonparametric estimation of the probability density \(f\). To use the Gaussian kernel, one needs
to define sigma
in the object amis_params
. In
addition, we will will set model parameter boundaries to ensure that
\(-\infty < \log(W) < \infty\)
and \(k>0\).
amis_params <- default_amis_params()
amis_params$boundaries_param <- matrix(c(-Inf, Inf, 0, Inf), 2, 2, byrow=TRUE)
amis_params$sigma <- 0.0025
Note that instead of defining boundaries through the object
amis_params
, the user could alternatively set the
prevalence to NA
directly in the model when \(k\) is negative and thus out of bounds. The
boundaries for prevalence values, which are in \([0,1]\), are taken into account by default
(through amis_params$boundaries
).
Now we have all the objects we need to run the AMIS algorithm:
Given the output returned by amis()
, we can use the
function print()
to see the data dimensions, model choices,
and control parameters used to run the AMIS algorithm. The function
summary()
provides some brief information about the number
of samples and effective sample sizes achieved by the algorithm.
print(out)
#> Data dimensions:
#> - Number of time points: 1
#> - Number of locations: 1000
#> - Number of map samples in each location: 100
#> -------------------------------------------------------------
#> Model and algorithm specifications:
#> - For the nonparametric estimation of the density of the likelihood:
#> Gaussian kernel was used with smoothing parameter sigma = 0.0025
#> - Lower and upper boundaries for prevalences: 0, 1
#> - Number of new samples drawn within each AMIS iteration: 500
#> - Maximum number of iterations: 12
#> - Target effective sample size: 500
summary(out)
#> Fitted model:
#> - Number of iterations: 6
#> - Total number of simulated samples: 3000
#> - Target effective sample size: 500
#> Number of locations whose ESS exceeded the target ESS: 1000
#> Number of locations whose ESS was lower the target ESS: 0
It is possible to look at the parameter samples proposed at a given iteration (by default, the last iteration) and the corresponding proposal density:
par(mfrow=c(1,2))
plot_mixture_components(out, what = "uncertainty", cex=3)
plot_mixture_components(out, what = "density", nlevels=200)
For two arbitrary locations, we show below the posterior distribution of weighted samples for the model parameters and corresponding simulated prevalences.
locs <- sample(1:L, 2)
par(mfrow=c(3,2))
plot(out, what = "logW", type="hist", locations=locs, breaks = 50)
plot(out, what = "k", type="hist", locations=locs, breaks = 50)
plot(out, what = "prev", type="hist", locations=locs, time=1, breaks = 50)
We can also produce plots of credible intervals for these statistics for all locations as follows:
par(mfrow=c(1,3))
plot(out, what=c("logW","k","prev"), type="CI", locations=1:L,
cex=0.1, lwd=0.1, measure_central="median", order_locations_by="prev")
To obtain the numerical values of summary statistics for specific locations, we can run
calculate_summaries(out, what="prev", time=1, locations=locs, alpha=0.05)
#> $mean
#> iu345 iu364
#> 0.1421 0.2362
#>
#> $median
#> iu345 iu364
#> 0.1288 0.2321
#>
#> $quantiles
#> iu345 iu364
#> 2.5% 0.01171 0.0312
#> 97.5% 0.29781 0.4146
#>
#> $exceedance_probability
#> iu345 iu364
#> 0.01447 0.07241
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.