The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
hhdynamics fits a Bayesian household transmission model using MCMC to estimate:
The model accounts for tertiary transmission (household contacts infecting other household contacts) and uses a serial interval distribution to link infection times.
Input data must be in long format (one row per individual) with these required columns:
| Column | Description |
|---|---|
hhID |
Household identifier |
member |
Member index within household (0 = index case, 1, 2, … = contacts) |
size |
Number of individuals in the household |
end |
End date of follow-up for that individual |
inf |
Infection status (1 = infected, 0 = not infected). Index cases must be infected. |
onset |
Onset time of symptoms (use -1 or NA for uninfected) |
Additional columns can be included as covariates for infectivity or susceptibility.
library(hhdynamics)
data("inputdata")
str(inputdata)
#> 'data.frame': 1533 obs. of 8 variables:
#> $ hhID : int 1 1 1 100 100 100 100 10005 10005 10005 ...
#> $ member: int 0 1 2 0 1 2 3 0 1 2 ...
#> $ size : int 3 3 3 4 4 4 4 3 3 3 ...
#> $ end : int 15 15 15 66 66 66 66 922 922 922 ...
#> $ inf : int 1 0 0 1 0 0 0 1 0 0 ...
#> $ onset : int 8 NA NA 57 NA NA NA 913 NA NA ...
#> $ age : Factor w/ 3 levels "0","1","2": 1 1 3 2 1 1 3 1 3 2 ...
#> $ sex : Factor w/ 2 levels "0","1": 2 2 2 1 2 1 1 1 2 2 ...
head(inputdata)
#> hhID member size end inf onset age sex
#> 1 1 0 3 15 1 8 0 1
#> 2 1 1 3 15 0 NA 0 1
#> 3 1 2 3 15 0 NA 2 1
#> 4 100 0 4 66 1 57 1 0
#> 5 100 1 4 66 0 NA 0 1
#> 6 100 2 4 66 0 NA 0 0The model uses a discrete serial interval distribution (probability mass function of length 14, one value per day). By default, the bundled influenza serial interval from Tsang et al. (2014) is used:
data("SI")
print(SI)
#> [1] 5.872190e-02 3.155084e-01 4.863670e-01 1.369000e-01 2.502287e-03
#> [6] 3.537313e-07 1.257039e-14 0.000000e+00 0.000000e+00 0.000000e+00
#> [11] 0.000000e+00 0.000000e+00 0.000000e+00 0.000000e+00
barplot(SI, names.arg = 1:14, xlab = "Days", ylab = "Probability",
main = "Default serial interval distribution (influenza)")You can provide your own SI vector, or let the model estimate it from data (see below).
Use household_dynamics() to fit the model. Specify
covariates using R formulas:
# With covariates (uses default influenza SI)
fit <- household_dynamics(inputdata, inf_factor = ~sex, sus_factor = ~age,
n_iteration = 50000, burnin = 10000, thinning = 10)
# Without covariates
fit <- household_dynamics(inputdata,
n_iteration = 50000, burnin = 10000, thinning = 10)
# With a custom serial interval
my_SI <- c(0, 0.01, 0.05, 0.15, 0.25, 0.25, 0.15, 0.08, 0.04, 0.015, 0.005, 0, 0, 0)
fit <- household_dynamics(inputdata, SI = my_SI,
n_iteration = 50000, burnin = 10000, thinning = 10)The function returns an hhdynamics_fit object containing
the full MCMC output.
print(fit)
# Household transmission model fit
# Data: 386 households, 1533 individuals
# MCMC: 4000 post-burnin samples (50000 iterations, burnin: 10000, thin: 10)
# Runtime: 85 seconds
# Parameters: community, household, sex1.0, age1.0, age2.0
#
# Use summary() for estimates, coef() for posterior means.summary() returns a table with posterior means, 95%
credible intervals, and exponentiated estimates for covariate
effects:
summary(fit)
# Variable Point estimate Lower bound Upper bound ...
# Daily probability of infection from community 0.004 0.002 0.007
# Probability of person-to-person transmission in households 0.057 0.034 0.084
# sex1.0 -0.081 -0.733 0.488
# age1.0 -0.065 -0.537 0.412
# age2.0 -0.312 -0.831 0.170Community and household parameters are reported on the probability
scale (via 1 - exp(-x) transform). Covariate effects are on
the log scale, with exponentiated values also shown.
The fit object stores all MCMC output for custom diagnostics and post-processing:
# Posterior samples matrix (post-burnin, thinned)
dim(fit$samples) # e.g. 4000 x 7
# Log-likelihood trace (full chain, for convergence checks)
dim(fit$log_likelihood) # e.g. 50000 x 3
# Per-parameter acceptance rates
fit$acceptance
# Trace plot for community parameter
plot(fit$samples[, "community"], type = "l",
ylab = "Community rate", xlab = "Iteration")
# Posterior density
hist(fit$samples[, "household"], breaks = 50, probability = TRUE,
main = "Household transmission probability (raw scale)",
xlab = "Rate parameter")hhdynamics provides several plotting functions for diagnostics and publication-ready figures.
# Forest plot of covariate effects (relative risks)
plot_covariates(fit)
# With custom labels for variable headers and factor levels
plot_covariates(fit, file = "covariates.pdf",
labels = list(
sex = list(name = "Sex", levels = c("Male", "Female")),
age = list(name = "Age Group", levels = c("0-5", "6-17", "18+"))))The file parameter auto-sizes the PDF dimensions based
on the number of covariates.
# Overall attack rate
plot_attack_rate(fit)
# Stratified by a single variable
plot_attack_rate(fit, by = ~age)
# Combine overall + multiple strata in one figure
plot_attack_rate(fit, by = list(~sex, ~age), include_overall = TRUE,
labels = list(
sex = list(name = "Sex", levels = c("Male", "Female")),
age = list(name = "Age Group", levels = c("0-5", "6-17", "18+"))))hhdynamics includes table functions that return data frames suitable for reporting.
household_dynamics() validates inputs before running the
MCMC, catching common errors early:
# Missing column
household_dynamics(inputdata[, -1])
# Error: 'input' is missing required columns: hhID
# Wrong formula variable
household_dynamics(inputdata, inf_factor = ~nonexistent)
# Error: Variables in inf_factor not found in input: nonexistent
# Old string syntax (no longer supported)
household_dynamics(inputdata, inf_factor = "~sex")
# Error: 'inf_factor' must be a formula (e.g. ~sex) or NULL.Infected household contacts with missing (NA) onset
times are automatically imputed during MCMC. The sampler proposes onset
times uniformly within the follow-up window and accepts/rejects via the
full likelihood. Index case onsets must be non-missing.
# Some infected contacts have unknown symptom onset
inputdata_missing_onset <- inputdata
infected_contacts <- which(inputdata$member > 0 & inputdata$inf == 1)
inputdata_missing_onset$onset[infected_contacts[1:5]] <- NA
fit_onset <- household_dynamics(inputdata_missing_onset, ~sex, ~age,
n_iteration = 50000, burnin = 10000, thinning = 10)
# Note: 5 of 92 infected contact(s) (5.4%) have missing onset times. These will be imputed during MCMC.household_dynamics() also imputes missing factor
covariates via Bayesian data augmentation. Simply pass your data with
NA values in factor columns:
# Introduce some missing values
inputdata_missing <- inputdata
inputdata_missing$sex[c(5, 12, 30)] <- NA
inputdata_missing$age[c(8, 20)] <- NA
# Fit as usual — missing values are imputed automatically
fit_missing <- household_dynamics(inputdata_missing, ~sex, ~age,
n_iteration = 50000, burnin = 10000, thinning = 10)
# Note: Covariate 'sex' has 3 missing value(s) (0.2%). These will be imputed during MCMC.
# Note: Covariate 'age' has 2 missing value(s) (0.1%). These will be imputed during MCMC.
summary(fit_missing)Restrictions:
NA will produce an error.~sex*age) are not supported when
covariates have missing data.inf_factor and
sus_factor if it has missing values.If the serial interval is unknown or you want to estimate it jointly
with the transmission parameters, set estimate_SI = TRUE.
The model parameterizes the SI as a Weibull distribution and estimates
the shape and scale parameters via MCMC:
fit_si <- household_dynamics(inputdata, inf_factor = ~sex, sus_factor = ~age,
n_iteration = 50000, burnin = 10000, thinning = 10,
estimate_SI = TRUE)
summary(fit_si)
# Output now includes si_shape and si_scale parameters
# Reconstruct the estimated SI distribution
est_shape <- mean(fit_si$samples[, "si_shape"])
est_scale <- mean(fit_si$samples[, "si_scale"])
est_SI <- pweibull(2:15, est_shape, est_scale) - pweibull(1:14, est_shape, est_scale)
barplot(est_SI, names.arg = 1:14, xlab = "Days", ylab = "Probability",
main = "Estimated serial interval")The Weibull priors are: shape ~ Uniform(0.1, 10), scale ~
Uniform(0.1, 20). When estimate_SI = TRUE, the
SI argument is not used.
simulate_data() generates synthetic datasets for
validation or power analysis:
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.