The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.

bayesmsm for longitudinal data with informative right-censoring

Xiao Yan, Kuan Liu

Introduction

Simulated longitudinal observational data with right-censoring

In this vignette, we demonstrate how to simulate a longitudinal dataset that replicates features of real-world clinical studies with right-censoring using simData(). Here, we generate data for 200 individuals observed over 2 visits. At each visit, two normally distributed covariates (L1_j, L2_j) and a binary treatment assignment (A_j) are created. Right-censoring is induced at each visit via a logistic model (C_j), and once an individual is censored at visit j, all subsequent covariates, treatments, and the end-of-study outcome Y are set to NA. The outcome Y is binary, drawn from a logistic regression on the full covariate and treatment history.

Variable Description
L1_1, L2_1 Baseline covariates (continuous)
L1_2, L2_2 Time-varying covariates at visit 2
A1, A2 Binary treatments at visits 1 and 2
C1, C2 Right-censoring indicators at each visit
Y Binary end-of-study outcome (1 = event)
# 1) Define coefficient lists for 2 visits
amodel <- list(
  # Visit 1: logit P(A1=1) = -0.3 + 0.4*L1_1 - 0.2*L2_1
  c("(Intercept)" = -0.3, "L1_1" = 0.4, "L2_1" = -0.2),
  # Visit 2: logit P(A2=1) = -0.1 + 0.3*L1_2 - 0.1*L2_2 + 0.5*A_prev
  c("(Intercept)" = -0.1, "L1_2" = 0.3, "L2_2" = -0.1, "A_prev" = 0.5)
)

# 2) Define outcome model: logistic on treatments and last covariates
ymodel <- c(
  "(Intercept)" = -0.8,
  "A1"          = 0.2,
  "A2"          = 0.4,
  "L1_2"        = 0.3,
  "L2_2"        = -0.3
)

# 3) Define right-censoring models at each visit
cmodel <- list(
  # Censor at visit 1 based on baseline covariates and A1
  c("(Intercept)" = -1.5, "L1_1" = 0.2, "L2_1" = -0.2, "A" = 0.2),
  # Censor at visit 2 based on visit-2 covariates and A2
  c("(Intercept)" = -1.5, "L1_2" = 0.1, "L2_2" = -0.1, "A" = 0.3)
)

# 4) Load package and simulate data
simdat_cen <- simData(
  n                = 100,
  n_visits         = 2,
  covariate_counts = c(2, 2),
  amodel           = amodel,
  ymodel           = ymodel,
  y_type           = "binary",
  right_censor     = TRUE,
  cmodel           = cmodel,
  seed             = 123
)

# 5) Inspect first rows
head(simdat_cen)
#>          L1_1        L2_1 A1 C1       L1_2        L2_2 A2 C2  Y
#> 1 -0.56047565 -0.71040656  1  0 -0.7152422 -0.07355602  1  1 NA
#> 2 -0.23017749  0.25688371  0  0 -0.7526890 -1.16865142  1  0  1
#> 3  1.55870831 -0.24669188  0  0 -0.9385387 -0.63474826  0  0  0
#> 4  0.07050839 -0.34754260  1  0 -1.0525133 -0.02884155  0  0  1
#> 5  0.12928774 -0.95161857  0  0 -0.4371595  0.67069597  1  1 NA
#> 6  1.71506499 -0.04502772  1  0  0.3311792 -1.65054654  1  0  0

Bayesian treatment effect weight estimation using bayesweight_cen

Next, we use the bayesweight_cen function to estimate the weights with censoring. We specify the treatment and censoring models for each time point, including the relevant covariates.

weights_cen <- bayesweight_cen(
  trtmodel.list = list(
    A1 ~ L1_1 + L2_1,
    A2 ~ L1_2 + L2_2 + A1),
  cenmodel.list = list(
    C1 ~ L1_1 + L2_1 + A1,
    C2 ~ L1_2 + L2_2 + A2),
  data = simdat_cen,
  n.chains = 1,
  n.iter = 200,
  n.burnin = 100,
  n.thin = 1,
  seed = 890123,
  parallel = FALSE)
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 700
#>    Unobserved stochastic nodes: 21
#>    Total graph size: 2495
#> 
#> Initializing model

summary(weights_cen$weights)
#>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
#>  0.6291  1.0823  1.7307  3.1605  2.5831 49.4854      25

Similarly, the function will automatically run MCMC with JAGS based on the specified treatment and censoring model inputs, generating a JAGS model string as part of the function output. The function returns a list containing the updated weights for subject-specific treatment and censoring effects as well as the JAGS model.

cat(weights_cen$model_string)
#> model{
#> 
#> for (i in 1:N1) {
#> 
#> # conditional model;
#> A1[i] ~ dbern(p1[i])
#> logit(p1[i]) <- b10 + b11*L1_1[i] + b12*L2_1[i]
#> C1[i] ~ dbern(cp1[i])
#> logit(cp1[i]) <- s10 + s11*L1_1[i] + s12*L2_1[i] + s13*A1[i]
#> 
#> # marginal model;
#> A1s[i] ~ dbern(p1s[i])
#> logit(p1s[i]) <- bs10
#> C1s[i] ~ dbern(cp1s[i])
#> logit(cp1s[i]) <- ts10
#> }
#> 
#> for (i in 1:N2) {
#> 
#> # conditional model;
#> A2[i] ~ dbern(p2[i])
#> logit(p2[i]) <- b20 + b21*L1_2[i] + b22*L2_2[i] + b23*A1[i]
#> C2[i] ~ dbern(cp2[i])
#> logit(cp2[i]) <- s20 + s21*L1_2[i] + s22*L2_2[i] + s23*A2[i]
#> 
#> # marginal model;
#> A2s[i] ~ dbern(p2s[i])
#> logit(p2s[i]) <- bs20 + bs21*A1s[i]
#> C2s[i] ~ dbern(cp2s[i])
#> logit(cp2s[i]) <- ts20 + ts21*A1s[i]
#> }
#> 
#> # Priors
#> b10 ~ dunif(-10, 10)
#> b11 ~ dunif(-10, 10)
#> b12 ~ dunif(-10, 10)
#> s10 ~ dunif(-10, 10)
#> s11 ~ dunif(-10, 10)
#> s12 ~ dunif(-10, 10)
#> s13 ~ dunif(-10, 10)
#> bs10 ~ dunif(-10, 10)
#> ts10 ~ dunif(-10, 10)
#> b20 ~ dunif(-10, 10)
#> b21 ~ dunif(-10, 10)
#> b22 ~ dunif(-10, 10)
#> b23 ~ dunif(-10, 10)
#> s20 ~ dunif(-10, 10)
#> s21 ~ dunif(-10, 10)
#> s22 ~ dunif(-10, 10)
#> s23 ~ dunif(-10, 10)
#> bs20 ~ dunif(-10, 10)
#> bs21 ~ dunif(-10, 10)
#> ts20 ~ dunif(-10, 10)
#> ts21 ~ dunif(-10, 10)
#> }

Bayesian non-parametric bootstrap to maximize the utility function with respect to the causal effect using bayesmsm

Using the weights estimated by bayesweight_cen, we now fit the Bayesian Marginal Structural Model and estimate the marginal treatment effects using the bayesmsm function as before. We specify the outcome model and other relevant parameters.

# Remove all NAs (censored observations) from the original dataset and weights
simdat_cen$weights <- weights_cen$weights
simdat_cen2 <- na.omit(simdat_cen)

model <- bayesmsm(ymodel = Y ~ A1 + A2,
  nvisit = 2,
  reference = c(rep(0,2)),
  comparator = c(rep(1,2)),
  family = "binomial",
  data = simdat_cen2,
  wmean = simdat_cen2$weights,
  nboot = 50,
  optim_method = "BFGS",
  parallel = TRUE,
  seed = 890123,
  ncore = 2)
str(model)
#> List of 12
#>  $ RD_mean    : num 0.206
#>  $ RR_mean    : num 3.79
#>  $ OR_mean    : num 6.75
#>  $ RD_sd      : num 0.209
#>  $ RR_sd      : num 3.48
#>  $ OR_sd      : num 8.36
#>  $ RD_quantile: Named num [1:2] -0.174 0.565
#>   ..- attr(*, "names")= chr [1:2] "2.5%" "97.5%"
#>  $ RR_quantile: Named num [1:2] 0.547 13.219
#>   ..- attr(*, "names")= chr [1:2] "2.5%" "97.5%"
#>  $ OR_quantile: Named num [1:2] 0.435 26.76
#>   ..- attr(*, "names")= chr [1:2] "2.5%" "97.5%"
#>  $ bootdata   :'data.frame': 50 obs. of  5 variables:
#>   ..$ effect_reference : num [1:50] -1.02 -1.09 -1.24 -1.04 -2.12 ...
#>   ..$ effect_comparator: num [1:50] -1.136 -1.852 -0.389 -1.593 -0.857 ...
#>   ..$ RD               : num [1:50] -0.0223 -0.1158 0.1789 -0.0912 0.1906 ...
#>   ..$ RR               : num [1:50] 0.916 0.54 1.795 0.65 2.778 ...
#>   ..$ OR               : num [1:50] 0.889 0.467 2.334 0.578 3.532 ...
#>  $ reference  : num [1:2] 0 0
#>  $ comparator : num [1:2] 1 1

The bayesmsm function returns a model object containing the following: the mean, standard deviation, and 95% credible interval of the Risk Difference (RD), Risk Ratio (RR), and Odds Ratio (OR). It also includes a data frame containing the bootstrap samples for the reference effect, comparator effect, RD, RR, and OR, as well as the reference and comparator levels chosen by the user.

summary_bayesmsm(model)
#>                  mean        sd       2.5%      97.5%
#> Reference  -1.8573457 0.7393902 -3.3904462 -0.4320041
#> Comparator -0.6155699 0.6582818 -1.7697885  0.5451637
#> RD          0.2063189 0.2092313 -0.1738570  0.5648371
#> RR          3.7886208 3.4781279  0.5468277 13.2191157
#> OR          6.7493770 8.3562757  0.4349579 26.7600428

Visualization functions: plot_ATE, plot_APO, plot_est_box

Similarly, we can use the built-in functions as well as summary_bayesmsm to visualize and summarize the results.

plot_ATE(model)

plot_APO(model, effect_type = "effect_comparator")

plot_APO(model, effect_type = "effect_reference")

plot_est_box(model)

Reference

These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.