
Using the survregVB Package

Overview of survregVB

The survregVB function provides a fast and accessible solution for variational inference in accelerated failure time (AFT) models for right-censored survival times following a log-logistic distribution. It provides an efficient alternative to Markov chain Monte Carlo (MCMC) methods by implementing a mean-field variational Bayes (VB) algorithm for parameter estimation. The VB approach employs a coordinate ascent algorithm and incorporates a piecewise approximation technique when computing expectations to achieve conjugacy (Xian et al., 2024b). Eddelbuettel et al. (2025)

The AFT Model

For the survival time \(T_i\) of the \(i^{th}\) subject in the sample, \(i=1,...,n\), the log-logistic AFT model without shared frailty is specified as:

\[ \log(T_i):=Y_i=X_i^T\beta+bz_i \]

where \(X_i\) is a column vector of \(p-1\) covariates and a constant one (i.e. \(X_i=(1,x_{i1},...,x_{i(p-1)})^T\)), \(\beta\) is a vector of coefficients for the covariates, \(z_i\) is a random variable following a standard logistic distribution, and \(b\) is a scale parameter.
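To make the model concrete, the following base R sketch simulates right-censored data from it and writes down the survival function the model implies. All numeric values, the censoring mechanism, and the helper name `aft_surv` are hypothetical choices made for illustration; none of them come from the package.

```r
# Simulate right-censored data from log(T) = X'beta + b * z (illustrative values only).
set.seed(1)
n    <- 5000
beta <- c(0.5, 0.2, -0.3)   # hypothetical coefficients, intercept first
b    <- 0.8                 # hypothetical scale parameter

X <- cbind(1, rbinom(n, 1, 0.5), rnorm(n))   # constant one plus p - 1 = 2 covariates
z <- rlogis(n)                               # standard logistic errors
event_time  <- exp(drop(X %*% beta) + b * z)
censor_time <- rexp(n, rate = 0.1)           # hypothetical censoring mechanism

time   <- pmin(event_time, censor_time)            # observed time
status <- as.integer(event_time <= censor_time)    # 1 = event observed, 0 = right-censored

# Survival function implied by the model:
# S(t | x) = P(log T > log t) = 1 - plogis((log t - x'beta) / b)
aft_surv <- function(t, x, beta, b) {
  1 - plogis((log(t) - sum(x * beta)) / b)
}

# The median survival time for covariate vector x is exp(x'beta).
aft_surv(exp(0.5), x = c(1, 0, 0), beta = beta, b = b)
```

Because the standard logistic distribution is symmetric about zero, the last call evaluates the survival function at the median and returns 0.5.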

The survregVB function uses a Bayesian framework to obtain the optimal variational densities of the parameters \(\beta\) and \(b\) by maximizing the evidence lower bound (ELBO). To do so, we assume prior distributions on \(\beta\) and \(b\) whose known hyperparameters are \(\mu_0,v_0,\alpha_0\) and \(\omega_0\). At the end of the model fitting process, survregVB obtains the approximated posterior distributions of \(\beta\) and \(b\), whose parameters (\(\mu\) and \(\Sigma\) for \(\beta\), \(\alpha\) and \(\omega\) for \(b\)) are obtained via the VB algorithm (Xian et al., 2024b).
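For reference, the quantity being maximized can be written out explicitly. This is the standard ELBO identity under a mean-field factorization rather than a formula taken from the package; the symbol \(D\) for the observed data and the factorization \(q(\beta,b)=q(\beta)q(b)\) are notation assumed here.

\[ \log p(D)=\text{ELBO}(q)+\text{KL}\big(q(\beta,b)\,\|\,p(\beta,b\mid D)\big) \]

\[ \text{ELBO}(q)=E_q[\log p(D,\beta,b)]-E_q[\log q(\beta,b)] \]

Since \(\log p(D)\) does not depend on \(q\), increasing the ELBO is equivalent to decreasing the Kullback-Leibler divergence between the variational density and the exact posterior.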

The AFT Model With Shared Frailty

The survregVB function can also fit a shared frailty log-logistic AFT regression model that accounts for intra-cluster correlation through a cluster-specific random intercept. For the survival time \(T_{ij}\) of the \(j^{th}\) subject from the \(i^{th}\) cluster, in a sample with \(i=1,...,K\) clusters and \(j=1,...,n_i\) subjects per cluster:

\[ \log(T_{ij})=\gamma_i+X_{ij}^T\beta+b\epsilon_{ij} \]

where \(X_{ij}\) is a column vector of \(p-1\) covariates and a constant one (i.e. \(X_{ij}=(1,x_{ij1},...,x_{ij(p-1)})^T\)), \(\beta\) is a vector of coefficients, \(\gamma_i\) is a random intercept for the \(i^{th}\) cluster, \(\epsilon_{ij}\) is a random variable following a standard logistic distribution, and \(b\) is a scale parameter.
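As an illustration of how the random intercept induces correlation within a cluster, the sketch below simulates uncensored clustered data from this model in base R. The number of clusters, cluster sizes, parameter values, and the data frame name `sim` are all hypothetical, and the random intercepts are drawn from a normal distribution with variance `sigma2_gamma` as an assumption of the sketch.

```r
# Simulate clustered log-logistic AFT data with a shared random intercept.
set.seed(2)
K            <- 15                # number of clusters (hypothetical)
n_i          <- rep(5, K)         # subjects per cluster (hypothetical)
beta         <- c(0.5, 1, 0.5)    # hypothetical coefficients, intercept first
b            <- 0.6               # hypothetical scale parameter
sigma2_gamma <- 1                 # hypothetical frailty variance

cluster <- rep(seq_len(K), times = n_i)
gamma   <- rnorm(K, mean = 0, sd = sqrt(sigma2_gamma))  # one intercept per cluster
N       <- sum(n_i)
X       <- cbind(1, rnorm(N), rbinom(N, 1, 0.5))
eps     <- rlogis(N)                                    # standard logistic errors

# Every subject in cluster i shares gamma[i], which ties their log-times together.
log_time <- gamma[cluster] + drop(X %*% beta) + b * eps

sim <- data.frame(
  time    = exp(log_time),
  cluster = cluster,
  x1      = X[, 2],
  x2      = X[, 3]
)
head(sim)
```

Censoring is left out to keep the sketch short; it could be added in the same way as for the model without frailty, by taking the minimum of the event time and an independent censoring time.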

In addition to the parameters \(\beta\) and \(b\), survregVB obtains the optimal variational densities of the frailty variance \(\sigma^2_\gamma\) and the random intercepts \(\gamma_i\). Alongside the priors on \(\beta\) and \(b\), we assume the prior distributions:

  • \(\gamma_i|\sigma^2_\gamma\mathop{\sim}\limits^{\mathrm{iid}}N(0,\sigma^2_\gamma)\), and

  • \(\sigma^2_\gamma\sim\text{Inverse-Gamma}(\lambda_0,\eta_0)\),

where \(\mu_0,v_0,\alpha_0,\omega_0,\lambda_0\) and \(\eta_0\) are known hyperparameters. At the end of the model fitting process, survregVB obtains, in addition to the approximated posteriors of \(\beta\) and \(b\), the approximated posterior distributions:

  • \(q^*(\gamma_i)\), a \(N(\tau^*_i,\sigma^{2*}_i)\) density function, and

  • \(q^*(\sigma^2_\gamma)\), an \(\text{Inverse-Gamma}(\lambda^*,\eta^*)\) density function,

where the parameters \(\mu,\Sigma,\alpha,\omega,\tau_i, \sigma_i, \lambda\) and \(\eta\) are obtained via the VB algorithm (Xian et al., 2024a).

Getting Started using survregVB

First, we load the survregVB and survival packages.

library(survregVB)
library(survival)

Fitting the Model

For the dnase data set included in the package, our goal is to fit a log-logistic AFT regression model of the form:

\[ \log(T):=Y=\beta_0+\beta_1x_1+\beta_2x_2+bz \]

where trt (\(x_1\), treatment, binary) and fev (\(x_2\), forced expiratory volume, continuous) are the covariates of interest, and the right-censoring indicator is infect (Xian et al., 2024b).

The following fits the model with priors based on previous studies:

fit <- survregVB(
  formula = Surv(time, infect) ~ trt + fev,
  data = dnase,
  alpha_0 = 501,
  omega_0 = 500,
  mu_0 = c(4.4, 0.25, 0.04),
  v_0 = 1,
  max_iteration = 10000,
  threshold = 0.0005,
  na.action = na.omit
)
print(fit)
## Call:
## survregVB(formula = Surv(time, infect) ~ trt + fev, data = dnase, 
##     alpha_0 = 501, omega_0 = 500, mu_0 = c(4.4, 0.25, 0.04), 
##     v_0 = 1, na.action = na.omit, max_iteration = 10000, threshold = 5e-04)
## 
## Posterior distributions of the regression coefficients (Beta):
## mu=
## (Intercept)         trt         fev 
##      4.1124      0.4155      0.0213 
## 
## Sigma=
##             (Intercept)   trt     fev
## (Intercept)    0.036204 -0.01 < 2e-16
## trt           -0.010274  0.02 2.2e-05
## fev           -0.000473  0.00 8.3e-06
## 
## Posterior distribution of the scale parameter (b):
## alpha=  744   omega=  674.648 
## 
## ELBO=  -4857.094 
## 
## Number of iterations=  9 
## 
## n= 645
summary(fit)
## Call:
## survregVB(formula = Surv(time, infect) ~ trt + fev, data = dnase, 
##     alpha_0 = 501, omega_0 = 500, mu_0 = c(4.4, 0.25, 0.04), 
##     v_0 = 1, na.action = na.omit, max_iteration = 10000, threshold = 5e-04)
##             Value    SD 95% CI Lower 95% CI Upper
## (Intercept) 4.112 0.190        3.739        4.485
## trt         0.415 0.141        0.139        0.692
## fev         0.021 0.003        0.016        0.027
## scale       0.908 0.033        0.844        0.974
## 
## ELBO=  -4857.094 
## 
## Number of iterations=  9 
## 
## n= 645
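The summary rows can be related back to the printed posterior parameters. The sketch below uses only numbers copied from the output above and assumes that the posterior of each coefficient is normal with mean `mu` and variance given by the diagonal of `Sigma`, and that the posterior of the scale parameter is Inverse-Gamma with shape `alpha` and scale `omega`. Under those assumptions the results agree with the (Intercept) and scale rows of the summary to three decimals; this is a consistency check, not a description of how the package computes its summary.

```r
# Values copied from the print(fit) output above.
mu_intercept    <- 4.1124
Sigma_intercept <- 0.036204   # Sigma[1, 1]
mu_trt          <- 0.4155
alpha           <- 744
omega           <- 674.648

# Normal posterior for a coefficient: SD and equal-tailed 95% interval.
sd_intercept <- sqrt(Sigma_intercept)
ci_intercept <- mu_intercept + qnorm(c(0.025, 0.975)) * sd_intercept
round(c(sd = sd_intercept, lower = ci_intercept[1], upper = ci_intercept[2]), 3)

# Inverse-Gamma(alpha, omega) posterior for the scale b: mean and SD.
scale_mean <- omega / (alpha - 1)
scale_sd   <- scale_mean / sqrt(alpha - 2)
round(c(mean = scale_mean, sd = scale_sd), 3)

# In an AFT model, exp(coefficient) is a time ratio: the factor by which
# survival times are multiplied when the covariate increases by one unit.
time_ratio_trt <- exp(mu_trt)
time_ratio_trt
```

The time ratio for trt is about 1.5, so on this fit the treatment is associated with survival times roughly one and a half times longer, holding fev fixed.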

Fitting a Model with Shared Frailty

We will fit a log-logistic AFT regression model with shared frailty to the simulation_frailty data set included in the package. For the \(j^{th}\) subject in the \(i^{th}\) cluster, \(i=1,...,K\) and \(j=1,...,n_i\):

\[ \log(T_{ij}):=Y_{ij}=0.5+\beta_1x_{1ij}+\beta_2x_{2ij}+\gamma_i+b\epsilon_{ij} \]

The following fits the model with non-informative priors (Xian et al., 2024a):

fit_frailty <- survregVB(
  formula = Surv(Time.15, delta.15) ~ x1 + x2,
  data = simulation_frailty,
  alpha_0 = 3,
  omega_0 = 2,
  mu_0 = c(0, 0, 0),
  v_0 = 0.1,
  lambda_0 = 3,
  eta_0 = 2,
  cluster = cluster,
  max_iteration = 100,
  threshold = 0.01
)
print(fit_frailty)
## Call:
## survregVB(formula = Surv(Time.15, delta.15) ~ x1 + x2, data = simulation_frailty, 
##     alpha_0 = 3, omega_0 = 2, mu_0 = c(0, 0, 0), v_0 = 0.1, lambda_0 = 3, 
##     eta_0 = 2, cluster = cluster, max_iteration = 100, threshold = 0.01)
## 
## Posterior distributions of the regression coefficients (Beta):
## mu=
## (Intercept)          x1          x2 
##      -0.392       0.897       0.547 
## 
## Sigma=
##             (Intercept)    x1     x2
## (Intercept)      0.4765 -0.43 <2e-16
## x1              -0.4325  0.42 0.0062
## x2              -0.0398  0.01 0.0612
## 
## Posterior distribution of the scale parameter (b):
## alpha=  68   omega=  44.52189 
## 
## Posterior distribution of the random intercept (sigma_gamma squared):
## lambda=  10.5   eta=  10.25464 
## 
## Posterior distributions of the random effects for each cluster (gamma):
## tau=
##      1      2      3      4      5      6      7      8      9     10     11 
## -0.202  0.668  0.453  0.130  1.659  0.307  0.550  1.973 -1.121 -0.762 -0.519 
##     12     13     14     15 
## -0.662 -1.596 -0.812  0.393 
## 
## sigma=
##     1     2     3     4     5     6     7     8     9    10    11    12    13 
## 0.217 0.170 0.156 0.181 0.205 0.181 0.201 0.209 0.181 0.156 0.156 0.276 0.156 
##    14    15 
## 0.184 0.221 
## 
## ELBO=  -260 
## 
## Number of iterations=  12 
## 
## n= 75
summary(fit_frailty)
## Call:
## survregVB(formula = Surv(Time.15, delta.15) ~ x1 + x2, data = simulation_frailty, 
##     alpha_0 = 3, omega_0 = 2, mu_0 = c(0, 0, 0), v_0 = 0.1, lambda_0 = 3, 
##     eta_0 = 2, cluster = cluster, max_iteration = 100, threshold = 0.01)
##              Value     SD 95% CI Lower 95% CI Upper
## (Intercept) -0.392  0.690       -1.745        0.961
## x1           0.897  0.650       -0.376        2.171
## x2           0.547  0.247        0.062        1.032
## scale        0.665  0.082        0.516        0.831
## frailty      1.079  0.376        0.501        1.804
## 
## ELBO=  -260 
## 
## Number of iterations=  12 
## 
## n= 75
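The frailty output can be read in a similar way. The sketch below copies `lambda`, `eta` and `tau` from the output above; it assumes the Inverse-Gamma posterior of \(\sigma^2_\gamma\) is parameterized by shape `lambda` and scale `eta`, under which its mean matches the frailty row of the summary. The exponentiated random intercepts are shown as cluster-level time ratios, which is an interpretation added here rather than something the package reports.

```r
# Values copied from the print(fit_frailty) output above.
lambda <- 10.5
eta    <- 10.25464
tau <- c(-0.202, 0.668, 0.453, 0.130, 1.659, 0.307, 0.550, 1.973,
         -1.121, -0.762, -0.519, -0.662, -1.596, -0.812, 0.393)
names(tau) <- seq_along(tau)   # cluster labels 1..15

# Posterior mean of the frailty variance under Inverse-Gamma(lambda, eta).
frailty_mean <- eta / (lambda - 1)
round(frailty_mean, 3)

# Clusters with the largest and smallest posterior mean random intercept.
longest  <- names(tau)[which.max(tau)]
shortest <- names(tau)[which.min(tau)]
c(longest = longest, shortest = shortest)

# exp(tau) is the factor by which a cluster's survival times are scaled
# relative to a cluster whose random intercept is zero.
round(exp(tau[c(longest, shortest)]), 2)
```

Cluster 8 has the largest posterior mean intercept and cluster 13 the smallest, so these are the clusters with the longest and shortest survival times after adjusting for x1 and x2.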

Session info

The following packages and versions were used in the production of this vignette.

## R version 4.4.2 (2024-10-31)
## Platform: aarch64-apple-darwin20
## Running under: macOS Big Sur 11.6
## 
## Matrix products: default
## BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
## LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0
## 
## locale:
## [1] C/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
## 
## time zone: America/Toronto
## tzcode source: internal
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
## [1] survival_3.8-3  survregVB_0.0.1 knitr_1.49     
## 
## loaded via a namespace (and not attached):
##  [1] digest_0.6.37     R6_2.5.1          invgamma_1.1      fastmap_1.2.0    
##  [5] Matrix_1.7-2      xfun_0.50         lattice_0.22-6    splines_4.4.2    
##  [9] cachem_1.1.0      htmltools_0.5.8.1 rmarkdown_2.29    lifecycle_1.0.4  
## [13] cli_3.6.3         bayestestR_0.15.2 grid_4.4.2        sass_0.4.9       
## [17] jquerylib_0.1.4   compiler_4.4.2    rstudioapi_0.17.1 tools_4.4.2      
## [21] evaluate_1.0.3    bslib_0.9.0       yaml_2.3.10       rlang_1.1.5      
## [25] jsonlite_1.8.9    insight_1.0.2

References

Eddelbuettel, D., Francois, R., Allaire, J., Ushey, K., Kou, Q., Russell, N., Ucar, I., Bates, D., & Chambers, J. (2025). Rcpp: Seamless R and C++ integration. https://CRAN.R-project.org/package=Rcpp
Xian, C., Souza, C. P. E. de, He, W., Rodrigues, F. F., & Tian, R. (2024a). Fast variational Bayesian inference for correlated survival data: An application to invasive mechanical ventilation duration analysis. https://doi.org/10.48550/ARXIV.2408.00177
Xian, C., Souza, C. P. E. de, He, W., Rodrigues, F. F., & Tian, R. (2024b). Variational Bayesian analysis of survival data using a log-logistic accelerated failure time model. Statistics and Computing, 34(2). https://doi.org/10.1007/s11222-023-10365-6
