
G-Computation or standardization for the Cox, Fine-Gray and binomial regression models for survival data

Klaus Holst & Thomas Scheike

2025-04-23

G-computation for the Cox and Fine-Gray models

The standardized estimate (G-estimator) based on the Cox or Fine-Gray model is computed as \[ \hat S(t,A=a) = n^{-1} \sum_i S(t,A=a,X_i), \] and this estimator has influence function \[ S(t,A=a,X_i) - S(t,A=a) + E( D_{A_0(t), \beta} S(t,A=a,X_i) ) \epsilon_i(t), \] where \(\epsilon_i(t)\) is the iid decomposition of \((\hat A_0(t) - A_0(t), \hat \beta- \beta)\), with \(A_0(t)\) the cumulative baseline hazard and \(\beta\) the regression coefficients.
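The form of the influence function can be motivated by a first-order expansion. The following is only a sketch under the usual regularity conditions; the shorthand \(\theta\) for the pair of baseline and regression parameters, and the explicit dependence of the prediction on \(\theta\), are notation introduced here for illustration.

```latex
\begin{align*}
\hat S(t,A=a) - S(t,A=a)
 &= n^{-1} \sum_i \left\{ S(t,A=a,X_i;\theta) - S(t,A=a) \right\}
  + n^{-1} \sum_i \left\{ S(t,A=a,X_i;\hat\theta) - S(t,A=a,X_i;\theta) \right\} \\
 &\approx n^{-1} \sum_i \left\{ S(t,A=a,X_i;\theta) - S(t,A=a) \right\}
  + E\left( D_{\theta} S(t,A=a,X_i;\theta) \right) (\hat\theta - \theta) \\
 &\approx n^{-1} \sum_i \left\{ S(t,A=a,X_i;\theta) - S(t,A=a)
  + E\left( D_{\theta} S(t,A=a,X_i;\theta) \right) \epsilon_i(t) \right\},
\end{align*}
% theta = (A_0(t), beta); the last step uses \hat\theta - \theta \approx n^{-1} \sum_i \epsilon_i(t).
```

The first term reflects the variation in the covariates over which we average, and the second term reflects the uncertainty from estimating the model parameters.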

These estimates have a causal interpretation under the assumption of no unmeasured confounders, and even without the causal assumptions the standardization can still be a useful summary measure.
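To make the averaging step concrete, here is a minimal sketch of the point estimate only, using coxph and survfit from the survival package on simulated data. The data-generating model, the variable names and the helper gsurv are hypothetical and serve only as illustration; the sketch does not compute the influence-function-based standard errors.

```r
library(survival)

# Simulated data (hypothetical): binary treatment A and one confounder X
set.seed(1)
n    <- 500
X    <- rnorm(n)
A    <- rbinom(n, 1, plogis(0.5 * X))
Tev  <- rexp(n, rate = 0.1 * exp(-0.5 * A + 0.3 * X))  # event times
Cens <- rexp(n, rate = 0.05)                           # censoring times
d <- data.frame(time = pmin(Tev, Cens), status = as.numeric(Tev <= Cens),
                A = A, X = X)

fit <- coxph(Surv(time, status) ~ A + X, data = d)

# Standardized survival at time t0 under A = a:
# set A = a for every subject, predict S(t0 | A = a, X_i), and average over i
gsurv <- function(fit, data, a, t0) {
  nd   <- data
  nd$A <- a
  sf   <- survfit(fit, newdata = nd)   # one predicted curve per subject
  mean(summary(sf, times = t0)$surv)
}

S0 <- gsurv(fit, d, a = 0, t0 = 5)
S1 <- gsurv(fit, d, a = 1, t0 = 5)
c(S0 = S0, S1 = S1, difference = S1 - S0)
```

Averaging over the observed covariates of all subjects, rather than only those who received the treatment in question, is what makes this a standardized (marginal) quantity.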

We first look at the cumulative incidence via the Fine-Gray model for the two causes, and plot the standardized cumulative incidence for cause 1.

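library(mets)
set.seed(100)

# add a small random jitter to the event times to break ties
data(bmt); bmt$time <- bmt$time+runif(nrow(bmt))*0.001
# convert tcell to a factor
dfactor(bmt) <- tcell~tcell
# all-cause event indicator (1 = any event, 0 = censored)
bmt$event <- (bmt$cause!=0)*1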

fg1 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,propodds=NULL)
summary(survivalG(fg1,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4331 0.02749 0.3793 0.4870 6.321e-56
#> risk1   0.2727 0.05863 0.1577 0.3876 3.313e-06
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err   2.5%    97.5% P-value
#> ps0  -0.1605 0.06353 -0.285 -0.03597 0.01153
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>        Estimate  Std.Err      2.5%     97.5%    P-value
#> [ps0] 0.6295004 0.139248 0.3565794 0.9024214 0.00779742
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL

fg2 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=2,propodds=NULL)
summary(survivalG(fg2,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.2127 0.02314 0.1674 0.2581 3.757e-20
#> risk1   0.3336 0.06799 0.2003 0.4668 9.281e-07
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err     2.5%  97.5% P-value
#> ps0   0.1208 0.07189 -0.02009 0.2617 0.09285
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%    97.5%   P-value
#> [ps0] 1.567915 0.3627528 0.8569321 2.278897 0.1174496
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL
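As a reading aid for the summary tables, the following sketch recomputes the confidence limits and P-value for the ps0 row of the cause 1 fit from its printed estimate and standard error. It assumes standard Wald intervals and a two-sided normal test, which appears to match the printed output up to rounding.

```r
# Values copied from the ps0 row of the cause 1 summary above
est <- -0.1605
se  <- 0.06353

z    <- est / se                               # Wald statistic
ci   <- est + c(-1, 1) * qnorm(0.975) * se     # 95% Wald confidence interval
pval <- 2 * pnorm(-abs(z))                     # two-sided P-value

round(c(lower = ci[1], upper = ci[2], p = pval), 4)
```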

cif1time <- survivalGtime(fg1,bmt)
plot(cif1time,type="risk")

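Now we look at the survival probability based on a Cox model. Note that the risk table reports the standardized risk, that is, one minus the survival probability, and that the last block of the output gives the ratio of the standardized survival probabilities.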

ss <- phreg(Surv(time,event)~tcell+platelet+age,bmt)
sss <- survivalG(ss,bmt,50)
summary(sss)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%    P-value
#> risk0   0.6539 0.02709 0.6008 0.7070 9.218e-129
#> risk1   0.5640 0.05971 0.4470 0.6811  3.531e-21
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err    2.5%   97.5% P-value
#> ps0 -0.08992  0.0629 -0.2132 0.03337  0.1529
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>        Estimate    Std.Err      2.5%    97.5%   P-value
#> [ps0] 0.8624974 0.09446477 0.6773499 1.047645 0.1455042
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%    97.5%   P-value
#> [ps0] 1.259836 0.1894627 0.8884963 1.631176 0.1702385
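The three contrasts in this output are simple functions of the two standardized risks. The sketch below recomputes them from the rounded values printed in the risk table, so agreement with the output holds only up to rounding.

```r
# Rounded values copied from the risk table of summary(sss) above
risk0 <- 0.6539
risk1 <- 0.5640

contrasts <- c(difference     = risk1 - risk0,
               risk_ratio     = risk1 / risk0,
               survival_ratio = (1 - risk1) / (1 - risk0))
round(contrasts, 4)
```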

Gtime <- survivalGtime(ss,bmt)
plot(Gtime)

G-computation for the binomial regression

We compare with similar estimates based on the doubly robust estimating equations, computed with binregATE. The standardization from the G-computation can also be computed with a specialized function (binregG) that uses less memory and is faster for large data.
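To illustrate the difference between the G-formula and a doubly robust estimator, here is a minimal base R sketch for a binary outcome without censoring. This is a simplification: binregATE additionally handles censoring, which is ignored here. The simulated data and all variable names are hypothetical.

```r
# Simulated data (hypothetical): binary outcome Y, binary treatment A, confounder X
set.seed(2)
n <- 2000
X <- rnorm(n)
A <- rbinom(n, 1, plogis(0.4 * X))
Y <- rbinom(n, 1, plogis(-0.3 - 0.7 * A + 0.5 * X))
d <- data.frame(Y = Y, A = A, X = X)

out <- glm(Y ~ A + X, family = binomial, data = d)  # outcome model
ps  <- glm(A ~ X, family = binomial, data = d)      # treatment (propensity) model
pi1 <- predict(ps, type = "response")               # P(A = 1 | X_i)

# Predicted risk for every subject under A = 1 and under A = 0
m1 <- predict(out, newdata = transform(d, A = 1), type = "response")
m0 <- predict(out, newdata = transform(d, A = 0), type = "response")

# G-formula: average the model predictions
g1 <- mean(m1)
g0 <- mean(m0)

# Doubly robust (augmented inverse probability weighted) estimator:
# G-formula prediction plus a weighted residual correction
dr1 <- mean(m1 + d$A * (d$Y - m1) / pi1)
dr0 <- mean(m0 + (1 - d$A) * (d$Y - m0) / (1 - pi1))

res <- rbind(Gformula = c(g0, g1, g1 - g0),
             DR       = c(dr0, dr1, dr1 - dr0))
colnames(res) <- c("treat0", "treat1", "treat:1-0")
round(res, 3)
```

The doubly robust estimator remains consistent if either the outcome model or the treatment model is correctly specified, whereas the G-formula relies on the outcome model alone.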


## survival situation
sr1 <- binregATE(Event(time,event)~tcell+platelet+age,bmt,cause=1,
         time=40, treat.model=tcell~platelet+age)
summary(sr1)
#> 
#>    n events
#>  408    241
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept)  0.676409  0.137007  0.407880  0.944939  0.0000
#> tcell1      -0.023675  0.346994 -0.703770  0.656420  0.9456
#> platelet    -0.492952  0.246158 -0.975412 -0.010492  0.0452
#> age          0.343939  0.115561  0.117444  0.570434  0.0029
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  1.96680 1.50363 2.5727
#> tcell1       0.97660 0.49472 1.9279
#> platelet     0.61082 0.37704 0.9896
#> age          1.41049 1.12462 1.7690
#> 
#> Average Treatment effects (G-formula) :
#>             Estimate    Std.Err       2.5%      97.5% P-value
#> treat0     0.6230976  0.0273827  0.5694284  0.6767667  0.0000
#> treat1     0.6177595  0.0731712  0.4743466  0.7611723  0.0000
#> treat:1-0 -0.0053381  0.0783973 -0.1589940  0.1483179  0.9457
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.623337  0.027508  0.569422  0.677253  0.0000
#> treat1     0.644397  0.085942  0.475954  0.812840  0.0000
#> treat:1-0  0.021059  0.090305 -0.155935  0.198054  0.8156

## relative risk effect 
estimate(coef=sr1$riskDR,vcov=sr1$var.riskDR,f=function(p) p[2]/p[1],null=1)
#>          Estimate Std.Err   2.5% 97.5% P-value
#> [treat1]    1.034  0.1453 0.7489 1.319  0.8162
#> 
#>  Null Hypothesis: 
#>   [treat1] = 1
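The standard error of the relative risk can be understood through the delta method. The sketch below applies it by hand to the printed double robust estimates. It assumes, purely for illustration, that the covariance between the two risk estimates is zero; the call above uses the full covariance matrix sr1$var.riskDR, so the numbers need only agree approximately.

```r
# Estimates and standard errors copied from the double robust table of summary(sr1)
p  <- c(treat0 = 0.623337, treat1 = 0.644397)
se <- c(0.027508, 0.085942)

# ASSUMPTION: zero covariance between the two risk estimates
V <- diag(se^2)

rr    <- unname(p[2] / p[1])                  # relative risk p1 / p0
grad  <- c(-p[2] / p[1]^2, 1 / p[1])          # gradient of p[2] / p[1]
se_rr <- sqrt(drop(t(grad) %*% V %*% grad))   # delta-method standard error
z     <- (rr - 1) / se_rr                     # test of the null hypothesis rr = 1

round(c(Estimate = rr, Std.Err = se_rr, P.value = 2 * pnorm(-abs(z))), 4)
```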

## competing risks 
br1 <- binregATE(Event(time,cause)~tcell+platelet+age,bmt,cause=1,
         time=40,treat.model=tcell~platelet+age)
summary(br1)
#> 
#>    n events
#>  408    157
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept) -0.191519  0.130883 -0.448044  0.065007  0.1434
#> tcell1      -0.712880  0.351489 -1.401786 -0.023974  0.0425
#> platelet    -0.531919  0.244495 -1.011119 -0.052718  0.0296
#> age          0.432939  0.107314  0.222607  0.643271  0.0001
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  0.82570 0.63888 1.0672
#> tcell1       0.49023 0.24616 0.9763
#> platelet     0.58748 0.36381 0.9486
#> age          1.54178 1.24933 1.9027
#> 
#> Average Treatment effects (G-formula) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417746  0.027030  0.364768  0.470724  0.0000
#> treat1     0.267097  0.061849  0.145874  0.388319  0.0000
#> treat:1-0 -0.150649  0.067578 -0.283100 -0.018199  0.0258
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417320  0.027122  0.364163  0.470478  0.0000
#> treat1     0.231149  0.060651  0.112275  0.350023  0.0001
#> treat:1-0 -0.186171  0.066053 -0.315633 -0.056710  0.0048

The same standardization can be computed with the specialized function binregG:

br1 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,time=40)
Gbr1 <- binregG(br1,data=bmt)
summary(Gbr1)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4177 0.02727 0.3643 0.4712 5.588e-53
#> risk1   0.2671 0.06183 0.1459 0.3883 1.562e-05
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err    2.5%    97.5% P-value
#> p1  -0.1506 0.06759 -0.2831 -0.01817 0.02583
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%     97.5%    P-value
#> [p1] 0.6393758 0.1538101 0.3379136 0.9408381 0.01904716
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL

## contrasting average age to +2-sd age, Avalues
Gbr2 <- binregG(br1,data=bmt,varname="age",Avalues=c(0,2))
summary(Gbr2)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.3932 0.02537 0.3434 0.4429 3.566e-54
#> risk2   0.5997 0.05544 0.4911 0.7084 2.874e-27
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err   2.5%  97.5%   P-value
#> p1   0.2066 0.04998 0.1086 0.3045 3.584e-05
#> 
#> Average Treatment effect risk-ratio (G-estimator) :
#>      Estimate   Std.Err     2.5%    97.5%      P-value
#> [p1] 1.525375 0.1324356 1.265806 1.784945 7.277532e-05
#> 
#> Average Treatment effect (1-risk=survival)-ratio (G-estimator) :
#> NULL

SessionInfo

sessionInfo()
#> R version 4.4.3 (2025-02-28)
#> Platform: aarch64-apple-darwin24.3.0
#> Running under: macOS Sequoia 15.4.1
#> 
#> Matrix products: default
#> BLAS:   /Users/kkzh/.asdf/installs/R/4.4.3/lib/R/lib/libRblas.dylib 
#> LAPACK: /Users/kkzh/.asdf/installs/R/4.4.3/lib/R/lib/libRlapack.dylib;  LAPACK version 3.12.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Copenhagen
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] splines   stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] ggplot2_3.5.1  cowplot_1.1.3  timereg_2.0.6  survival_3.8-3 mets_1.3.6    
#> 
#> loaded via a namespace (and not attached):
#>  [1] sass_0.4.9          future_1.40.0       generics_0.1.3     
#>  [4] lattice_0.22-6      listenv_0.9.1       digest_0.6.37      
#>  [7] magrittr_2.0.3      evaluate_1.0.3      grid_4.4.3         
#> [10] mvtnorm_1.3-3       fastmap_1.2.0       jsonlite_1.9.1     
#> [13] Matrix_1.7-2        scales_1.3.0        isoband_0.2.7      
#> [16] codetools_0.2-20    numDeriv_2016.8-1.1 jquerylib_0.1.4    
#> [19] lava_1.8.1          cli_3.6.4           rlang_1.1.5        
#> [22] parallelly_1.43.0   future.apply_1.11.3 munsell_0.5.1      
#> [25] withr_3.0.2         cachem_1.1.0        yaml_2.3.10        
#> [28] tools_4.4.3         parallel_4.4.3      ucminf_1.2.2       
#> [31] dplyr_1.1.4         colorspace_2.1-1    globals_0.17.0     
#> [34] vctrs_0.6.5         R6_2.6.1            lifecycle_1.0.4    
#> [37] MASS_7.3-65         pkgconfig_2.0.3     bslib_0.9.0        
#> [40] pillar_1.10.1       gtable_0.3.6        glue_1.8.0         
#> [43] Rcpp_1.0.14         xfun_0.51           tibble_3.2.1       
#> [46] tidyselect_1.2.1    knitr_1.49          farver_2.1.2       
#> [49] htmltools_0.5.8.1   rmarkdown_2.29      labeling_0.4.3     
#> [52] compiler_4.4.3
