G-Computation or standardization for the Cox, Fine-Gray and binomial regression models for survival data

Klaus Holst & Thomas Scheike

2024-02-16

G-computation for the Cox and Fine-Gray models

We compute the standardized estimate (the G-estimator) based on the Cox or Fine-Gray model: \[ \hat S(t,A=a) = n^{-1} \sum_i \hat S(t,A=a,X_i) \] This estimator has influence function \[ S(t,A=a,X_i) - S(t,A=a) + E( D_{A_0(t), \beta} S(t,A=a,X_i) ) \epsilon_i(t) \] where \(D_{A_0(t), \beta}\) denotes the derivative with respect to the baseline \(A_0(t)\) and the regression coefficients \(\beta\), and \(\epsilon_i(t)\) is the iid decomposition of \((\hat A_0(t) - A_0(t), \hat \beta- \beta)\).

These estimates have a causal interpretation under the assumption of no unmeasured confounders. Even without the causal assumptions, the standardization can still be a useful summary measure.
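
To make the standardization formula concrete, the following is a minimal sketch that computes it by hand for a Cox model. It uses simulated data and coxph/survfit from the survival package rather than the mets functions, so all variable names and simulation parameters are assumptions for illustration only, and no standard errors are computed.

```r
library(survival)  # recommended package shipped with R

## simulated data (hypothetical): confounder x, binary treatment a
set.seed(1)
n <- 1000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))             # treatment depends on x
t_event <- rexp(n, rate = 0.02 * exp(-0.7 * a + 0.5 * x))
t_cens  <- runif(n, 0, 100)
d <- data.frame(time = pmin(t_event, t_cens),
                status = as.numeric(t_event <= t_cens),
                a = a, x = x)

fit <- coxph(Surv(time, status) ~ a + x, data = d)

## standardized risk at time t0 when every subject is assigned A = a_val
std_risk <- function(a_val, t0 = 50) {
  nd <- transform(d, a = a_val)                # same X_i, treatment fixed
  sf <- survfit(fit, newdata = nd)             # one predicted curve per subject
  1 - mean(summary(sf, times = t0)$surv)       # 1 - n^{-1} sum_i S(t0, a, X_i)
}

risk <- c(risk0 = std_risk(0), risk1 = std_risk(1))
risk
risk["risk1"] - risk["risk0"]   # standardized risk difference
risk["risk1"] / risk["risk0"]   # standardized risk ratio
```

The averaging is over the observed covariates of all subjects, whichever treatment they actually received; only the treatment value is replaced.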

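library(mets)
set.seed(100)

## bmt data; add a small random jitter to the event times to break ties
data(bmt); bmt$time <- bmt$time+runif(nrow(bmt))*0.001
## convert tcell to a factor; it is the treatment variable in the analyses below
dfactor(bmt) <- tcell~tcell
## indicator of any event (cause different from 0), used for the survival analysis
bmt$event <- (bmt$cause!=0)*1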

## Fine-Gray model for cause 1, standardized risks at time 50
fg1 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,
          cox.prep=TRUE,propodds=NULL)
summary(survivalG(fg1,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4331 0.02749 0.3793 0.4870 6.321e-56
#> risk1   0.2727 0.05863 0.1577 0.3876 3.313e-06
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err   2.5%    97.5% P-value
#> ps0  -0.1605 0.06353 -0.285 -0.03597 0.01153
#> 
#> Average Treatment effect ratio (G-estimator) :
#>        Estimate  Std.Err      2.5%     97.5%    P-value
#> [ps0] 0.6295004 0.139248 0.3565794 0.9024214 0.00779742

## Fine-Gray model for cause 2, standardized risks at time 50
fg2 <- cifreg(Event(time,cause)~tcell+platelet+age,bmt,cause=2,
          cox.prep=TRUE,propodds=NULL)
summary(survivalG(fg2,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.2127 0.02314 0.1674 0.2581 3.757e-20
#> risk1   0.3336 0.06799 0.2003 0.4668 9.281e-07
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err     2.5%  97.5% P-value
#> ps0   0.1208 0.07189 -0.02009 0.2617 0.09285
#> 
#> Average Treatment effect ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%    97.5%   P-value
#> [ps0] 1.567915 0.3627528 0.8569321 2.278897 0.1174496

## Cox model for any event, standardized risks at time 50
ss <- phreg(Surv(time,event)~tcell+platelet+age,bmt)
summary(survivalG(ss,bmt,50))
#> risk:
#>       Estimate Std.Err   2.5%  97.5%    P-value
#> risk0   0.6539 0.02709 0.6008 0.7070 9.218e-129
#> risk1   0.5640 0.05971 0.4470 0.6811  3.531e-21
#> 
#> Average Treatment effects (G-estimator) :
#>     Estimate Std.Err    2.5%   97.5% P-value
#> ps0 -0.08992  0.0629 -0.2132 0.03337  0.1529
#> 
#> Average Treatment effect ratio (G-estimator) :
#>        Estimate    Std.Err      2.5%    97.5%   P-value
#> [ps0] 0.8624974 0.09446477 0.6773499 1.047645 0.1455042
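
As a check on how to read these summaries, the treatment effect and the ratio can be reproduced, up to rounding, from the printed risks as risk1 - risk0 and risk1 / risk0. The sketch below only re-uses numbers copied from the output above. The last lines suggest that the reported P-value for the ratio is a Wald test against the null value 1; this is inferred from the printed numbers, not taken from the package documentation.

```r
## standardized risks at time 50 (risk0, risk1), copied from the printed output
risks <- rbind(fg_cause1 = c(0.4331, 0.2727),
               fg_cause2 = c(0.2127, 0.3336),
               cox       = c(0.6539, 0.5640))
colnames(risks) <- c("risk0", "risk1")

ate   <- risks[, "risk1"] - risks[, "risk0"]   # average treatment effect
ratio <- risks[, "risk1"] / risks[, "risk0"]   # average treatment effect ratio
round(cbind(ate, ratio), 4)

## Wald P-value for the cause 1 ratio (estimate and Std.Err from the output)
z <- (0.6295004 - 1) / 0.139248
p_ratio <- 2 * pnorm(-abs(z))
p_ratio
```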

G-computation for the binomial regression

We compare with similar estimates based on the doubly robust estimating equations, computed by binregATE. The standardization from the G-computation can also be computed with a specialized function, binregG, which uses less memory and is quicker for large data.
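
To illustrate how a doubly robust estimate differs from plain G-computation, here is a minimal sketch of the standard augmented inverse probability weighted estimator. It assumes simulated data and a fully observed binary outcome (no censoring), and it uses base R glm for both the outcome and the treatment model, so it is not what binregATE computes internally; all names are hypothetical.

```r
## simulated data (hypothetical): confounder x, treatment a, binary outcome y
set.seed(1)
n <- 2000
x <- rnorm(n)
a <- rbinom(n, 1, plogis(0.5 * x))
y <- rbinom(n, 1, plogis(-0.3 - 0.7 * a + 0.5 * x))
d <- data.frame(y, a, x)

out <- glm(y ~ a + x, family = binomial, data = d)   # outcome model
ps  <- glm(a ~ x, family = binomial, data = d)       # treatment model

## predicted risks for every subject under a = 0 and a = 1
m0  <- predict(out, newdata = transform(d, a = 0), type = "response")
m1  <- predict(out, newdata = transform(d, a = 1), type = "response")
pi1 <- predict(ps, type = "response")                # P(A = 1 | X)

## G-formula: average the outcome-model predictions
g <- c(treat0 = mean(m0), treat1 = mean(m1))

## doubly robust: add inverse probability weighted residuals
dr <- c(treat0 = mean(m0 + (1 - a) * (y - m0) / (1 - pi1)),
        treat1 = mean(m1 + a * (y - m1) / pi1))

rbind(g, dr)
```

The doubly robust version remains consistent if either the outcome model or the treatment model is correctly specified, which is why the two sets of estimates can differ somewhat in practice.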


## survival situation
sr1 <- binregATE(Event(time,event)~tcell+platelet+age,bmt,cause=1,
         time=40, treat.model=tcell~platelet+age)
summary(sr1)
#> 
#>    n events
#>  408    241
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept)  0.679693  0.138551  0.408138  0.951248  0.0000
#> tcell1      -0.032018  0.353415 -0.724698  0.660662  0.9278
#> platelet    -0.504940  0.248245 -0.991492 -0.018387  0.0419
#> age          0.315033  0.117786  0.084178  0.545889  0.0075
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  1.97327 1.50401 2.5889
#> tcell1       0.96849 0.48447 1.9361
#> platelet     0.60354 0.37102 0.9818
#> age          1.37030 1.08782 1.7261
#> 
#> Average Treatment effects (G-formula) :
#>             Estimate    Std.Err       2.5%      97.5% P-value
#> treat0     0.6233534  0.0274214  0.5696085  0.6770983   0.000
#> treat1     0.6161006  0.0748225  0.4694512  0.7627499   0.000
#> treat:1-0 -0.0072528  0.0802736 -0.1645862  0.1500805   0.928
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.623341  0.027505  0.569433  0.677249   0.000
#> treat1     0.645159  0.085872  0.476853  0.813465   0.000
#> treat:1-0  0.021818  0.090254 -0.155076  0.198711   0.809

## relative risk effect 
estimate(coef=sr1$riskDR,vcov=sr1$var.riskDR,f=function(p) p[2]/p[1],null=1)
#>          Estimate Std.Err   2.5% 97.5% P-value
#> [treat1]    1.035  0.1453 0.7503  1.32  0.8096
#> 
#>  Null Hypothesis: 
#>   [treat1] = 1
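
The relative risk and its standard error can be checked by hand with the delta method, which we believe is what estimate applies to the function f. The sketch below uses only the printed double robust estimates; the covariance between treat0 and treat1 is not printed, so it is recovered here from the standard error of the difference, which is an assumption of this illustration.

```r
## printed double robust estimates and standard errors (survival situation)
p  <- c(treat0 = 0.623341, treat1 = 0.645159)
se <- c(treat0 = 0.027505, treat1 = 0.085872)
se_diff <- 0.090254                        # Std.Err of treat:1-0

## recover the covariance from var(p1 - p0) = v0 + v1 - 2 * cov
cov01 <- (sum(se^2) - se_diff^2) / 2
V <- matrix(c(se[1]^2, cov01, cov01, se[2]^2), 2, 2)

## delta method for f(p) = p[2] / p[1]
f    <- function(p) p[2] / p[1]
grad <- c(-p[2] / p[1]^2, 1 / p[1])        # df/dp[1], df/dp[2]
rr    <- unname(f(p))
se_rr <- sqrt(drop(t(grad) %*% V %*% grad))

## Wald test of the null hypothesis rr = 1
z <- (rr - 1) / se_rr
round(c(Estimate = rr, Std.Err = se_rr, P.value = 2 * pnorm(-abs(z))), 4)
```

Up to rounding of the inputs, this agrees with the Estimate, Std.Err and P-value reported by estimate above.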

## competing risks 
br1 <- binregATE(Event(time,cause)~tcell+platelet+age,bmt,cause=1,
         time=40,treat.model=tcell~platelet+age)
summary(br1)
#> 
#>    n events
#>  408    157
#> 
#>  408 clusters
#> coeffients:
#>              Estimate   Std.Err      2.5%     97.5% P-value
#> (Intercept) -0.188365  0.130922 -0.444967  0.068237  0.1502
#> tcell1      -0.715361  0.352473 -1.406195 -0.024527  0.0424
#> platelet    -0.537310  0.244804 -1.017117 -0.057502  0.0282
#> age          0.417814  0.107282  0.207545  0.628084  0.0001
#> 
#> exp(coeffients):
#>             Estimate    2.5%  97.5%
#> (Intercept)  0.82831 0.64085 1.0706
#> tcell1       0.48902 0.24507 0.9758
#> platelet     0.58432 0.36164 0.9441
#> age          1.51864 1.23065 1.8740
#> 
#> Average Treatment effects (G-formula) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417795  0.027029  0.364819  0.470771  0.0000
#> treat1     0.266393  0.062041  0.144795  0.387991  0.0000
#> treat:1-0 -0.151402  0.067763 -0.284214 -0.018589  0.0255
#> 
#> Average Treatment effects (double robust) :
#>            Estimate   Std.Err      2.5%     97.5% P-value
#> treat0     0.417337  0.027120  0.364184  0.470491  0.0000
#> treat1     0.231224  0.060718  0.112218  0.350229  0.0001
#> treat:1-0 -0.186114  0.066117 -0.315700 -0.056527  0.0049

The same G-computation can be carried out with the specialized function binregG:

br1 <- binreg(Event(time,cause)~tcell+platelet+age,bmt,cause=1,time=40)
Gbr1 <- binregG(br1,data=bmt)
summary(Gbr1)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.4178 0.02703 0.3648 0.4708 6.752e-54
#> risk1   0.2664 0.06204 0.1448 0.3880 1.756e-05
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err    2.5%    97.5% P-value
#> p1  -0.1514 0.06776 -0.2842 -0.01859 0.02546
#> 
#> Average Treatment effect ratio (G-estimator) :
#>       Estimate   Std.Err      2.5%     97.5%    P-value
#> [p1] 0.6376167 0.1542628 0.3352673 0.9399661 0.01881733

## contrasting average age (0) with age 2 sd above average, set via Avalues
Gbr2 <- binregG(br1,data=bmt,varname="age",Avalues=c(0,2))
summary(Gbr2)
#> risk:
#>       Estimate Std.Err   2.5%  97.5%   P-value
#> risk0   0.3935 0.02529 0.3439 0.4431 1.432e-54
#> risk2   0.5929 0.05580 0.4836 0.7023 2.261e-26
#> 
#> Average Treatment effects (G-estimator) :
#>    Estimate Std.Err   2.5%  97.5%   P-value
#> p1   0.1994 0.05019 0.1011 0.2978 7.069e-05
#> 
#> Average Treatment effect ratio (G-estimator) :
#>      Estimate  Std.Err     2.5%    97.5%     P-value
#> [p1] 1.506855 0.132196 1.247756 1.765954 0.000126016
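
The contrast for a continuous covariate follows the same recipe as for the treatment: set the covariate to each value of interest for all subjects and average the predicted risks. The sketch below shows this with a plain logistic regression on simulated data without censoring. The helper standardize and its arguments are hypothetical and only mimic the varname and Avalues arguments; it is not the binregG implementation and gives no standard errors.

```r
## hypothetical helper: average predicted risk with one covariate fixed
standardize <- function(fit, data, varname, values) {
  sapply(values, function(v) {
    nd <- data
    nd[[varname]] <- v                      # same value for every subject
    mean(predict(fit, newdata = nd, type = "response"))
  })
}

## simulated data (hypothetical): standardized age, binary tcell, binary outcome
set.seed(2)
n <- 1000
d <- data.frame(age = rnorm(n), tcell = rbinom(n, 1, 0.2))
d$y <- rbinom(n, 1, plogis(-0.2 - 0.7 * d$tcell + 0.4 * d$age))

fit <- glm(y ~ tcell + age, family = binomial, data = d)

risk <- standardize(fit, d, "age", c(0, 2))
names(risk) <- c("risk0", "risk2")
risk
risk["risk2"] - risk["risk0"]   # difference
risk["risk2"] / risk["risk0"]   # ratio
```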

SessionInfo

sessionInfo()
#> R version 4.3.2 (2023-10-31)
#> Platform: aarch64-apple-darwin22.6.0 (64-bit)
#> Running under: macOS Sonoma 14.3.1
#> 
#> Matrix products: default
#> BLAS:   /Users/kkzh/.asdf/installs/R/4.3.2/lib/R/lib/libRblas.dylib 
#> LAPACK: /Users/kkzh/.asdf/installs/R/4.3.2/lib/R/lib/libRlapack.dylib;  LAPACK version 3.11.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Copenhagen
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] splines   stats     graphics  grDevices utils     datasets  methods  
#> [8] base     
#> 
#> other attached packages:
#> [1] ggplot2_3.4.4  cowplot_1.1.1  mets_1.3.4     timereg_2.0.5  survival_3.5-7
#> 
#> loaded via a namespace (and not attached):
#>  [1] sass_0.4.7          utf8_1.2.4          future_1.33.1      
#>  [4] generics_0.1.3      lattice_0.22-5      listenv_0.9.1      
#>  [7] digest_0.6.34       magrittr_2.0.3      evaluate_0.23      
#> [10] grid_4.3.2          mvtnorm_1.2-4       fastmap_1.1.1      
#> [13] jsonlite_1.8.8      Matrix_1.6-5        fansi_1.0.6        
#> [16] scales_1.2.1        isoband_0.2.7       codetools_0.2-19   
#> [19] numDeriv_2016.8-1.1 jquerylib_0.1.4     lava_1.7.4         
#> [22] cli_3.6.2           rlang_1.1.3         parallelly_1.37.0  
#> [25] future.apply_1.11.1 munsell_0.5.0       withr_3.0.0        
#> [28] cachem_1.0.8        yaml_2.3.7          tools_4.3.2        
#> [31] parallel_4.3.2      ucminf_1.2.0        dplyr_1.1.3        
#> [34] colorspace_2.1-0    globals_0.16.2      vctrs_0.6.5        
#> [37] R6_2.5.1            lifecycle_1.0.4     MASS_7.3-60        
#> [40] pkgconfig_2.0.3     bslib_0.5.1         pillar_1.9.0       
#> [43] gtable_0.3.4        glue_1.7.0          Rcpp_1.0.12        
#> [46] tidyselect_1.2.0    xfun_0.41           tibble_3.2.1       
#> [49] highr_0.10          knitr_1.45          farver_2.1.1       
#> [52] htmltools_0.5.6.1   labeling_0.4.3      rmarkdown_2.25     
#> [55] compiler_4.3.2
