
Two-Stage Randomization for Cox-type rate models

Klaus Holst & Thomas Scheike

2025-01-11

Two-Stage Randomization for counting process outcomes

We specify rate models for the counting process \(N_1(t)\).

Under simple randomization we can estimate a marginal rate Cox model for the first randomized treatment.

Under two-stage randomization we can estimate a rate Cox model involving both randomized treatments; a sketch of the two model forms is given below.
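For orientation, a sketch of the two rate models (notation chosen here to match the examples below, with \(A_0\) the first randomized treatment and \(A_1(t)\) the second treatment switched on at its randomization time):

\[\begin{align*} \lambda(t) & = \lambda_0(t) \exp(\beta_0 A_0) && \text{(simple randomization)} \\ \lambda(t) & = \lambda_0(t) \exp(\beta_0 A_0 + \beta_1 A_1(t)) && \text{(two-stage randomization)} \end{align*}\]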

The starting point is that Cox’s partial likelihood score can be used for estimating the parameters \[\begin{align*} U(\beta) & = \int (A(t) - e(t)) dN_1(t) \end{align*}\] where \(A(t)\) collects the (possibly combined) treatments over time and \(e(t)\) is the usual at-risk average of the treatments under the Cox model.

The estimator can be augmented in different ways using additional covariates at the times of randomization and a censoring augmentation. The estimating equation that is solved is \[\begin{align*} \sum_i U_i - AUG_0 - AUG_1 + AUG_C = 0 \end{align*}\] using the covariates from augmentR0 to augment with \[\begin{align*} AUG_0 = ( A_0 - \pi_0(X_0) ) X_0 \gamma_0 \end{align*}\] where in general \(P(A_0=1|X_0)=\pi_0(X_0)\), although under randomization this probability does not depend on the covariates, and further using the covariates from augmentR1 to augment with \[\begin{align*} AUG_1 = R ( A_1 - \pi_1(X_1)) X_1 \gamma_1 \end{align*}\] where \(R\) indicates whether the second randomization takes place, and finally the dynamic censoring augmentation
\[\begin{align*} AUG_C = \int_0^t \gamma_c(s)^T (e(s) - \bar e(s)) \frac{1}{G_c(s) } dM_c(s) \end{align*}\] where \(\gamma_c(s)\) is chosen to minimize the variance given the dynamic covariates specified by augmentC.
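Schematically, the three augmentation terms correspond to the augmentR0, augmentR1, and augmentC arguments of phreg_rct. The call below only illustrates this mapping; X0var, X1var, and dynvar are placeholder covariate names, and real calls are shown in the examples that follow.

## schematic call showing how the augmentation terms map to arguments
## (placeholder covariate names X0var, X1var, dynvar and placeholder data d)
out <- phreg_rct(Event(start,stop,status)~A0.f+A1t+cluster(id),data=d,
     treat.var="treattime",    ## indicator of the randomization records
     augmentR0=~X0var,         ## covariates X_0 entering AUG_0
     augmentR1=~X1var,         ## covariates X_1 entering AUG_1
     augmentC=~dynvar)         ## dynamic covariates entering AUG_C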

The propensity score models are always estimated unless a fixed value, for example \(\pi_0=1/2\), is requested, but it is generally better to be adaptive and estimate \(\pi_0\). The coefficients \(\gamma_0\) and \(\gamma_1\) are likewise estimated to reduce the variance of \(U_i\).
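Conceptually, estimating \(\pi_0\) amounts to a logistic regression of the first treatment on the baseline covariates. A minimal stand-alone sketch, with placeholder data d and variables A0 and X0 (phreg_rct handles this internally, via treat.model when needed):

## sketch of propensity-score estimation for the first randomization
## (illustrative only; d, A0, X0 are placeholders)
ps0 <- glm(A0~X0,family=binomial,data=d)
pi0 <- predict(ps0,type="response")   ## estimated P(A0=1|X0)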

Standard errors are estimated using the influence functions of all estimators, and tests of differences between the estimators can therefore be computed subsequently.
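For example, given two estimates together with their influence-function (iid) decompositions, a Wald test of their difference can be computed directly. A sketch with placeholder objects b1, b2 (point estimates) and iid1, iid2 (per-subject contributions, assumed scaled so that the variance is the sum of squared contributions):

## sketch: test of the difference between two estimators via their iid decompositions
## (b1, b2, iid1, iid2 are placeholders, not objects returned under these names)
dif <- b1 - b2
se.dif <- sqrt(sum((iid1 - iid2)^2))   ## SE of the difference
2*pnorm(-abs(dif/se.dif))              ## Wald p-value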

The times of randomization are specified via the treat.var argument, an indicator of the records at which a randomization takes place (for example treat.var="treattime" below).

Data must be given in (start, stop, status) survival format, together with the treatment, randomization-time, and covariate variables used in the models and augmentations.

The phreg_rct function can be used for counting-process style data, and thus covers both survival outcomes and recurrent events, under simple as well as two-stage randomization, and will in all cases compute the augmentations described above. A minimal sketch of the expected data layout follows.
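The layout below is illustrative only, with made-up values mirroring the dlist output shown later: one row per at-risk interval, a randomization indicator used as treat.var, and status codes as in the simulations below.

## illustrative (start,stop,status) layout; values are made up
d <- data.frame(id=c(1,1,1,2),
     start=c(0,9.6,372.1,0), stop=c(9.6,372.1,389.8,213.9),
     status=c(1,1,0,2),       ## here 1=recurrent event, 2=death, 0=censoring
     A0=c(0,0,0,1),           ## first randomized treatment
     treattime=c(1,0,0,1))    ## 1 on the record where randomization takes place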

Simple Randomization: Lu-Tsiatis marginal Cox model

library(mets) 
set.seed(100)

## Lu, Tsiatis simulation
data <- mets:::simLT(0.7,100)
dfactor(data) <- Z.f~Z
 
out <- phreg_rct(Surv(time,status)~Z.f,data=data,augmentR0=~X,augmentC=~factor(Z):X)
summary(out)
#>                 Estimate   Std.Err       2.5%     97.5%   P-value
#> Marginal-Z.f1 0.29263400 0.2739159 -0.2442313 0.8294993 0.2853693
#> R0_C:Z.f1     0.07166242 0.2234066 -0.3662065 0.5095313 0.7483838
#> R0_dynC:Z.f1  0.08321604 0.2221710 -0.3522312 0.5186633 0.7079889
#> attr(,"class")
#> [1] "summary.phreg_rct"
###out <- phreg_rct(Surv(time,status)~Z.f,data=data,augmentR0=~X,augmentC=~X)
###out <- phreg_rct(Surv(time,status)~Z.f,data=data,augmentR0=~X,augmentC=~factor(Z):X,cens.model=~+1)

The results are consistent with those of speffSurv from the speff2trial package:

###library(speff2trial) 
library(mets)
data(ACTG175)
###
data <- ACTG175[ACTG175$arms==0 | ACTG175$arms==1, ]
data <- na.omit(data[,c("days","cens","arms","strat","cd40","cd80","age")])
data$days <- data$days+runif(nrow(data))*0.01
dfactor(data) <- arms.f~arms
notrun <- 1

if (notrun==0) { 
fit1 <- speffSurv(Surv(days,cens)~cd40+cd80+age,data=data,trt.id="arms",fixed=TRUE)
summary(fit1)
}
# 
# Treatment effect
#             Log HR       SE   LowerCI   UpperCI           p
# Prop Haz  -0.70375  0.12352  -0.94584  -0.46165  1.2162e-08
# Speff     -0.72430  0.12051  -0.96050  -0.48810  1.8533e-09

out <- phreg_rct(Surv(days,cens)~arms.f,data=data,augmentR0=~cd40+cd80+age,augmentC=~cd40+cd80+age)
summary(out)
#>                    Estimate   Std.Err       2.5%      97.5%      P-value
#> Marginal-arms.f1 -0.7036460 0.1224406 -0.9436251 -0.4636669 9.092786e-09
#> R0_C:arms.f1     -0.7265342 0.1197607 -0.9612610 -0.4918075 1.306891e-09
#> R0_dynC:arms.f1  -0.7204699 0.1196158 -0.9549125 -0.4860272 1.710025e-09
#> attr(,"class")
#> [1] "summary.phreg_rct"

The study appears to be block-randomized according to the strat variable, so the standard errors should be computed with an adjustment that is equivalent to augmenting with this block indicator as a factor:

dtable(data,~strat+arms)
#> 
#>       arms   0   1
#> strat             
#> 1          223 213
#> 2           96 106
#> 3          213 203
dfactor(data) <- strat.f~strat
out <- phreg_rct(Surv(days,cens)~arms.f,data=data,augmentR0=~strat.f)
summary(out)
#>                    Estimate   Std.Err       2.5%      97.5%      P-value
#> Marginal-arms.f1 -0.7036460 0.1224406 -0.9436251 -0.4636669 9.092786e-09
#> R0_none:arms.f1  -0.7009844 0.1217138 -0.9395390 -0.4624298 8.447051e-09
#> attr(,"class")
#> [1] "summary.phreg_rct"

Recurrent events: Simple Randomization

Recurrent events simulation with death and censoring:

n <- 1000
beta <- 0.15; 
data(base1cumhaz)
data(base4cumhaz)
data(drcumhaz)
dr <- scalecumhaz(drcumhaz,1)
base1 <- scalecumhaz(base1cumhaz,1)
base4 <- scalecumhaz(base4cumhaz,0.5)
cens <- rbind(c(0,0),c(2000,0.5),c(5110,3))
ce <- 3; betao1 <- 0

varz <- 1; dep=4; X <- z <- rgamma(n,1/varz)*varz
Z0 <- NULL
px <- 0.5
if (betao1!=0) px <- lava::expit(betao1*X)      
A0 <- rbinom(n,1,px)
r1 <- exp(A0*beta[1])
rd <- exp( A0 * 0.15)
rc <- exp( A0 * 0 )
###
rr <-    mets:::simLUCox(n,base1,death.cumhaz=dr,r1=r1,Z0=X,dependence=dep,var.z=varz,cens=ce/5000)
rr$A0 <- A0[rr$id]
rr$z1 <- attr(rr,"z")[rr$id]
rr$lz1 <- log(rr$z1)
rr$X <- rr$lz1 
rr$lX <- rr$z1
rr$statusD <- rr$status
rr <- dtransform(rr,statusD=2,death==1)
rr <- count.history(rr)
rr$Z <- rr$A0
data <- rr
data$Z.f <- as.factor(data$Z)
data$treattime <- 0
data <- dtransform(data,treattime=1,lbnr__id==1)
dlist(data,start+stop+statusD+A0+z1+treattime+Count1~id|id %in% c(4,5))
#> id: 4
#>      start   stop    statusD A0 z1    treattime Count1
#> 4      0.000   9.565 1       0  0.471 1         0     
#> 1003   9.565 372.057 1       0  0.471 0         1     
#> 1468 372.057 389.831 0       0  0.471 0         2     
#> ------------------------------------------------------------ 
#> id: 5
#>   start stop  statusD A0 z1    treattime Count1
#> 5 0     213.9 2       1  2.338 1         0

Now we fit the model

fit2 <- phreg_rct(Event(start,stop,statusD)~Z.f+cluster(id),data=data,
     treat.var="treattime",typesR=c("non","R0"),typesC=c("non","C","dynC"),
     augmentR0=~z1,augmentC=~z1+Count1)
summary(fit2)
#>                Estimate    Std.Err       2.5%     97.5%      P-value
#> Marginal-Z.f1 0.2870649 0.09632565 0.09827011 0.4758597 0.0028810700
#> non_C:Z.f1    0.2826049 0.09631924 0.09382262 0.4713871 0.0033457707
#> non_dynC:Z.f1 0.1926888 0.08864883 0.01894025 0.3664373 0.0297337758
#> R0_non:Z.f1   0.3110880 0.08049844 0.15331399 0.4688621 0.0001113067
#> R0_C:Z.f1     0.3066141 0.08049078 0.14885504 0.4643731 0.0001393570
#> R0_dynC:Z.f1  0.2164684 0.07113356 0.07704922 0.3558877 0.0023413376
#> attr(,"class")
#> [1] "summary.phreg_rct"

Two-Stage Randomization: Recurrent events

n <- 500
beta <- c(0.3,0.3); betatr <- 0.3; betac <- 0; betao <- 0; betao1 <- 0
ce <- 3; fixed <- 1; sim <- 1; dep <- 4; varz <- 1; ztr <- 0
## take possible frailty 
Z0 <- rgamma(n,1/varz)*varz
px0 <- 0.5; if (betao!=0) px0 <- lava::expit(betao*Z0)
A0 <- rbinom(n,1,px0)
r1 <- exp(A0*beta[1])
#
px1 <- 0.5; if (betao1!=0) px1 <- lava::expit(betao1*Z0)
A1 <- rbinom(n,1,px1)
r2 <- exp(A1*beta[2])
rtr <- exp(A0*betatr[1])
rr <-  mets:::simLUCox(n,base1,death.cumhaz=dr,cumhaz2=base1,rtr=rtr,betatr=0.3,A0=A0,Z0=Z0,
        r1=r1,r2=r2,dependence=dep,var.z=varz,cens=ce/5000,ztr=ztr)
rr$z1 <- attr(rr,"z")[rr$id]
rr$A1 <- A1[rr$id]
rr$A0 <- A0[rr$id]
rr$lz1 <- log(rr$z1)
rr <- count.history(rr)
rr$A1t <- 0
rr <- dtransform(rr,A1t=A1,Count2==1) 
rr$At.f <- rr$A0
rr$A0.f <- factor(rr$A0)
rr$A1.f <- factor(rr$A1)
rr <- dtransform(rr, At.f = A1, Count2 == 1)
rr$At.f <- factor(rr$At.f)
dfactor(rr)  <-  A0.f~A0
rr$treattime <- 0
rr <- dtransform(rr,treattime=1,lbnr__id==1)
rr$lagCount2 <- dlag(rr$Count2)
rr <- dtransform(rr,treattime=1,Count2==1 & (Count2!=lagCount2))
dlist(rr,start+stop+statusD+A0+A1+A1t+At.f+Count2+z1+treattime+Count1~id|id %in% c(5,10))
#> id: 5
#>   start stop  statusD A0 A1 A1t At.f Count2 z1     treattime Count1
#> 5 0     132.3 3       1  1  0   1    0      0.2316 1         0     
#> ------------------------------------------------------------ 
#> id: 10
#>     start stop    statusD A0 A1 A1t At.f Count2 z1      treattime Count1
#> 10   0.00   33.12 2       1  0  0   1    0      0.06891 1         0     
#> 509 33.12 1363.53 0       1  0  0   0    1      0.06891 1         0

Now we fit the model and compute the different augmentations (the true values are 0.3 and 0.3):

sse <- phreg_rct(Event(start,time,statusD)~A0.f+A1t+cluster(id),data=rr,
     typesR=c("non","R0","R1","R01"),typesC=c("non","C","dynC"),treat.var="treattime",
     treat.model=At.f~factor(Count2),
     augmentR0=~z1,augmentR1=~z1,augmentC=~z1+Count1+A1t)
summary(sse)
#>                 Estimate   Std.Err        2.5%     97.5%      P-value
#> Marginal-A0.f1 0.3179631 0.1418023  0.04003574 0.5958904 0.0249420566
#> Marginal-A1t   0.3290147 0.1472363  0.04043683 0.6175925 0.0254434247
#> non_C:A0.f1    0.3002782 0.1391490  0.02755115 0.5730053 0.0309308283
#> non_C:A1t      0.4151190 0.1405664  0.13961382 0.6906241 0.0031451130
#> non_dynC:A0.f1 0.3104992 0.1314476  0.05286660 0.5681318 0.0181692114
#> non_dynC:A1t   0.4374223 0.1300991  0.18243265 0.6924119 0.0007731772
#> R0_non:A0.f1   0.4142867 0.1176773  0.18364338 0.6449300 0.0004306830
#> R0_non:A1t     0.3382185 0.1470378  0.05002979 0.6264072 0.0214360247
#> R0_C:A0.f1     0.3962505 0.1144662  0.17190089 0.6206002 0.0005367253
#> R0_C:A1t       0.4242165 0.1403584  0.14911904 0.6993140 0.0025079567
#> R0_dynC:A0.f1  0.4066624 0.1049692  0.20092652 0.6123983 0.0001070147
#> R0_dynC:A1t    0.4464941 0.1298744  0.19194497 0.7010432 0.0005862621
#> R1_non:A0.f1   0.3269008 0.1416813  0.04921043 0.6045911 0.0210383383
#> R1_non:A1t     0.2104421 0.1254571 -0.03544923 0.4563335 0.0934636201
#> R1_C:A0.f1     0.3092275 0.1390258  0.03674199 0.5817130 0.0261319083
#> R1_C:A1t       0.2976811 0.1175580  0.06727171 0.5280905 0.0113347106
#> R1_dynC:A0.f1  0.3194771 0.1313171  0.06210024 0.5768540 0.0149798139
#> R1_dynC:A1t    0.3202318 0.1048176  0.11479297 0.5256706 0.0022496121
#> R01_non:A0.f1  0.4092668 0.1176428  0.17869115 0.6398424 0.0005034879
#> R01_non:A1t    0.2280764 0.1243165 -0.01557936 0.4717322 0.0665584775
#> R01_C:A0.f1    0.3912859 0.1144307  0.16700576 0.6155659 0.0006275647
#> R01_C:A1t      0.3150865 0.1163399  0.08706445 0.5431086 0.0067623432
#> R01_dynC:A0.f1 0.4016977 0.1049305  0.19603760 0.6073577 0.0001290708
#> R01_dynC:A1t   0.3375831 0.1034497  0.13482542 0.5403408 0.0011013908
#> attr(,"class")
#> [1] "summary.phreg_rct"

Causal assumptions for Two-Stage Randomization: Recurrent events

We take interest in \(N_1\) but also have the death process \(N_d\).

To link the counterfactual quantities to the observed data we need that the first randomization \(A_0\) is independent of the counterfactual outcomes given \(X_0\), and that the second randomization \(A_1\) is independent of the counterfactual outcomes given \(\bar X_1\), the history accumulated at the time \(T_R\) of the second randomization.

We must then use an IPTW-weighted Cox score and augment as before.

In addition we need the censoring to be independent given, for example, \(A_0\).
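A sketch of the IPTW-weighted score mentioned above (not necessarily the exact weighting implemented in phreg_rct): with estimated treatment probabilities \(\pi_0\) and \(\pi_1\), and \(R(t)\) indicating that the second randomization has occurred by time \(t\), \[\begin{align*} U_W(\beta) & = \int W(t) (A(t) - e_W(t)) dN_1(t), \qquad W(t) = \frac{1}{\pi_0(X_0)^{A_0} (1-\pi_0(X_0))^{1-A_0}} \left\{ \frac{1}{\pi_1(\bar X_1)^{A_1} (1-\pi_1(\bar X_1))^{1-A_1}} \right\}^{R(t)}, \end{align*}\] where \(e_W(t)\) is the corresponding weighted at-risk average of the treatments.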

To use phreg_rct in this situation we set RCT=FALSE and specify a treatment model via treat.model:

fit2 <- phreg_rct(Event(start,stop,statusD)~Z.f+cluster(id),data=data,
     treat.var="treattime",typesR=c("non","R0"),typesC=c("non","C","dynC"),
     RCT=FALSE,treat.model=Z.f~z1,augmentR0=~z1,augmentC=~z1+Count1)
summary(fit2)
#>                Estimate    Std.Err       2.5%     97.5%      P-value
#> Marginal-Z.f1 0.3111195 0.08058494 0.15317593 0.4690631 0.0001130326
#> non_C:Z.f1    0.3067348 0.08057748 0.14880581 0.4646637 0.0001408301
#> non_dynC:Z.f1 0.2169383 0.07119837 0.07739205 0.3564845 0.0023117164
#> R0_non:Z.f1   0.3111195 0.08058494 0.15317593 0.4690631 0.0001130326
#> R0_C:Z.f1     0.3067348 0.08057748 0.14880581 0.4646637 0.0001408301
#> R0_dynC:Z.f1  0.2169383 0.07119837 0.07739205 0.3564845 0.0023117164
#> attr(,"class")
#> [1] "summary.phreg_rct"

and for two-stage randomization:

sse <- phreg_rct(Event(start,time,statusD)~A0.f+A1t+cluster(id),data=rr,
     typesR=c("non","R0","R1","R01"),typesC=c("non","C","dynC"),
     treat.var="treattime",
     RCT=FALSE, treat.model=At.f~z1*factor(Count2),
     augmentR0=~z1,augmentR1=~z1,augmentC=~z1+Count1+A1t)
summary(sse)
#>                 Estimate    Std.Err        2.5%     97.5%      P-value
#> Marginal-A0.f1 0.3817476 0.13188068  0.12326622 0.6402290 0.0037958891
#> Marginal-A1t   0.2259319 0.12344157 -0.01600910 0.4678730 0.0672089296
#> non_C:A0.f1    0.3765255 0.12594711  0.12967369 0.6233773 0.0027938653
#> non_C:A1t      0.3187675 0.11256529  0.09814361 0.5393914 0.0046280182
#> non_dynC:A0.f1 0.3797091 0.11214290  0.15991306 0.5995051 0.0007093493
#> non_dynC:A1t   0.3514300 0.09297578  0.16920079 0.5336591 0.0001569535
#> R0_non:A0.f1   0.3817476 0.13188068  0.12326622 0.6402290 0.0037958891
#> R0_non:A1t     0.2259319 0.12344157 -0.01600910 0.4678730 0.0672089296
#> R0_C:A0.f1     0.3765255 0.12594711  0.12967369 0.6233773 0.0027938653
#> R0_C:A1t       0.3187675 0.11256529  0.09814361 0.5393914 0.0046280182
#> R0_dynC:A0.f1  0.3797091 0.11214290  0.15991306 0.5995051 0.0007093493
#> R0_dynC:A1t    0.3514300 0.09297578  0.16920079 0.5336591 0.0001569535
#> R1_non:A0.f1   0.3817476 0.13188068  0.12326622 0.6402290 0.0037958891
#> R1_non:A1t     0.2259319 0.12344157 -0.01600910 0.4678730 0.0672089296
#> R1_C:A0.f1     0.3765255 0.12594711  0.12967369 0.6233773 0.0027938653
#> R1_C:A1t       0.3187675 0.11256529  0.09814361 0.5393914 0.0046280182
#> R1_dynC:A0.f1  0.3797091 0.11214290  0.15991306 0.5995051 0.0007093493
#> R1_dynC:A1t    0.3514300 0.09297578  0.16920080 0.5336591 0.0001569535
#> R01_non:A0.f1  0.3817476 0.13185641  0.12331378 0.6401814 0.0037894525
#> R01_non:A1t    0.2259319 0.12307282 -0.01528637 0.4671502 0.0663934395
#> R01_C:A0.f1    0.3765255 0.12592170  0.12972349 0.6233275 0.0027883528
#> R01_C:A1t      0.3187675 0.11216079  0.09893641 0.5385986 0.0044823275
#> R01_dynC:A0.f1 0.3797091 0.11211436  0.15996900 0.5994492 0.0007071245
#> R01_dynC:A1t   0.3514300 0.09248564  0.17016144 0.5326985 0.0001447938
#> attr(,"class")
#> [1] "summary.phreg_rct"

SessionInfo

sessionInfo()
#> R version 4.4.2 (2024-10-31)
#> Platform: aarch64-apple-darwin24.2.0
#> Running under: macOS Sequoia 15.2
#> 
#> Matrix products: default
#> BLAS:   /Users/klaus/.asdf/installs/R/4.4.2/lib/R/lib/libRblas.dylib 
#> LAPACK: /Users/klaus/.asdf/installs/R/4.4.2/lib/R/lib/libRlapack.dylib;  LAPACK version 3.12.0
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> time zone: Europe/Copenhagen
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] mets_1.3.5     timereg_2.0.6  survival_3.8-3
#> 
#> loaded via a namespace (and not attached):
#>  [1] cli_3.6.3           knitr_1.49          rlang_1.1.4        
#>  [4] xfun_0.50           jsonlite_1.8.9      listenv_0.9.1      
#>  [7] future.apply_1.11.3 lava_1.8.1          htmltools_0.5.8.1  
#> [10] sass_0.4.9          rmarkdown_2.29      grid_4.4.2         
#> [13] evaluate_1.0.1      jquerylib_0.1.4     fastmap_1.2.0      
#> [16] mvtnorm_1.3-2       yaml_2.3.10         lifecycle_1.0.4    
#> [19] numDeriv_2016.8-1.1 compiler_4.4.2      codetools_0.2-20   
#> [22] ucminf_1.2.2        Rcpp_1.0.13-1       future_1.34.0      
#> [25] lattice_0.22-6      digest_0.6.37       R6_2.5.1           
#> [28] parallelly_1.41.0   parallel_4.4.2      splines_4.4.2      
#> [31] bslib_0.8.0         Matrix_1.7-1        tools_4.4.2        
#> [34] globals_0.16.3      cachem_1.1.0
