The hardware and bandwidth for this mirror is donated by dogado GmbH, the Webhosting and Full Service-Cloud Provider. Check out our Wordpress Tutorial.
If you wish to report a bug, or if you are interested in having us mirror your free-software or open-source project, please feel free to contact us at mirror[@]dogado.de.
policyFX
with clusteredinterference
First, load the clusteredinterference
package
library(clusteredinterference)
Now load a quick data example that’s included in the package
data("toy_data")
Estimation is implemented with the policyFX()
function:
suppressWarnings(RNGversion("3.5.0")) ## For backwards compatibility
set.seed(1113)
causal_fx <- policyFX(
data = toy_data,
formula = Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID,
alphas = c(.3, .5),
k_samps = 1
)
The policyFX()
function outputs a "policyFX"
object, which works well with a few methods, including:
summary(causal_fx)
#> ------------- causal estimates --------------
#> estimand estimate se LCI UCI
#> mu(0.3) 0.6523 0.0635 0.5279 0.7767
#> mu(0.5) 0.6075 0.0505 0.5086 0.7063
#> mu0(0.3) 0.6707 0.0736 0.5264 0.8151
#> mu0(0.5) 0.5849 0.0728 0.4422 0.7276
#> mu1(0.3) 0.2799 0.0563 0.1695 0.3902
#> mu1(0.5) 0.3974 0.0577 0.2842 0.5105
#> OE(0.5,0.3) -0.0449 0.0366 -0.1165 0.0268
#> OE(0.3,0.5) 0.0449 0.0366 -0.0268 0.1165
#>
#> ... and 4 more rows ...
#>
#> -------------- treatment model -------------
#> Generalized linear mixed model fit by maximum likelihood (Adaptive
#> Gauss-Hermite Quadrature, nAGQ = 2) [glmerMod]
#> Family: binomial ( logit )
#> Formula: Treatment ~ Age + Distance + (1 | Cluster_ID)
#> Data: data
#> AIC BIC logLik deviance df.resid
#> 137.0345 147.3743 -64.5172 129.0345 94
#> Random effects:
#> Groups Name Std.Dev.
#> Cluster_ID (Intercept) 1.18
#> Number of obs: 98, groups: Cluster_ID, 30
#> Fixed Effects:
#> (Intercept) Age Distance
#> -1.44609 -0.00851 0.26097
#>
#> ------------- propensity scores -------------
#> 1 2 3 4 5 6 7 8 9 10
#> 0.105 0.162 0.086 0.102 0.167 0.045 0.244 0.0934 0.0765 0.197
#> 11 12 13 14 15 16 17 18 19 20
#> 0.0653 0.281 0.104 0.365 0.0867 0.198 0.207 0.106 0.0847 0.134
#> 21 22 23 24 25 26 27 28 29 30
#> 0.103 0.111 0.105 0.302 0.0434 0.0943 0.0443 0.0512 0.13 0.263
#> ---------------------------------------------
data
A data.frame
. At present, tibble
s are coerced back to standard data.frame
s. I also recommend against using factor
s in the columns.
alphas
A numeric vector of probabilities corresponding the the policies of interest. Each must be between 0 and 1.
k_samps
The user must specify the number of sum-sampled vectors for estimating the counterfactual probabilities (nuisance parameters). It is recommended to choose k_samps <=5
. To avoid the sub-sampling approximation and use all possible vectors, set k_samps=0
.
formula
The formula may be the trickiest, and it has plenty of information. It provides:
outcome | treatment ~ predictors and random intercept | clustering specification
Note that the middle section is passed to glmer()
to fit the mixed effects model, so this is how to specify the modeling formula.
Treatment ~ Age + Distance + (1 | Cluster_ID)
See below for the model output.
verbose = FALSE
root_options = NULL
This is for rootSolve::multiroot()
used in the point estimation procedure. E.g., this will be passed in:
root_options = list(atol = 1e-7)
nAGQ=2
This is for lme4::glmer()
. The default in glmer()
is nAGQ=1
, which indicates a Laplace approximation to the log-likelihood. Instead, in this package the default is nAGQ=2
, which indicates that n=2
Adaptive Gaussian Quadrature points will be used. This is slightly slower but is a more accurate calculation. In limited testing, it seems that nAGQ=2
was almost as accurate as higher values, so 2 was chosen as the default. See their documentation for more details.
return_matrices = FALSE
If TRUE
, this will return the “bread” and “meat” matrices in the variance calculations. The default is FALSE
as these matrices can be quite large.
In the event you’re only interested in a subset of contrasts, you can pass a customized grid into the function.
my_grid <- makeTargetGrid(alphas = (3:8)/20, small_grid = TRUE)
head(my_grid)
#> alpha1_num alpha2_num trt alpha1 alpha2 estimand effect_type estVar
#> 1 1 NA NA 0.15 NA mu outcome TRUE
#> 2 2 NA NA 0.20 NA mu outcome TRUE
#> 3 3 NA NA 0.25 NA mu outcome TRUE
#> 4 4 NA NA 0.30 NA mu outcome TRUE
#> 5 5 NA NA 0.35 NA mu outcome TRUE
#> 6 6 NA NA 0.40 NA mu outcome TRUE
This can be particularly useful for plotting, as you can “turn off” the variance estimates
my_grid$estVar <- FALSE
This is available through the dots argument. Note that when supplying a custom target_grid
, it’s not necessary to specify the alphas
argument, as that is taken directly from target_grid
.
causal_fx2 <- policyFX(
data = toy_data,
formula = Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID,
# alphas = c(.3, .5),
target_grid = my_grid,
k_samps = 5,
verbose = FALSE,
root_options = list(atol=1e-4)
)
print(causal_fx, nrows = 9)
#> ------------- causal estimates --------------
#> estimand estimate se LCI UCI
#> mu(0.3) 0.6523 0.0635 0.5279 0.77667
#> mu(0.5) 0.6075 0.0505 0.5086 0.70633
#> mu0(0.3) 0.6707 0.0736 0.5264 0.81506
#> mu0(0.5) 0.5849 0.0728 0.4422 0.72760
#> mu1(0.3) 0.2799 0.0563 0.1695 0.39020
#> mu1(0.5) 0.3974 0.0577 0.2842 0.51055
#> OE(0.5,0.3) -0.0449 0.0366 -0.1165 0.02681
#> OE(0.3,0.5) 0.0449 0.0366 -0.0268 0.11651
#> SE0(0.5,0.3) -0.0858 0.0403 -0.1648 -0.00677
#> ... and 3 more rows ...
#> ---------------------------------------------
The tidy dataframe estimates
can be easily used for plotting:
plotdat <- causal_fx2$estimates[causal_fx2$estimates$estimand_type=="mu",]
plot(x = plotdat$alpha1, y = plotdat$estimate, main = "Estimated Population Means")
As mentioned above, the treatment model is specified via the formula
argument. For example, compare:
# Returns the specified formula, coerced to a Formula object
causal_fx$formula
#> Outcome | Treatment ~ Age + Distance + (1 | Cluster_ID) | Cluster_ID
# causal_fx$model is a glmerMod S4 object
causal_fx$model@call
#> lme4::glmer(formula = Treatment ~ Age + Distance + (1 | Cluster_ID),
#> data = data, family = stats::binomial, nAGQ = nAGQ)
lme4::getME(causal_fx$model, c("beta", "theta"))
#> $beta
#> [1] -1.446087049 -0.008509771 0.260968952
#>
#> $theta
#> Cluster_ID.(Intercept)
#> 1.180325
These binaries (installable software) and packages are in development.
They may not be fully stable and should be used with caution. We make no claims about them.
Health stats visible at Monitor.