This vignette showcases how policy_learn() and policy_eval() can be combined to estimate and evaluate the optimal subgroup in the single-stage case. We refer to (Nordland and Holst 2023) for the syntax and methodological context.
From here on we consider the single-stage case with a binary action set \(\{0,1\}\). For a given threshold \(\eta > 0\) we can formulate the optimal subgroup function via the conditional average treatment effect (CATE/blip) as
\[\begin{align*} d^\eta_0(v) = I\{B_0(v) > \eta\}, \end{align*}\] where \(B_0\) is the CATE defined as \[\begin{align*} B_0(v) = E\left[U^{(1)} - U^{(0)} \big | V = v \right]. \end{align*}\]
The average treatment effect in the optimal subgroup is now defined as \[\begin{align*} E\left[U^{(1)} - U^{(0)} \big | d^\eta_0(V) = 1 \right], \end{align*}\]
which under consistency, positivity and randomization is identified as
\[\begin{align*} E\left[Z(1,g_0,Q_0)(O) - Z(0,g_0,Q_0)(O) \big | d^\eta_0(V) = 1 \right], \end{align*}\]
where \(Z(a,g,Q)(O)\) is the doubly robust score for treatment \(a\) and
\[\begin{align*} d^\eta_0(v) &= I\{B_0(v) > \eta\}\\ B_0(v) &= E\left[Z(1,g_0,Q_0)(O) - Z(0,g_0,Q_0)(O) \big | V = v \right]. \end{align*}\]
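As a plain R illustration of the threshold rule (a sketch only; threshold_policy is a hypothetical helper, not part of the polle API):
## turn (estimated) blip values into subgroup membership for a threshold eta
threshold_policy <- function(blip, eta) as.integer(blip > eta)
threshold_policy(blip = c(-0.5, 0.7, 2.3), eta = 1)
## [1] 0 0 1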
In polle, the threshold policy \(d^\eta\) can be estimated using policy_learn() via the threshold argument, and the average treatment effect in the subgroup can be estimated using policy_eval() by setting target = "subgroup".
Here we consider an example using simulated data:
library(polle)      # policy_data(), policy_learn(), policy_eval(), ...
library(data.table) # data.table()

par0 <- list(a = 1, b = 0, c = 3)
sim_d <- function(n, par = par0, potential_outcomes = FALSE) {
  W <- runif(n = n, min = -1, max = 1)
  L <- runif(n = n, min = -1, max = 1)
  A <- rbinom(n = n, size = 1, prob = 0.5)
  U1 <- W + L + (par$c * W + par$a * L + par$b) # U^(1)
  U0 <- W + L                                   # U^(0)
  U <- A * U1 + (1 - A) * U0 + rnorm(n = n)     # observed utility
  out <- data.table(W = W, L = L, A = A, U = U)
  if (potential_outcomes == TRUE) {
    out$U0 <- U0
    out$U1 <- U1
  }
  return(out)
}
Note that in this simple case \(U^{(1)} - U^{(0)} = cW + aL + b\).
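A quick sanity check of this identity against the simulator (a sketch; the object name chk is ours):
chk <- sim_d(n = 5, potential_outcomes = TRUE)
## with par0 = list(a = 1, b = 0, c = 3): U^(1) - U^(0) = 3*W + 1*L + 0
all.equal(chk$U1 - chk$U0, with(par0, c * chk$W + a * chk$L + b))
## [1] TRUE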
set.seed(1)
d <- sim_d(n = 200)
pd <- policy_data(
  d,
  action = "A",
  covariates = list("W", "L"),
  utility = "U"
)
We define a policy learner using policy_learn() with type = "blip", a correctly specified blip model, and a threshold of \(\eta = 1\):
pl1 <- policy_learn(
  type = "blip",
  control = control_blip(blip_models = q_glm(~ W + L)),
  threshold = 1
)
We then apply the policy learner based on the correctly specified nuisance models. Furthermore, we extract the corresponding policy actions, where \(d_N(W,L) = 1\) identifies the optimal subgroup for \(\eta = 1\):
po1 <- pl1(
  policy_data = pd,
  g_models = g_glm(~ 1),
  q_models = q_glm(~ A * (W + L))
)
pf1 <- get_policy(po1)
pa <- pf1(pd)
In the following plot, the black line indicates the boundary for the true optimal subgroup. The dots represent the estimated threshold policy:
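A minimal sketch of such a plot (not the original vignette code; we assume the policy actions returned by pf1(pd) have one row per observation with the action in a column named d, aligned with the rows of the simulated data):
subgroup <- pa$d == "1"
plot(d$W, d$L,
     col = ifelse(subgroup, "black", "grey"),
     pch = ifelse(subgroup, 19, 1),
     xlab = "W", ylab = "L")
## true boundary for eta = 1: c*W + a*L + b = 1, i.e. L = 1 - 3*W under par0
abline(a = 1, b = -3, lwd = 2)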
Similarly, we can use type = "ptl" to fit a policy tree with a given threshold for not choosing the reference action, i.e., the first action in the action set in alphabetical order (here "0"):
pl1_ptl <- policy_learn(
  type = "ptl",
  control = control_ptl(policy_var = c("W", "L")),
  threshold = 1
)
## Loading required namespace: policytree
po1_ptl <- pl1_ptl(
  policy_data = pd,
  g_models = g_glm(~ 1),
  q_models = q_glm(~ A * (W + L))
)
po1_ptl$ptl_objects
## $stage_1
## $stage_1$threshold_1
## policy_tree object
## Tree depth: 2
## Actions: 1: 0 2: 1
## Variable splits:
## (1) split_variable: W split_value: -0.0948583
## (2) split_variable: W split_value: -0.107529
## (4) * action: 1
## (5) * action: 2
## (3) split_variable: W split_value: 0.197522
## (6) * action: 1
## (7) * action: 2
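As with the blip learner, the fitted threshold policy tree can be turned into a policy function and applied to the data (a sketch; we assume get_policy() uses the single fitted threshold by default):
pf1_ptl <- get_policy(po1_ptl)
head(pf1_ptl(pd))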
The true subgroup average treatment effect is given by:
\[\begin{align*} E[cW + aL + b | cW + aL + b \geq \eta ], \end{align*}\]
which we can easily approximate:
set.seed(1)
approx <- sim_d(n = 1e7, potential_outcomes = TRUE)
(sate <- with(approx, mean((U1 - U0)[(U1 - U0 >= 1)])))
## [1] 2.082982
The subgroup average treatment effect associated with the learned optimal threshold policy can be directly estimated using policy_eval() via the target argument:
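One such call could look as follows (a sketch mirroring the multi-threshold example further down, using the single-threshold learner pl1; the object name pe1 is ours and is reused later):
(pe1 <- policy_eval(
  policy_data = pd,
  g_models = g_glm(~ 1),
  q_models = q_glm(~ A * (W + L)),
  policy_learn = pl1,
  target = "subgroup"
))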
## Estimate Std.Err 2.5% 97.5% P-value
## E[Z(1)-Z(0)|d=1]: d=blip(eta=1) 1.941 0.2614 1.428 2.453 1.136e-13
We can also estimate the subgroup average treatment effect for a set of thresholds at once:
pl_set <- policy_learn(
  type = "blip",
  control = control_blip(blip_models = q_glm(~ W + L)),
  threshold = c(0, 1)
)
policy_eval(
  policy_data = pd,
  g_models = g_glm(~ 1),
  q_models = q_glm(~ A * (W + L)),
  policy_learn = pl_set,
  target = "subgroup"
)
## Estimate Std.Err 2.5% 97.5% P-value
## E[Z(1)-Z(0)|d=1]: d=blip(eta=0) 1.641 0.2161 1.217 2.064 3.118e-14
## E[Z(1)-Z(0)|d=1]: d=blip(eta=1) 1.935 0.2612 1.423 2.447 1.268e-13
The data adaptive target parameter
\[\begin{align*} E[U^{(1)} - U^{(0)}| d_N(V) = 1] = E[Z_0(1,g,Q)(O) - Z_0(0,g,Q)(O)| d_N(V) = 1] \end{align*}\]
is asymptotically normal with influence function
\[\begin{align*} \frac{1}{P(d'(\cdot) = 1)} I\{d'(\cdot) = 1\}\left\{Z(1,g,Q)(O) - Z(0,g,Q)(O) - E[Z(1,g,Q)(O) - Z(0,g,Q)(O) | d'(\cdot) = 1]\right\}, \end{align*}\]
where \(d'\) is the limiting policy of \(d_N\). The fitted influence curve can be extracted using IC():
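A sketch, assuming the single-threshold evaluation was stored as pe1 as above (observations outside the estimated subgroup contribute zero):
head(IC(pe1))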
## [,1]
## [1,] 0.000000
## [2,] 0.000000
## [3,] 0.000000
## [4,] -1.780405
## [5,] 0.000000
## [6,] 9.520253
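As a rough check (a sketch, under the assumption that the reported standard error corresponds to the empirical standard deviation of the fitted influence curve divided by \(\sqrt{n}\)):
## should be close to the Std.Err column reported by policy_eval() above
sqrt(var(IC(pe1)[, 1]) / nrow(d))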