Installing the latest development version.
library(devtools)
install_github("NSAPH-Software/CRE", ref="develop")
library("CRE")
Installing from CRAN.
install.packages("CRE")
Input parameters:
y: The observed response (outcome) vector.
z: The treatment (exposure, policy) vector.
X: The covariate matrix.
method_params: Parameters for estimating the individual treatment effect, including:
Parameters for Discovery
- ratio_dis: The ratio of data delegated to the discovery sub-sample.
- ite_method_dis: The method used to estimate the individual treatment effect (ITE) on the discovery sub-sample.
- include_ps_dis: Whether or not to include the propensity score estimate as a covariate in discovery ITE estimation; considered only for BART or CF.
- ps_method_dis: The estimation model for the propensity score on the discovery sub-sample.
- or_method_dis: The estimation model for the outcome regressions in estimate_ite_aipw on the discovery sub-sample.
Parameters for Inference
- ite_method_inf: The method used to estimate the ITE on the inference sub-sample.
- include_ps_inf: Whether or not to include the propensity score estimate as a covariate in inference ITE estimation; considered only for BART or CF.
- ps_method_inf: The estimation model for the propensity score on the inference sub-sample.
- or_method_inf: The estimation model for the outcome regressions in estimate_ite_aipw on the inference sub-sample.
Other Parameters
- include_offset: Whether or not to include an offset when estimating the ITE; for Poisson only.
- offset_name: The name of the offset, if it is to be included.
- cate_method: The method to estimate the conditional average treatment effect (CATE) values.
- cate_SL_library: The library used if cate_method is set to DRLearner.
- filter_cate: Whether or not to filter rules with p-value <= 0.05.
hyper_params: The list of parameters required to tune the functions, including:
- effect_modifiers: Effect modifiers for rules generation.
- ntrees_rf: The number of decision trees for randomForest.
- ntrees_gbm: The number of decision trees for gradient boosting.
- node_size: The minimum size of the trees’ terminal nodes.
- max_nodes: The maximum number of terminal nodes that trees in the forest can have.
- max_depth: The number of top levels from each tree considered to extract conditions.
- replace: Boolean variable for replacement in bootstrapping.
- max_decay: Decay threshold for pruning the rules.
- type_decay: Decay type for pruning the rules (1: relative error; 2: error).
- t_anom: The threshold to define too generic or too specific (anomalous) rules.
- t_corr: The threshold to define correlated rules.
- q: Number of (unique) selected rules per sub-sample in stability selection.
- stability_selection: Whether or not to use stability selection for selecting the causal rules.
- pfer_val: The Per-Family Error Rate, the expected number of false discoveries.
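The ratio_dis parameter listed above sets the share of observations assigned to the discovery sub-sample, with the remainder left for inference. As a rough illustration of that idea only, the base-R sketch below splits row indices at random. The helper split_discovery_inference() is hypothetical and is not a CRE function; the package's internal splitting may differ.

```r
# Illustrative only: split row indices into discovery and inference sub-samples.
# split_discovery_inference() is a hypothetical helper, not part of CRE.
split_discovery_inference <- function(n, ratio_dis) {
  stopifnot(ratio_dis > 0, ratio_dis < 1)
  n_dis <- round(n * ratio_dis)                  # size of the discovery sub-sample
  dis_idx <- sort(sample.int(n, size = n_dis))   # random draw without replacement
  list(discovery = dis_idx,
       inference = setdiff(seq_len(n), dis_idx)) # everything not used for discovery
}

set.seed(1)
idx <- split_discovery_inference(n = 300, ratio_dis = 0.25)
length(idx$discovery)  # 75
length(idx$inference)  # 225
```

With n = 300 and ratio_dis = 0.25, 75 observations go to discovery and 225 to inference, and the two index sets do not overlap.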
(1) Options for the ITE estimation are as follows:
- ipw
- sipw
- aipw
- oreg
- bart
- bcf
- cf
- poisson
(2) The include_ps_dis and include_ps_inf arguments will only be considered if the ITE method selected is bart or cf.
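The two notes above can be expressed as a small pre-flight check before building method_params. The sketch below is a hypothetical helper, not part of the CRE API: it only encodes the option names listed in note (1) and the bart/cf condition from note (2), and assumes method names are matched exactly as written.

```r
# Hypothetical helper (not part of CRE): checks an ITE method name against the
# options listed in note (1) and reports whether include_ps would be considered
# according to note (2).
ite_methods <- c("ipw", "sipw", "aipw", "oreg", "bart", "bcf", "cf", "poisson")
ps_as_covariate_methods <- c("bart", "cf")

check_ite_settings <- function(ite_method, include_ps = FALSE) {
  if (!ite_method %in% ite_methods) {
    stop("Unknown ITE method '", ite_method, "'. Expected one of: ",
         paste(ite_methods, collapse = ", "))
  }
  # The propensity score is only considered as a covariate for bart and cf.
  ps_used <- isTRUE(include_ps) && ite_method %in% ps_as_covariate_methods
  if (isTRUE(include_ps) && !ps_used) {
    warning("include_ps is not considered for ITE method '", ite_method, "'")
  }
  list(ite_method = ite_method, ps_used = ps_used)
}

check_ite_settings("bart", include_ps = TRUE)$ps_used                    # TRUE
suppressWarnings(check_ite_settings("aipw", include_ps = TRUE)$ps_used)  # FALSE
```

Running a check like this on ite_method_dis and ite_method_inf before calling cre() makes it obvious when include_ps_dis or include_ps_inf will have no effect.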
The CRE package can generate synthetic data for testing its features. In the current implementation, the code can generate data with continuous or binary outcomes.
set.seed(9687)
dataset_cont <- generate_cre_dataset(n = 300, rho = 0, n_rules = 2, p = 10,
                                     effect_size = 2, binary = FALSE)
y <- dataset_cont[["y"]]
z <- dataset_cont[["z"]]
X <- as.data.frame(dataset_cont[["X"]])
X_names <- names(as.data.frame(X))

method_params = list(ratio_dis = 0.25,
                     ite_method_dis = "bart",
                     ps_method_dis = "SL.xgboost",
                     oreg_method_dis = "SL.xgboost",
                     include_ps_dis = TRUE,
                     ite_method_inf = "bart",
                     ps_method_inf = "SL.xgboost",
                     oreg_method_inf = "SL.xgboost",
                     include_ps_inf = TRUE,
                     include_offset = FALSE,
                     cate_method = "DRLearner",
                     cate_SL_library = "SL.xgboost",
                     filter_cate = FALSE,
                     offset_name = NA,
                     random_state = 3591)

hyper_params = list(ntrees_rf = 100,
                    ntrees_gbm = 50,
                    node_size = 20,
                    max_nodes = 5,
                    max_depth = 15,
                    max_decay = 0.025,
                    type_decay = 2,
                    t_anom = 0.025,
                    t_corr = 1,
                    replace = FALSE,
                    q = 0.8,
                    stability_selection = TRUE,
                    pfer_val = 0.1)

cre_results <- cre(y, z, X, method_params, hyper_params)