This article is a brief introduction to civ.
library(civ)
library(AER)
set.seed(517938)
To illustrate civ on a simple example, consider the data-generating process from the simulation of Wiemann (2023). The code snippet below draws a sample of size \(n=800\).
# Set seed
set.seed(51944)

# Sample parameters
nobs = 800 # sample size
C = 0.858 # first stage coefficient
sgm_V = sqrt(0.81) # first stage error
tau_X <- c(-0.5, 0.5) + 1 # second stage effects

# Sample controls and instrument
X <- sample(1:2, nobs, replace = T)
Z <- model.matrix(~ 0 + as.factor(sample(1:20, nobs, replace = T)):as.factor(X))
Z <- Z %*% c(1:ncol(Z))

# Create the low-dimensional latent instrument
Z0 <- Z %% 2 # underlying latent instrument

# Draw first and second stage errors
U_V <- matrix(rnorm(2 * nobs, 0, 1), nobs, 2) %*%
  chol(matrix(c(1, 0.6, 0.6, sgm_V), 2, 2))

# Draw treatment and outcome variables
D <- Z0 * C + U_V[, 2]
y <- D * tau_X[X] + U_V[, 1]
In the generated sample, the observed instrument takes 40 values with varying numbers of observations per instrument value. Using only the observed instrument Z, the goal is to estimate the in-sample average treatment effect:
mean(tau_X[X])
## [1] 1.0325
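To make the target quantity concrete, the sketch below reproduces the printed value from a hypothetical control vector `X_toy`. The counts of 374 and 426 observations are an assumption inferred from the printed mean of 1.0325 with effects 0.5 and 1.5; they are not taken from the simulated data itself. The sketch shows that indexing `tau_X` by the control and averaging is the same as weighting each effect by the sample share of its control value.

```r
tau_X <- c(-0.5, 0.5) + 1  # effects for X = 1 and X = 2, as in the text

# Hypothetical control vector with the shares implied by the printed value:
# 374 observations with X = 1 and 426 with X = 2 (assumed; order is irrelevant)
X_toy <- rep(1:2, times = c(374, 426))

# Indexing tau_X by X assigns each observation its own effect
ate_direct <- mean(tau_X[X_toy])

# Equivalent: weight each effect by the sample share of its X value
shares <- as.vector(table(X_toy)) / length(X_toy)
ate_weighted <- sum(shares * tau_X)

print(c(direct = ate_direct, weighted = ate_weighted))
```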
The code snippet below estimates CIV where the first stage is restricted to K=2 support points. The AER package is used to compute heteroskedasticity-robust standard errors.
# Compute CIV with K=2 and conduct inference
civ_fit <- civ(y = y, D = D, Z = Z, X = as.factor(X), K = 2)
civ_res <- summary(civ_fit, vcov = vcovHC(civ_fit$iv_fit, type = "HC1"))
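The CIV estimate and the corresponding standard error are shown below. The t-val. entry is the absolute t-statistic for the null hypothesis that the coefficient equals the in-sample average treatment effect. Because it is below 1.96, the associated 95% confidence interval covers the true effect.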
c(Estimate = civ_res$coef[2, 1], "Std. Error" = civ_res$coef[2, 2],
"t-val." = abs(civ_res$coef[2, 1]-mean(tau_X[X]))/civ_res$coef[2, 2])
## Estimate Std. Error t-val.
## 1.0063143 0.1086868 0.2409285
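The link between the t-value and interval coverage can be checked by hand. The sketch below plugs in the rounded numbers printed above, under the names `est`, `se`, and `ate`, which are chosen here for illustration. It builds a two-sided 95% interval from the normal approximation, an assumption of this sketch.

```r
est <- 1.0063143  # CIV estimate printed above
se  <- 0.1086868  # heteroskedasticity-robust standard error printed above
ate <- 1.0325     # in-sample average treatment effect printed earlier

# Two-sided 95% interval based on the normal approximation
crit <- qnorm(0.975)
ci <- c(lower = est - crit * se, upper = est + crit * se)

# Absolute t-statistic for the null that the coefficient equals the ATE
t_val <- abs(est - ate) / se

# The interval covers the ATE exactly when the t-statistic is below crit
covers <- ci[["lower"]] <= ate && ate <= ci[["upper"]]

print(round(ci, 3))
print(c(t_val = round(t_val, 4), covers = covers))
```

The interval runs from roughly 0.79 to 1.22 and therefore contains 1.0325.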
CIV uses a K-Conditional-Means (KCMeans) estimator in a first step to estimate the optimal instrument. To understand the estimated mapping of observed instruments to the support points of the latent instrument, it is useful to print the cluster_map attribute of the first-stage kcmeans_fit object (see also kcmeans for details). The code snippet below prints the results for the first 10 values of the instrument. Here, x denotes the value of the observed instrument while cluster_x denotes the association with the estimated optimal instrument.
t(head(civ_fit$kcmeans_fit$cluster_map[, c(1, 4)], 10))
## [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## x 26 20 10 32 23 12 7 25 33 21
## cluster_x 1 1 1 1 2 1 2 2 2 2
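The idea behind such a mapping can be sketched in base R for the special case K=2 without controls. This is a simplified illustration on hypothetical toy data, not the algorithm implemented in the kcmeans package, which may differ. The names `z`, `z0`, `d`, `wss`, and `cluster_map_toy` are illustrative. The sketch computes the conditional mean of the treatment for each observed instrument value, sorts these means, and picks the split that minimizes the weighted within-cluster sum of squares. An exhaustive search over split points suffices here because an optimal two-cluster partition of scalar means is contiguous in sorted order.

```r
set.seed(2)  # arbitrary seed for this toy example
n <- 2000
z <- sample(1:10, n, replace = TRUE)  # observed categorical instrument
z0 <- z %% 2                          # hypothetical latent binary instrument
d <- 0.8 * z0 + rnorm(n, sd = 0.5)    # treatment depends on z only through z0

# Step 1: conditional mean of d and cell size for each observed value of z
m_z <- tapply(d, z, mean)
n_z <- as.vector(table(z))
ord <- order(m_z)
m_s <- as.vector(m_z)[ord]
n_s <- n_z[ord]

# Step 2: try every split of the sorted means and keep the one with the
# smallest weighted within-cluster sum of squares
wss <- function(m, w) sum(w * (m - weighted.mean(m, w))^2)
cuts <- seq_len(length(m_s) - 1)
loss <- sapply(cuts, function(k) {
  wss(m_s[1:k], n_s[1:k]) + wss(m_s[-(1:k)], n_s[-(1:k)])
})
k_best <- cuts[which.min(loss)]

# Step 3: map each observed value to its cluster (1 = lower mean)
cluster_map_toy <- data.frame(
  x = as.integer(names(m_z))[ord],
  cluster_x = rep(1:2, times = c(k_best, length(m_s) - k_best))
)
print(cluster_map_toy[order(cluster_map_toy$x), ], row.names = FALSE)
```

In this toy design the even instrument values fall into the low-mean cluster and the odd values into the high-mean cluster, so the estimated mapping recovers the latent instrument up to labeling.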
Wiemann T (2023). “Optimal Categorical Instruments.” https://arxiv.org/abs/2311.17021