
Get Started

This article is a brief introduction to civ.

library(civ)
library(AER)
set.seed(517938)

To illustrate civ with a simple example, consider the data generating process from the simulation in Wiemann (2023). The code snippet below draws a sample of size \(n=800\).

# Set seed
set.seed(51944)
# Sample parameters
nobs <- 800 # sample size
C <- 0.858 # first stage coefficient
sgm_V <- sqrt(0.81) # first stage error
tau_X <- c(-0.5, 0.5) + 1 # second stage effects
# Sample controls and instrument
X <- sample(1:2, nobs, replace = TRUE)
Z <- model.matrix(~ 0 + as.factor(sample(1:20, nobs, replace = TRUE)):as.factor(X))
Z <- Z %*% c(1:ncol(Z))
# Create the low-dimensional latent instrument
Z0 <- Z %% 2 # underlying latent instrument
# Draw first and second stage errors
U_V <- matrix(rnorm(2 * nobs, 0, 1), nobs, 2) %*%
  chol(matrix(c(1, 0.6, 0.6, sgm_V), 2, 2))
# Draw treatment and outcome variables
D <- Z0 * C + U_V[, 2]
y <- D * tau_X[X] + U_V[, 1]
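
The way Z is built above (indicator columns for each factor combination, then a product with the column index) can be seen on a much smaller example. The sketch below uses hypothetical inputs a and b, not the vignette's data, and only base R. It assumes every combination of the two factors is observed, so each row of the indicator matrix selects exactly one column and the resulting code is simply that column's index.

```r
# Hypothetical toy inputs: 3 categories crossed with 2 control values
grid <- expand.grid(a = 1:3, b = 1:2)  # each (a, b) combination appears once

# One indicator column per combination of the two factors
M <- model.matrix(~ 0 + as.factor(a):as.factor(b), data = grid)

# Multiplying by the column index turns each row's indicator into an integer code
code <- as.vector(M %*% seq_len(ncol(M)))

# Parity of the code, analogous to how Z0 is derived from Z
code_parity <- code %% 2

print(cbind(grid, code = code, code_parity = code_parity))
```

With 20 categories and 2 control values, the same construction gives up to 40 distinct codes.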

In the generated sample, the observed instrument takes 40 values (20 categories for each of the two values of X), with varying numbers of observations per value. Using only the observed instrument Z, the goal is to estimate the in-sample average treatment effect:

mean(tau_X[X])
## [1] 1.0325
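
As a quick arithmetic check, using only the values defined above: since tau_X equals (0.5, 1.5), the in-sample average treatment effect is a share-weighted average of the two effects. Here \(n_1\) and \(n_2\) are notation introduced for this note and denote the number of observations with X equal to 1 and 2, respectively.

\[
\frac{1}{n}\sum_{i=1}^{n} \tau_{X_i} \;=\; 0.5\,\frac{n_1}{n} + 1.5\,\frac{n_2}{n} \;=\; 0.5 + \frac{n_2}{n}.
\]

With \(n = 800\), the reported value of 1.0325 corresponds to \(n_2/n = 0.5325\), that is, \(n_2 = 426\) observations with X equal to 2.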

The code snippet below estimates CIV with the first stage restricted to K=2 support points. The AER package is loaded so that vcovHC can be used to compute heteroskedasticity-robust standard errors.

# Compute CIV with K=2 and conduct inference
civ_fit <- civ(y = y, D = D, Z = Z, X = as.factor(X), K = 2)
civ_res <- summary(civ_fit, vcov = vcovHC(civ_fit$iv_fit, type = "HC1"))

The CIV estimate and its standard error are shown below. The t-value is computed against the true in-sample effect. Because it is below 1.96 in absolute value, the associated 95% confidence interval covers the true effect.

c(Estimate = civ_res$coef[2, 1], "Std. Error" = civ_res$coef[2, 2],
  "t-val." = abs(civ_res$coef[2, 1]-mean(tau_X[X]))/civ_res$coef[2, 2])
##   Estimate Std. Error     t-val. 
##  1.0063143  0.1086868  0.2409285
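
The coverage statement can also be checked directly. The sketch below is base R only and copies the rounded estimate, standard error, and true effect from the output above. It assumes a normal approximation for the 95% interval, and the variable names are introduced here for illustration.

```r
# Values copied (rounded) from the printed output above
estimate <- 1.0063143
std_error <- 0.1086868
true_effect <- 1.0325

# Two-sided 95% critical value under a normal approximation (about 1.96)
crit <- qnorm(0.975)

# Confidence interval and the t-value against the true effect
ci <- c(lower = estimate - crit * std_error,
        upper = estimate + crit * std_error)
t_val <- abs(estimate - true_effect) / std_error

# The interval covers the true effect exactly when t_val <= crit
covers <- ci[["lower"]] <= true_effect && true_effect <= ci[["upper"]]
print(round(ci, 3))
print(covers)
```

The interval runs from roughly 0.79 to 1.22, which contains 1.0325.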

CIV first uses a K-Conditional-Means (KCMeans) estimator to estimate the optimal instrument. To see how observed instrument values are mapped to the support points of the latent instrument, print the cluster_map attribute of the first-stage kcmeans_fit object (see also kcmeans for details). The code snippet below prints the results for the first 10 values of the instrument. Here, x denotes the value of the observed instrument, while cluster_x denotes the support point of the estimated optimal instrument to which it is assigned.

t(head(civ_fit$kcmeans_fit$cluster_map[, c(1, 4)], 10))
##           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
## x           26   20   10   32   23   12    7   25   33    21
## cluster_x    1    1    1    1    2    1    2    2    2     2
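
To build intuition for such a map, the sketch below groups instrument categories by their first-stage conditional means. It is not the kcmeans implementation: as a stand-in, it applies base R's kmeans to the per-category means of the treatment, ignores category sizes and controls, and uses a hypothetical toy sample (10 categories, 200 observations each) rather than the data above. All variable names are introduced here for illustration.

```r
set.seed(1)  # arbitrary seed for this illustration

# Hypothetical toy sample: 10 instrument categories, 200 observations each
n_cat <- 10
n_per <- 200
z <- rep(seq_len(n_cat), each = n_per)
z0 <- z %% 2                              # latent binary instrument (parity)
d <- z0 * 1 + rnorm(length(z), sd = 0.5)  # treatment with a first-stage shift of 1

# Conditional mean of the treatment within each observed category
cat_means <- tapply(d, z, mean)

# Stand-in for the clustering step: 1-D k-means on the category means, K = 2
km <- kmeans(as.numeric(cat_means), centers = 2, nstart = 10)

# Map from observed category to estimated support point (labels are arbitrary)
cluster_map_sketch <- data.frame(x = as.integer(names(cat_means)),
                                 cluster_x = km$cluster)
print(t(cluster_map_sketch))
```

In this toy setup, categories with the same parity should end up in the same cluster, mirroring how cluster_x groups the values of x above. The cluster labels themselves carry no meaning.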

References

Wiemann T (2023). “Optimal Categorical Instruments.” https://arxiv.org/abs/2311.17021
