fixest is one of the most widely used R packages for applied IV
estimation. This vignette shows the drop-in integration: fit an IV model
with feols(), pass it to iv_check(), and get
every applicable IV-validity test in one call.
If you are already using fixest for your paper, nothing
about your workflow changes. Add one line and your IV estimate now comes
with a published falsification test.
Card’s (1995) classic IV for the return to schooling uses proximity
to a four-year college as an instrument for completed schooling. The
bundled card1995 dataset is a cleaned extract from the
National Longitudinal Survey of Young Men.
library(fixest)
library(ivcheck)
data(card1995)
head(card1995[, c("lwage", "educ", "college", "near_college",
                  "age", "black", "south")])
#>      lwage educ college near_college age black south
#> 1 6.306275    7       0            0  29     1     0
#> 2 6.175867   12       0            0  27     0     0
#> 3 6.580639   12       0            0  34     0     0
#> 4 5.521461   11       0            1  27     0     0
#> 5 6.591674   12       0            1  34     0     0
#> 6 6.214608   12       0            1  26     0     0
Two variants are included: the continuous educ (years of
schooling) and a binary college indicator
(educ >= 16) for use with tests that require a binary
treatment.
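As a minimal sketch of how a binary indicator of this kind can be derived from years of schooling, the snippet below applies the educ >= 16 rule to a small made-up data frame. The toy data frame and its values are hypothetical; the bundled card1995 data already ships with college, so this is only an illustration of the rule.

```r
# Hypothetical toy data; card1995 already contains the college column.
toy <- data.frame(educ = c(7, 12, 16, 18))

# 1 if the respondent completed at least 16 years of schooling, else 0.
toy$college <- as.integer(toy$educ >= 16)

toy
```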
m <- feols(
  lwage ~ age + black + south | college ~ near_college,
  data = card1995
)
summary(m)
#> TSLS estimation - Dep. Var.: lwage
#> Endo. : college
#> Instr. : near_college
#> Second stage: Dep. Var.: lwage
#> Observations: 3,003
#> Standard-errors: IID
#>              Estimate Std. Error   t value   Pr(>|t|)
#> (Intercept)  4.941675   0.201548 24.518545  < 2.2e-16 ***
#> fit_college  1.899996   0.722022  2.631492 8.5446e-03 **
#> age          0.029114   0.006036  4.823564 1.4805e-06 ***
#> black        0.113946   0.137925  0.826142 4.0879e-01
#> south       -0.101832   0.045242 -2.250843 2.4468e-02 *
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> RMSE: 0.853163 Adj. R2: 0.208408
#> F-test (1st stage), college: stat = 7.3643, p = 0.006691, on 1 and 2,998 DoF.
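#> Wu-Hausman: stat = 28.0617, p = 1.26e-7 , on 1 and 2,997 DoF.
The endogenous variable college is instrumented by
near_college. The first-stage F is 7.36 (p = 0.007), below the
conventional rule-of-thumb threshold of 10, so the instrument is
statistically significant but on the weak side. The IV estimate
of the return to college is 1.90 log points with a standard error of
0.72, so it is imprecisely estimated.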
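To make the first-stage F and the two-stage logic concrete, here is a rough base-R sketch on simulated data. All variable names and coefficients are invented for illustration; this is not how fixest computes its results internally, and the standard errors from the hand-rolled second stage are not valid 2SLS standard errors.

```r
set.seed(123)
n <- 3000
z <- rbinom(n, 1, 0.5)          # binary instrument (simulated)
x <- rnorm(n)                   # exogenous control (simulated)
u <- rnorm(n)                   # unobserved confounder
d <- as.integer(0.3 * z + 0.2 * x + u + rnorm(n) > 0)  # endogenous treatment
y <- 1 + 0.5 * d + 0.3 * x + u + rnorm(n)              # outcome

# First stage with and without the excluded instrument.
first_stage <- lm(d ~ z + x)
restricted  <- lm(d ~ x)

# F-test of the excluded instrument; with one instrument this equals t^2.
f_stat <- anova(restricted, first_stage)$F[2]

# Second stage by hand: replace d with its first-stage fitted values.
d_hat     <- fitted(first_stage)
beta_2sls <- unname(coef(lm(y ~ d_hat + x))["d_hat"])

# Cross-check: just-identified IV after partialling the control out of z.
z_res   <- resid(lm(z ~ x))
beta_iv <- sum(z_res * y) / sum(z_res * d)

c(first_stage_F = f_stat, beta_2sls = beta_2sls, beta_iv = beta_iv)
```

With a single instrument the two point estimates coincide, which is why the manual second stage is a useful sanity check even though its standard errors should not be reported.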
chk <- iv_check(m, n_boot = 500, parallel = FALSE)
print(chk)
#>
#> ── IV validity diagnostic ──────────────────────────────────────────────────────
#> Kitagawa (2015): stat = "5.25", p = "0", reject
#> Mourifie-Wan (2017): stat = "5.25", p = "0", reject
#> Overall: at least one test rejects IV validity at 0.05.
iv_check() inspects the model, detects that
college is binary and near_college is a
discrete instrument, and runs Kitagawa (2015) and Mourifie-Wan (2017).
Both tests reject at the 0.05 level (stat = 5.25, p = 0), so in this
specification the testable implications of IV validity are violated and
the IV estimate above should be interpreted with caution.
If you want to run a single test rather than the full suite, each
function dispatches on fixest objects too:
iv_kitagawa(m, n_boot = 300, parallel = FALSE)
#>
#> ── Kitagawa (2015) ─────────────────────────────────────────────────────────────
#> Sample size: 3003
#> Statistic: "5.25", p-value: "0"
#> Verdict: reject IV validity at 0.05
The function extracts y, d, and
z from the fitted model (including the first stage) and
runs the test. You never touch the raw vectors.
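For intuition about what a test of this kind does with y, d, and z, the sketch below computes a simplified, unweighted positive-part Kolmogorov-Smirnov statistic over intervals of the outcome. It is an assumption-laden illustration and not the ivcheck implementation: it omits the variance weighting and bootstrap of Kitagawa (2015), ignores ties in y, and assumes the instrument weakly raises treatment take-up. The function names and the simulated data are hypothetical.

```r
# Largest net mass any interval of y can collect when each observation
# carries a signed weight w (maximum contiguous run in y-order).
max_interval_mass <- function(y, w) {
  h <- cumsum(w[order(y)])                  # running sum in y-order
  prev_min <- cummin(c(0, h[-length(h)]))   # lowest running sum before each point
  max(0, h - prev_min)                      # best run, floored at zero
}

kitagawa_sketch <- function(y, d, z) {
  n1 <- sum(z == 1)
  n0 <- sum(z == 0)
  p <- (z == 1) / n1   # empirical weights given z = 1
  q <- (z == 0) / n0   # empirical weights given z = 0
  # Validity implies P(B, d = 1) >= Q(B, d = 1) and Q(B, d = 0) >= P(B, d = 0)
  # for every interval B, so only positive parts count as violations.
  v1 <- max_interval_mass(y, (q - p) * (d == 1))
  v0 <- max_interval_mass(y, (p - q) * (d == 0))
  sqrt(n0 * n1 / (n0 + n1)) * max(v1, v0)
}

set.seed(1)
n <- 2000
z <- rbinom(n, 1, 0.5)
d <- as.integer(runif(n) < 0.3 + 0.3 * z)  # z only ever raises take-up
y_valid   <- 1 + 0.5 * d + rnorm(n)        # z affects y only through d
y_invalid <- y_valid + 2 * z               # z also shifts y directly

stat_valid   <- kitagawa_sketch(y_valid, d, z)
stat_invalid <- kitagawa_sketch(y_invalid, d, z)
c(valid = stat_valid, invalid = stat_invalid)
```

In the simulated design where the instrument shifts the outcome directly, the statistic is much larger than in the valid design, which is the pattern the test is built to detect.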
k <- iv_kitagawa(m, n_boot = 500, parallel = FALSE)
hist(k$boot_stats, breaks = 40,
     main = "Kitagawa bootstrap distribution (Card 1995)",
     xlab = "sqrt(n) * positive-part KS")
abline(v = k$statistic, col = "red", lwd = 2)
The observed statistic (red line) lies to the right of the bulk of the
bootstrap distribution, consistent with the rejection reported above
(p = 0).
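A bootstrap p-value of this kind is commonly the share of bootstrap draws at least as large as the observed statistic. The sketch below illustrates that convention with a made-up vector standing in for k$boot_stats; whether ivcheck uses exactly this formula is an assumption, and the simulated draws are not the package's output.

```r
set.seed(42)
boot_stats <- abs(rnorm(500))  # hypothetical stand-in for k$boot_stats
statistic  <- 5.25             # observed statistic, as printed in the output

# Share of bootstrap draws at least as extreme as the observed statistic.
p_value <- mean(boot_stats >= statistic)
p_value
```

Under this convention, a reported p-value of 0 means the observed statistic exceeded every bootstrap draw, so the resolution of the p-value is limited by n_boot.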
modelsummary
If you have modelsummary installed,
iv_check results are picked up automatically through
a broom::glance method registered on package load. This lets you put
a validity p-value directly in a regression table footer.
In your paper’s replication code:
library(fixest)
library(ivcheck)
# ... data loading ...
# IV estimate
m <- feols(y ~ controls | d ~ z, data = df)
# IV validity diagnostic
chk <- iv_check(m)
# Report both in the paper
knitr::kable(chk$table)
Three lines of code, a falsification test the referee is almost
guaranteed to ask about, and a citation-ready result. That is the whole
point of ivcheck.
Card, D. (1995). Using Geographic Variation in College Proximity to Estimate the Return to Schooling.
Kitagawa, T. (2015). A Test for Instrument Validity. Econometrica 83(5): 2043-2063.