This vignette demonstrates how to use negative control outcomes to screen for residual confounding and compute a corresponding sensitivity bound using the causaldef package. Negative controls provide an empirical diagnostic for whether your adjustment strategy may have failed to remove confounding.
A negative control outcome (\(Y'\)) is a variable that:
The key insight is:
If your adjustment strategy correctly removes confounding, then the residual association between \(A\) and \(Y'\) should be zero, up to sampling variability.
If you observe a clearly non-zero association between \(A\) and \(Y'\) after adjustment, this suggests that confounding remains and your causal estimates may be biased.
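The key insight can be illustrated with a small base-R simulation. This is only a sketch under stated assumptions, not causaldef code: the data-generating process, the variable names, and the use of linear regression as the adjustment method are all illustrative choices. The confounder \(U\) is treated as observed here purely so that a correct adjustment is possible.

```r
set.seed(123)
n <- 2000

U    <- rnorm(n)                               # confounder (observed in this toy example)
A    <- rbinom(n, 1, plogis(0.3 + 0.8 * U))    # treatment depends on U
Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)     # negative control: depends on U, not on A

# Association between A and Y_nc without any adjustment (confounded by U)
assoc_unadjusted <- unname(coef(lm(Y_nc ~ A))["A"])

# Association between A and Y_nc after adjusting for the confounder
assoc_adjusted <- unname(coef(lm(Y_nc ~ A + U))["A"])

round(c(unadjusted = assoc_unadjusted, adjusted = assoc_adjusted), 3)
```

In this setup the unadjusted coefficient should be clearly away from zero, while the adjusted coefficient should be close to zero, because \(A\) has no effect on the negative control once \(U\) is accounted for.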
The causaldef package combines two ingredients: a screening test for residual association between \(A\) and \(Y'\), and a sensitivity bound (thm:nc_bound):

\[\delta(\hat{K}) \leq \kappa \cdot \delta_{NC}(\hat{K})\]
where:

- \(\delta(\hat{K})\) is the true deficiency (what we want to know)
- \(\delta_{NC}(\hat{K})\) is a negative-control association proxy (what we can measure)
- \(\kappa\) is an alignment constant reflecting how well \(Y'\) proxies for \(Y\)’s confounding
Let’s create a dataset where we have:

- An unmeasured confounder \(U\)
- An observed covariate \(W\) (correlated with \(U\))
- Binary treatment \(A\)
- Outcome \(Y\) affected by \(A\) and \(U\)
- Negative control \(Y'\) affected only by \(U\) (not \(A\))
```r
library(causaldef)

set.seed(42)
n <- 500

# Unmeasured confounder
U <- rnorm(n)

# Observed covariate (partially captures U)
W <- 0.7 * U + rnorm(n, sd = 0.5)

# Treatment assignment (depends directly on U; W is only a noisy proxy for U)
ps_true <- plogis(0.3 + 0.8 * U)
A <- rbinom(n, 1, ps_true)

# True causal effect
beta_true <- 2.0

# Outcome (affected by A and U)
Y <- 1 + beta_true * A + 1.5 * U + rnorm(n)

# Negative control outcome (affected by U only, NOT by A)
Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)

# Create data frame (U is deliberately left out: it is unmeasured)
df <- data.frame(W = W, A = A, Y = Y, Y_nc = Y_nc)
```

We specify the causal problem, including the negative control:
```r
spec <- causal_spec(
  data = df,
  treatment = "A",
  outcome = "Y",
  covariates = "W",
  negative_control = "Y_nc"
)
#> ✔ Created causal specification: n=500, 1 covariate(s)

print(spec)
#>
#> -- Causal Specification --------------------------------------------------
#>
#> * Treatment: A ( binary )
#> * Outcome: Y ( continuous )
#> * Covariates: W
#> * Sample size: 500
#> * Estimand: ATE
#> * Negative control: Y_nc
```

Now we test whether our IPTW adjustment successfully removes confounding:
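A minimal sketch of the screening call, assuming the same `nc_diagnostic(spec, method = ..., n_boot = ...)` form that appears later in this vignette. The object name `screening` and the choice of `n_boot = 100` are assumptions for illustration. The guard skips the call when causaldef is not installed or `spec` has not been created.

```r
# Continues from the `spec` object created above.
# Skipped when causaldef is not installed or `spec` does not exist.
screening <- NULL
if (requireNamespace("causaldef", quietly = TRUE) && exists("spec")) {
  # IPTW-based negative control screening (n_boot = 100 is an illustrative choice)
  screening <- causaldef::nc_diagnostic(spec, method = "iptw", n_boot = 100)
  print(screening)
}
```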
The diagnostic returns a result object (called `screening` here) with these fields:

- `statistic`: Weighted residual association between \(A\) and \(Y'\) after adjustment
- `p_value`: Permutation p-value for that residual association
- `delta_nc`: The observed negative-control association proxy
- `delta_bound`: Upper bound on true deficiency (\(\kappa \times \delta_{NC}\))
- `falsified`: Whether the residual-association screening test rejects

If \(W\) fully captures \(U\), the negative control test will NOT falsify:
When \(W\) is a poor proxy for \(U\), falsification occurs:
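Both scenarios can be sketched in base R without the package. This is an illustrative stand-in, not the causaldef implementation: the helper names `weighted_nc_assoc()` and `nc_perm_test()` are hypothetical, the statistic is a simple IPTW-weighted difference in mean \(Y'\), and the permutation scheme (shuffling treatment labels and refitting the propensity model) is a simplifying assumption that may differ from what `nc_diagnostic()` does internally.

```r
set.seed(2024)
n <- 2000

U    <- rnorm(n)
A    <- rbinom(n, 1, plogis(0.3 + 0.8 * U))
Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)

W_full <- U                      # covariate that fully captures U
W_poor <- 0.2 * U + rnorm(n)     # covariate that is mostly noise

# IPTW-weighted difference in mean Y_nc between treatment arms
weighted_nc_assoc <- function(a, y_nc, w) {
  ps <- fitted(glm(a ~ w, family = binomial()))
  wt <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))
  weighted.mean(y_nc[a == 1], wt[a == 1]) -
    weighted.mean(y_nc[a == 0], wt[a == 0])
}

# Permutation test: shuffle treatment labels to approximate the null distribution
nc_perm_test <- function(a, y_nc, w, n_perm = 200) {
  observed <- weighted_nc_assoc(a, y_nc, w)
  permuted <- replicate(n_perm, weighted_nc_assoc(sample(a), y_nc, w))
  p_value  <- (1 + sum(abs(permuted) >= abs(observed))) / (n_perm + 1)
  list(statistic = observed, p_value = p_value)
}

res_full <- nc_perm_test(A, Y_nc, W_full)
res_poor <- nc_perm_test(A, Y_nc, W_poor)

str(res_full)
str(res_poor)
```

With the fully informative covariate, the weighted association should sit near zero. With the noisy proxy, most of the confounding remains, so the statistic should be far from zero and the permutation p-value small.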
The best negative control outcomes have:
| Domain | Treatment | Outcome | Possible Negative Control |
|---|---|---|---|
| Cardiovascular | Statin use | CVD events | Accidental injuries |
| Oncology | Chemotherapy | Tumor response | Hospital-acquired infections |
| Economics | Job training | Earnings in 1978 | Earnings in 1974 (pre-treatment) |
| Epidemiology | Vaccination | Flu incidence | Unrelated disease incidence |
The negative control diagnostic complements deficiency estimation:
```r
# Step 1: Estimate deficiency
def_results <- estimate_deficiency(
  spec,
  methods = c("unadjusted", "iptw", "aipw"),
  n_boot = 100
)
print(def_results)

# Step 2: Run negative control diagnostic on best method
best_method <- names(which.min(def_results$estimates))
nc_check <- nc_diagnostic(spec, method = best_method, n_boot = 100)

# Step 3: Compute policy bounds if assumptions not falsified
if (!nc_check$falsified) {
  bounds <- policy_regret_bound(
    def_results,
    utility_range = c(-5, 10),
    method = best_method
  )
  print(bounds)
} else {
  warning("Causal assumptions falsified. Consider additional covariates.")
}
```

The alignment constant \(\kappa\) affects the bound’s tightness. The default \(\kappa = 1\) is conservative. You can estimate \(\kappa\) from domain knowledge:
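The effect of \(\kappa\) on the bound can be sketched directly from the relation \(\delta_{bound} = \kappa \times \delta_{NC}\). The helper `nc_bound()` below is hypothetical (it is not a causaldef function), and the value of `delta_nc_obs` is made up for illustration. In practice it would come from the `delta_nc` field of the diagnostic result.

```r
# Hypothetical helper (not part of causaldef): scale the observed proxy by kappa
nc_bound <- function(delta_nc, kappa = 1) {
  stopifnot(delta_nc >= 0, kappa > 0)
  kappa * delta_nc
}

delta_nc_obs <- 0.08          # made-up proxy value, for illustration only
kappas <- c(0.5, 1, 2)        # candidate alignment constants from domain knowledge

bounds <- nc_bound(delta_nc_obs, kappas)
data.frame(kappa = kappas, delta_bound = bounds)
```

A larger \(\kappa\) loosens the bound proportionally, so it helps to report the bound over a plausible range of \(\kappa\) values rather than a single choice.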
| Function or field | Purpose |
|---|---|
| `nc_diagnostic()` | Screen for residual association and compute a sensitivity bound |
| `delta_nc` | Observable negative-control association proxy |
| `delta_bound` | Upper bound on true deficiency |
| `falsified` | Screening rejection of residual association |
Negative control diagnostics provide a data-driven way to assess causal assumptions. Use them alongside deficiency estimation for robust causal inference.
Akdemir, D. (2026). Constraints on Causal Inference as Experiment Comparison. DOI: 10.5281/zenodo.18367347. See thm:nc_bound (Negative Control Sensitivity Bound).
Lipsitch, M., Tchetgen, E., & Cohen, T. (2010). Negative controls: A tool for detecting confounding and bias. Epidemiology, 21(3), 383-388.
Shi, X., Miao, W., & Tchetgen Tchetgen, E. (2020). A selective review of negative control methods. Current Epidemiology Reports, 7, 190-202.