Negative Control Diagnostics in causaldef

Deniz Akdemir

2026-03-26

Introduction

This vignette demonstrates how to use negative control outcomes to screen for residual confounding and compute a corresponding sensitivity bound using the causaldef package. Negative controls provide an empirical diagnostic for whether your adjustment strategy may have failed to remove confounding.

Theoretical Background

What is a Negative Control Outcome?

A negative control outcome ($Y'$) is a variable that:

Shares confounders with the true outcome $Y$ — it is affected by the same unmeasured variables $U$ that confound the treatment-outcome relationship
Is NOT causally affected by treatment $A$ — the true causal effect of $A$ on $Y'$ is zero

The Diagnostic Logic

The key insight is:

If your adjustment strategy correctly removes confounding, then the residual association between $A$ and $Y'$ should be zero, up to sampling variability.

If you observe an association between $A$ and $Y'$ after adjustment that is distinguishable from zero, this suggests that confounding remains and that your causal estimates may be biased.
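The logic can be illustrated with a small base-R simulation. This is only a sketch under assumed values: the variable names and coefficients below are hypothetical and do not come from causaldef. It compares the association between treatment and a negative control with and without adjustment for the confounder.

```r
# Illustrative simulation; names and coefficients are assumptions, not causaldef API
set.seed(1)
n <- 2000
u <- rnorm(n)                          # confounder
a <- rbinom(n, 1, plogis(0.8 * u))     # treatment depends on u
y_nc <- 1.2 * u + rnorm(n, sd = 0.8)   # negative control: no effect of a

# Association between a and y_nc without any adjustment
coef_unadjusted <- unname(coef(lm(y_nc ~ a))["a"])

# Association after adjusting for the confounder
coef_adjusted <- unname(coef(lm(y_nc ~ a + u))["a"])

round(c(unadjusted = coef_unadjusted, adjusted = coef_adjusted), 3)
```

The unadjusted coefficient should be clearly positive even though $A$ has no effect on $Y'$, while the adjusted coefficient should be close to zero.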

Negative Control Sensitivity Bound (manuscript `thm:nc_bound`)

The causaldef package combines two ingredients:

a screening test for residual association between treatment and the negative control after adjustment, and
the manuscript’s negative control sensitivity bound (thm:nc_bound):

\[\delta(\hat{K}) \leq \kappa \cdot \delta_{NC}(\hat{K})\]

where:

$\delta(\hat{K})$ is the true deficiency (what we want to know)
$\delta_{NC}(\hat{K})$ is a negative-control association proxy (what we can measure)
$\kappa$ is an alignment constant reflecting how well $Y'$ proxies for $Y$’s confounding

Practical Example

Simulating Data with a Negative Control

Let’s create a dataset where we have:

An unmeasured confounder $U$
An observed covariate $W$ (correlated with $U$)
Binary treatment $A$
Outcome $Y$ affected by $A$ and $U$
Negative control $Y'$ affected only by $U$ (not $A$)

library(causaldef)
set.seed(42)

n <- 500

# Unmeasured confounder
U <- rnorm(n)

# Observed covariate (partially captures U)
W <- 0.7 * U + rnorm(n, sd = 0.5)

# Treatment assignment (depends on U directly; W is only a noisy proxy for U)
ps_true <- plogis(0.3 + 0.8 * U)
A <- rbinom(n, 1, ps_true)

# True causal effect
beta_true <- 2.0

# Outcome (affected by A and U)
Y <- 1 + beta_true * A + 1.5 * U + rnorm(n)

# Negative control outcome (affected by U only, NOT by A)
Y_nc <- 0.5 + 1.2 * U + rnorm(n, sd = 0.8)

# Create data frame
df <- data.frame(W = W, A = A, Y = Y, Y_nc = Y_nc)
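Before fitting anything, it can help to check that the simulated data behave as intended. The base-R sketch below re-creates the relevant variables with the same seed so that it runs on its own, then looks at how well $W$ tracks $U$ and how far a naive comparison lands from the true effect. The helper names `cor_WU`, `naive`, and `adj_W` are introduced here for illustration only.

```r
# Re-create the simulated data so this check is self-contained
set.seed(42)
n <- 500
U <- rnorm(n)
W <- 0.7 * U + rnorm(n, sd = 0.5)
A <- rbinom(n, 1, plogis(0.3 + 0.8 * U))
beta_true <- 2.0
Y <- 1 + beta_true * A + 1.5 * U + rnorm(n)

# How much of U does W capture? (theoretical correlation is about 0.81)
cor_WU <- cor(W, U)

# Naive difference in means ignores confounding by U
naive <- mean(Y[A == 1]) - mean(Y[A == 0])

# Regression adjustment for W alone; expected to remove only part of the bias
adj_W <- unname(coef(lm(Y ~ A + W))["A"])

round(c(cor_WU = cor_WU, naive = naive, adj_W = adj_W, truth = beta_true), 2)
```

Because $W$ is an imperfect proxy for $U$, some confounding is expected to remain after adjusting for $W$, which is the situation the negative control diagnostic is meant to detect.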

Creating the Causal Specification

We specify the causal problem including the negative control:

spec <- causal_spec(
  data = df,
  treatment = "A",
  outcome = "Y",
  covariates = "W",
  negative_control = "Y_nc"
)
#> ✔ Created causal specification: n=500, 1 covariate(s)

print(spec)
#> 
#> -- Causal Specification --------------------------------------------------
#> 
#> * Treatment: A ( binary )
#> * Outcome: Y ( continuous )
#> * Covariates: W 
#> * Sample size: 500 
#> * Estimand: ATE 
#> * Negative control: Y_nc

Running the Negative Control Diagnostic

Now we test whether our IPTW adjustment successfully removes confounding:

nc_result <- nc_diagnostic(
  spec,
  method = "iptw",
  alpha = 0.05,
  n_boot = 200
)

print(nc_result)

Interpreting the Results

The diagnostic returns:

screening$statistic: Weighted residual association between $A$ and $Y'$ after adjustment
p_value: Permutation p-value for that residual association
delta_nc: The observed negative-control association proxy
delta_bound: Upper bound on true deficiency ($\kappa \times \delta_{NC}$)
falsified: Whether the residual-association screening test rejects
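To make these quantities concrete, here is a hand-rolled base-R sketch of an IPTW-weighted residual association and a permutation p-value. It is not the causaldef implementation: the helper `weighted_nc_diff()`, the choice to shuffle treatment labels, and all simulated values are assumptions made for illustration, and the package's exact statistic and permutation scheme may differ.

```r
# Hand-rolled sketch; NOT the causaldef implementation
set.seed(7)
n <- 1000
u <- rnorm(n)                           # unmeasured confounder
w <- rnorm(n)                           # covariate carrying no information about u
a <- rbinom(n, 1, plogis(0.8 * u))
y_nc <- 1.2 * u + rnorm(n, sd = 0.8)    # negative control, unaffected by a

# IPTW-weighted difference in mean y_nc between treated and control units
weighted_nc_diff <- function(a, y_nc, x) {
  ps <- fitted(glm(a ~ x, family = binomial()))   # estimated propensity scores
  wt <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))      # inverse probability weights
  weighted.mean(y_nc[a == 1], wt[a == 1]) -
    weighted.mean(y_nc[a == 0], wt[a == 0])
}

stat_obs <- weighted_nc_diff(a, y_nc, w)

# Null distribution: shuffle treatment labels, refit the weights, recompute
n_perm <- 199
stat_perm <- replicate(n_perm, weighted_nc_diff(sample(a), y_nc, w))

# Two-sided permutation p-value with the usual +1 correction
p_value <- (1 + sum(abs(stat_perm) >= abs(stat_obs))) / (n_perm + 1)
falsified <- p_value < 0.05
```

Here the covariate is uninformative, so the weighted association stays far from the permutation distribution and the sketch flags residual confounding.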

Scenarios

Scenario 1: Adjustment Succeeds

If $W$ fully captures $U$, the negative control test should not falsify, apart from false rejections at the chosen level `alpha`:

# When W = U (no unmeasured confounding)
df_full <- df
df_full$W <- U  # Perfect proxy

spec_full <- causal_spec(
  df_full, "A", "Y", "W", negative_control = "Y_nc"
)

nc_full <- nc_diagnostic(spec_full, method = "iptw", n_boot = 100)
print(nc_full)
# Expect: falsified = FALSE

Scenario 2: Adjustment Fails

When $W$ carries little or no information about $U$, the screening test is expected to falsify:

# When W is noise (no information about U)
df_bad <- df
df_bad$W <- rnorm(n)  # Useless proxy

spec_bad <- causal_spec(
  df_bad, "A", "Y", "W", negative_control = "Y_nc"
)

nc_bad <- nc_diagnostic(spec_bad, method = "iptw", n_boot = 100)
print(nc_bad)
# Expect: falsified = TRUE
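The two scenarios can be placed on a single scale by computing a residual association by hand for covariates of varying quality. The sketch below uses base R only. The helper `weighted_nc_diff()` and the three covariate choices are hypothetical, and the statistic is a simple weighted mean difference rather than whatever `nc_diagnostic()` computes internally.

```r
# Illustrative comparison; helper and covariates are assumptions, not causaldef API
set.seed(11)
n <- 4000
u <- rnorm(n)
a <- rbinom(n, 1, plogis(0.3 + 0.8 * u))
y_nc <- 0.5 + 1.2 * u + rnorm(n, sd = 0.8)

# IPTW-weighted difference in mean y_nc between treated and control units
weighted_nc_diff <- function(a, y_nc, x) {
  ps <- fitted(glm(a ~ x, family = binomial()))
  wt <- ifelse(a == 1, 1 / ps, 1 / (1 - ps))
  weighted.mean(y_nc[a == 1], wt[a == 1]) -
    weighted.mean(y_nc[a == 0], wt[a == 0])
}

# Three candidate adjustment covariates of decreasing quality
covariates <- list(
  perfect = u,                              # Scenario 1: W equals U
  partial = 0.7 * u + rnorm(n, sd = 0.5),   # noisy proxy, as in the main example
  noise   = rnorm(n)                        # Scenario 2: W unrelated to U
)

residual_assoc <- sapply(covariates, function(x) weighted_nc_diff(a, y_nc, x))
round(residual_assoc, 3)
```

The residual association should be near zero for the perfect covariate and largest for the pure-noise covariate, with the noisy proxy in between.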

Choosing Good Negative Control Outcomes

Ideal Properties

The best negative control outcomes have:

Strong confounding alignment: $Y'$ shares the same unmeasured confounders as $Y$
Zero treatment effect: No plausible mechanism by which $A$ affects $Y'$
Measurable: Available in your dataset

Examples by Domain

Domain	Treatment	Outcome	Possible Negative Control
Cardiovascular	Statin use	CVD events	Accidental injuries
Oncology	Chemotherapy	Tumor response	Hospital-acquired infections
Economics	Job training	Earnings in 1978	Earnings in 1974 (pre-treatment)
Epidemiology	Vaccination	Flu incidence	Unrelated disease incidence

Combining with Deficiency Estimation

The negative control diagnostic complements deficiency estimation:

# Step 1: Estimate deficiency
def_results <- estimate_deficiency(
  spec,
  methods = c("unadjusted", "iptw", "aipw"),
  n_boot = 100
)

print(def_results)

# Step 2: Run negative control diagnostic on best method
best_method <- names(which.min(def_results$estimates))
nc_check <- nc_diagnostic(spec, method = best_method, n_boot = 100)

# Step 3: Compute policy bounds if assumptions not falsified
if (!nc_check$falsified) {
  bounds <- policy_regret_bound(
    def_results,
    utility_range = c(-5, 10),
    method = best_method
  )
  print(bounds)
} else {
  warning("Causal assumptions falsified. Consider additional covariates.")
}

Advanced: Estimating Kappa

The alignment constant $\kappa$ affects the bound’s tightness. The default $\kappa = 1$ is conservative. Because $\kappa$ is not estimated from the data, you set it from domain knowledge:

# If you believe Y' has 80% of Y's confounding structure:
nc_tight <- nc_diagnostic(
  spec,
  method = "iptw",
  kappa = 0.8,
  n_boot = 100
)

print(nc_tight)
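Since the bound is $\kappa \times \delta_{NC}$, it scales linearly in $\kappa$. The short sketch below shows this arithmetic for a hypothetical value of `delta_nc`. The number 0.05 is an assumption chosen for illustration, not output from `nc_diagnostic()`.

```r
# Hypothetical proxy value; NOT taken from an nc_diagnostic() run
delta_nc <- 0.05

# Candidate alignment constants reflecting different domain beliefs
kappa <- c(0.5, 0.8, 1, 1.5)

# Sensitivity bound: delta_bound = kappa * delta_nc
delta_bound <- kappa * delta_nc

data.frame(kappa = kappa, delta_bound = delta_bound)
```

Values of $\kappa$ below 1 tighten the bound relative to the default, and values above 1 loosen it, so the choice should be justified by subject-matter reasoning rather than by the bound one would like to obtain.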

Summary

Function or result field	Purpose
`nc_diagnostic()`	Screen for residual association and compute a sensitivity bound
`delta_nc`	Observable negative-control association proxy
`delta_bound`	Upper bound on true deficiency
`falsified`	Screening rejection of residual association

Negative control diagnostics provide a data-driven way to assess causal assumptions. Use them alongside deficiency estimation for robust causal inference.

References

Akdemir, D. (2026). Constraints on Causal Inference as Experiment Comparison. DOI: 10.5281/zenodo.18367347. See thm:nc_bound (Negative Control Sensitivity Bound).
Lipsitch, M., Tchetgen Tchetgen, E., & Cohen, T. (2010). Negative controls: A tool for detecting confounding and bias in observational studies. Epidemiology, 21(3), 383-388.
Shi, X., Miao, W., & Tchetgen Tchetgen, E. (2020). A selective review of negative control methods in epidemiology. Current Epidemiology Reports, 7, 190-202.
