Introduction

The deltapif R package calculates Potential Impact Fractions (PIF) and Population Attributable Fractions (PAF) for aggregated data. It uses the delta method to derive confidence intervals, providing a robust approach for quantifying the burden of disease attributable to risk factors and the potential impact of interventions.

Core Concepts: PAF and PIF

The Population Attributable Fraction (PAF) answers the question: “What fraction of disease cases in a population would be prevented if we completely eliminated a risk factor?” It represents the maximum possible reduction achievable and is often interpreted as the burden of disease attributed to the exposure.

The Potential Impact Fraction (PIF) is a more general measure. It answers: “What fraction of disease cases would be prevented if we changed exposure from its current distribution to a specific counterfactual scenario?”

PAF is a specific type of PIF where the counterfactual scenario is the theoretical minimum risk exposure level (TMREL). We remark that the TMREL is not always zero. For example, the TMREL for sodium intake is a specific healthy range (e.g., ~1.6g/day), as both too much and too little sodium are harmful. For other exposures, such as smoking, the TMREL can indeed be zero exposure.
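
For a single dichotomous exposure with prevalence \(p\), counterfactual prevalence \(p^{\text{cft}}\), and relative risk \(\text{RR} = e^{\beta}\), the two fractions are commonly written as below. These are the standard textbook forms, given here as an assumption about the setting and not as the package's internal definition; the symbol \(p^{\text{cft}}\) is chosen only to mirror the p_cft argument used in the examples.

\[ \textrm{PIF} = \dfrac{(p - p^{\text{cft}})(\text{RR} - 1)}{1 + p\,(\text{RR} - 1)}, \qquad \textrm{PAF} = \dfrac{p\,(\text{RR} - 1)}{1 + p\,(\text{RR} - 1)} \]

Setting \(p^{\text{cft}} = 0\) in the first expression gives the second, which is the case where the counterfactual removes the exposure entirely.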

Note: The statistical methods underlying the package assume that the relative risk and exposure prevalence estimates are independent (i.e., derived from different studies or populations).

Key assumption: Independent (summary) data sources.

The deltapif package is designed for a specific, common scenario in public health:

The estimate of the log-relative risk (beta) comes from one source (e.g., a published meta-analysis).
The estimate of the exposure prevalence (p) comes from a separate, independent source (e.g., a national survey).

The delta method implementation here relies on this independence. If you have individual-level data for exposure, the pifpaf package is more appropriate, as it leverages the individual-level variability. If individual-level exposure and outcome data are available from the same source, the graphPAF package is ideal.

Usage

Population Attributable Fraction (PAF)

Lee et al. (2022) estimated the fraction of dementia cases attributable to smoking in the US. They reported:

A relative risk of 1.59 (95% CI: 1.15, 2.20)
A smoking prevalence of 8.5%

The point estimate of the PAF can be calculated using Levin’s formula:

library(deltapif)

paf(p = 0.085, beta = log(1.59), quiet = TRUE)
#> 
#> ── Population Attributable Fraction: [deltapif-0647286164263977] ──
#> 
#> PAF = 4.776% [95% CI: 4.776% to 4.776%]
#> standard_deviation(paf %) = 0.000
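
As a sanity check, the same point estimate can be reproduced by hand with Levin's formula for a dichotomous exposure. This is a minimal base R sketch that does not call the package; the variable names are illustrative only.

```r
# Hand calculation of Levin's formula (base R only; no package required).
p  <- 0.085   # smoking prevalence reported by Lee et al. (2022)
rr <- 1.59    # relative risk of dementia among smokers

# PAF = p (RR - 1) / (1 + p (RR - 1))
paf_levin <- p * (rr - 1) / (1 + p * (rr - 1))
round(100 * paf_levin, 3)
#> [1] 4.776
```

Because no variances were supplied in the call above, the printed interval collapses to the point estimate.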

Incorporating Uncertainty

To calculate confidence intervals, we need the variance of the log-relative risk. The variance can be derived from the confidence interval following the Cochrane Handbook:

var_log_rr <- ((log(2.20) - log(1.15)) / (2 * 1.96))^2
var_log_rr
#> [1] 0.0273848

We then provide the log-relative risk (log(1.59)) and its variance to paf(), specifying the rr_link as exp to convert the coefficient to a relative risk by exponentiating the log. Since the prevalence variance was not reported, we assume var_p = 0.

paf_dementia <- paf(
  p         = 0.085, 
  beta      = log(1.59), 
  var_beta  = var_log_rr, 
  var_p     = 0
)
paf_dementia
#> 
#> ── Population Attributable Fraction: [deltapif-00694582915315349] ──
#> 
#> PAF = 4.776% [95% CI: 0.717% to 8.669%]
#> standard_deviation(paf %) = 2.028

The results are close to those reported by Lee et al.: PAF = 4.9% (95% CI: 1.3–9.3).
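
The standard deviation and interval printed by paf() can be approximated by hand with a first-order delta method. The sketch below uses base R only and rests on two assumptions that are not confirmed by the text: that var_p = 0 leaves the log-relative risk as the only source of uncertainty, and that the interval is built on the log(1 - PAF) scale and then back-transformed. Under those assumptions the numbers should land close to the values printed above (a standard deviation of about 2.0% and an interval of roughly 0.7% to 8.7%).

```r
# Delta-method sketch for the PAF interval (base R only).
p        <- 0.085
beta     <- log(1.59)
var_beta <- ((log(2.20) - log(1.15)) / (2 * 1.96))^2  # same value as var_log_rr
rr       <- exp(beta)

paf_hat <- p * (rr - 1) / (1 + p * (rr - 1))

# Derivative of the PAF with respect to beta, then the first-order variance.
dpaf_dbeta <- p * rr / (1 + p * (rr - 1))^2
sd_paf     <- sqrt(dpaf_dbeta^2 * var_beta)

# ASSUMPTION: the interval is symmetric on the log(1 - PAF) scale.
z      <- qnorm(0.975)
sd_log <- sd_paf / (1 - paf_hat)
ci     <- 1 - exp(log(1 - paf_hat) + c(1, -1) * z * sd_log)

round(100 * c(paf = paf_hat, sd = sd_paf, lower = ci[1], upper = ci[2]), 2)
```

The asymmetry of the interval around the point estimate is what suggests a transformed scale; a symmetric Wald interval on the PAF itself would give slightly different limits.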

Potential Impact Fraction (PIF)

Lee et al. (2022) also considered a scenario reducing smoking prevalence by 15% (from 8.5% to 7.225%). The PIF for this intervention is:

lee_pif <- pif(
  p        = 0.085, 
  p_cft    = 0.085 * (1 - 0.15), # 15% reduction
  beta     = log(1.59), 
  var_beta = var_log_rr, 
  var_p    = 0
)
lee_pif
#> 
#> ── Potential Impact Fraction: [deltapif-0661762595177878] ──
#> 
#> PIF = 0.716% [95% CI: 0.118% to 1.311%]
#> standard_deviation(pif %) = 0.304

This result is consistent with the reported estimate: PIF = 0.7% (95% CI: 0.2–1.4).
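
The PIF point estimate and its standard deviation can also be checked by hand. This base R sketch assumes the standard dichotomous-exposure form of the PIF and, as in the call above, treats the prevalences as fixed (var_p = 0); it is an illustration, not the package's implementation.

```r
# Hand calculation of the PIF for a 15% relative reduction in smoking.
p        <- 0.085
p_cft    <- 0.085 * (1 - 0.15)   # counterfactual prevalence
rr       <- 1.59
var_beta <- ((log(2.20) - log(1.15)) / (2 * 1.96))^2

denom    <- 1 + p * (rr - 1)
pif_hand <- (p - p_cft) * (rr - 1) / denom
round(100 * pif_hand, 3)
#> [1] 0.716

# First-order delta method: d PIF / d beta = RR (p - p_cft) / denom^2
sd_pif <- sqrt((rr * (p - p_cft) / denom^2)^2 * var_beta)
```

Under these assumptions sd_pif is about 0.003, in line with the 0.304% printed above.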

Attributable and averted cases

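Attributable and averted cases can be calculated with the attributable_cases and averted_cases functions. For example, Dhana et al. (2023) estimate the number of people with Alzheimer’s Disease in New York, USA, at 426.5 (400.2, 452.7) thousand. The variance used below is ((452.7 - 400.2) / 2*qnorm(0.975))^2 = 2647.005. Note that, because of operator precedence, this expression multiplies by qnorm(0.975) instead of dividing by it; the usual conversion from a 95% confidence interval, ((452.7 - 400.2) / (2*qnorm(0.975)))^2, gives approximately 179.4, so the intervals printed below are wider than that conversion would imply.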

The number of cases (in thousands) that would be averted if we reduced smoking by 15%, assuming the prevalence of smoking is identical to the rest of the US, is given by:

averted_cases(426.5, lee_pif, variance = 2647.005)
#> 
#> ── Averted cases: [deltapif-0661762595177878] ──
#> 
#> Averted cases = 3.055 [95% CI: 0.394 to 5.716]
#> standard_deviation(averted cases) = 135.779

Attributable cases can likewise be estimated using the previous paf as:

attributable_cases(426.5, paf_dementia, variance = 2647.005)
#> 
#> ── Attributable cases: [deltapif-00694582915315349] ──
#> 
#> Attributable cases = 20.368 [95% CI: 2.626 to 38.109]
#> standard_deviation(attributable cases) = 905.195
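
The intervals printed for averted and attributable cases can be approximated by hand. The sketch below is base R only and assumes that the case count and the fraction are independent, so that the exact variance of their product applies; product_sd is a hypothetical helper written for this illustration, not a package function. It uses the case variance exactly as passed in the calls above.

```r
# ASSUMPTION: X (cases) and Y (fraction) are independent, so
# Var(XY) = E[X]^2 Var(Y) + E[Y]^2 Var(X) + Var(X) Var(Y).
product_sd <- function(x, var_x, y, var_y) {
  sqrt(x^2 * var_y + y^2 * var_x + var_x * var_y)
}

cases     <- 426.5      # thousands of cases (Dhana et al. 2023)
var_cases <- 2647.005   # variance value passed in the calls above

p <- 0.085; rr <- 1.59
var_beta <- ((log(2.20) - log(1.15)) / (2 * 1.96))^2
denom    <- 1 + p * (rr - 1)

# Fractions and their first-order variances (prevalences treated as fixed).
pif_hat <- 0.15 * p * (rr - 1) / denom
var_pif <- (rr * 0.15 * p / denom^2)^2 * var_beta
paf_hat <- p * (rr - 1) / denom
var_paf <- (rr * p / denom^2)^2 * var_beta

averted    <- cases * pif_hat
sd_averted <- product_sd(cases, var_cases, pif_hat, var_pif)
attrib     <- cases * paf_hat
sd_attrib  <- product_sd(cases, var_cases, paf_hat, var_paf)

# Symmetric Wald intervals on the case scale.
averted + c(-1, 1) * qnorm(0.975) * sd_averted
attrib  + c(-1, 1) * qnorm(0.975) * sd_attrib
```

Under these assumptions the standard deviations are about 1.36 and 9.05, which agree with the half-widths of the printed intervals. The printed standard_deviation lines (135.779 and 905.195) are larger by a factor of 100, which appears to be a scaling in the printed output and not a different estimate.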

Combining fractions from subpopulations

Multiple fractions can be combined into totals and ensembles. For example, the fractions among men and women can be combined into an overall fraction by specifying the distribution of the subgroups in the population:

paf_men   <- paf(p = 0.41, beta = 0.31, var_p = 0.001,
                 var_beta = 0.14,
                 label = "Men")
paf_women <- paf(p = 0.37, beta = 0.35, var_p = 0.001, 
                 var_beta = 0.16,
                 label = "Women")

Assuming the distribution is 51% women and 49% men:

paf_total(paf_men, paf_women, weights = c(0.49, 0.51))
#> 
#> ── Population Attributable Fraction: [deltapif-0136719440904713] ──
#> 
#> PAF = 13.201% [95% CI: 10.473% to 15.845%]
#> standard_deviation(paf %) = 11.187
#> ────────────────────────────────── Components: ─────────────────────────────────
#> • 12.968% (sd %: 15.867) --- [Men]
#> • 13.424% (sd %: 15.773) --- [Women]
#> ────────────────────────────────────────────────────────────────────────────────

This is equivalent to calculating:

\[ \textrm{PAF}_{\text{All}} = 0.49 \cdot \text{PAF}_{\text{Men}} + 0.51 \cdot \text{PAF}_{\text{Women}} \]
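
The weighted sum can be verified in base R. In this sketch, levin and levin_var are hypothetical helpers (not package functions) implementing Levin's formula and its first-order delta-method variance; the combined standard deviation further assumes that the two subgroup estimates are independent and that the weights are known constants.

```r
# Levin's formula and its delta-method variance (independent p and beta).
levin <- function(p, beta) {
  rr <- exp(beta)
  p * (rr - 1) / (1 + p * (rr - 1))
}
levin_var <- function(p, beta, var_p, var_beta) {
  rr    <- exp(beta)
  denom <- 1 + p * (rr - 1)
  (p * rr / denom^2)^2 * var_beta + ((rr - 1) / denom^2)^2 * var_p
}

w    <- c(men = 0.49, women = 0.51)
pafs <- c(levin(0.41, 0.31), levin(0.37, 0.35))
vars <- c(levin_var(0.41, 0.31, 0.001, 0.14),
          levin_var(0.37, 0.35, 0.001, 0.16))

paf_all <- sum(w * pafs)
# ASSUMPTION: independent subgroups, fixed weights.
sd_all  <- sqrt(sum(w^2 * vars))

round(100 * c(paf_all = paf_all, sd_all = sd_all), 2)
```

Under these assumptions the combined PAF is about 13.2% with a standard deviation of about 11.2%, in line with the point estimate and standard deviation printed above.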

Combining fractions from multiple risks

Fractions from disjoint risks can be combined into an ensemble. For example, the fraction attributable to lead exposure and the fraction attributable to asbestos exposure:

paf_lead  <- paf(p = 0.41, beta = 0.31, var_p = 0.001,
                 var_beta = 0.014,
                 label = "Lead")
paf_absts <- paf(p = 0.61, beta = 0.15, var_p = 0.001, 
                 var_beta = 0.001,
                 label = "Asbestus")

A fraction for environmental exposure that considers both risks can be calculated by multiplying the complements of the weighted fractions and subtracting the product from one. The weights apply a commonality correction (here, c(0.1, 0.2)):

paf_ensemble(paf_lead, paf_absts, weights = c(0.1, 0.2))
#> 
#> ── Population Attributable Fraction: [deltapif-071645260952967] ──
#> 
#> PAF = 3.070% [95% CI: 3.033% to 3.108%]
#> standard_deviation(paf %) = 0.625
#> ────────────────────────────────── Components: ─────────────────────────────────
#> • 12.968% (sd %: 5.085) --- [Lead]
#> • 8.985% (sd %: 1.904) --- [Asbestus]
#> ────────────────────────────────────────────────────────────────────────────────

where this quantity estimates:

\[ \textrm{PAF}_{\text{Ensemble}} = 1 - (1 - 0.1 \cdot \textrm{PAF}_{\text{Lead}}) \cdot (1 - 0.2 \cdot \textrm{PAF}_{\text{Asbestus}}) \]
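
The ensemble point estimate follows directly from this expression. A minimal base R check, in which levin is a hypothetical helper implementing Levin's formula and not a package function:

```r
# Levin's formula for a dichotomous exposure.
levin <- function(p, beta) {
  rr <- exp(beta)
  p * (rr - 1) / (1 + p * (rr - 1))
}

pafs <- c(lead = levin(0.41, 0.31), asbestos = levin(0.61, 0.15))
w    <- c(0.1, 0.2)   # weights passed to paf_ensemble() above

# One minus the product of the weighted complements.
paf_ens <- 1 - prod(1 - w * pafs)
round(100 * paf_ens, 2)
#> [1] 3.07
```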

Adjusting fractions for commonality

Adjusting for commonality is usually performed when different risks can be concurrent. In the previous example, exposure to lead and to asbestos can happen at the same time. Mukadam et al. (2019) propose individually weighted (adjusted) fractions based on commonality weights, which represent the proportion of the variance shared among risk factors. The adjusted fractions are calculated as:

\[ \textrm{PIF}_k^{\text{Adjusted}} = \dfrac{\text{PIF}_k}{\sum_j \text{PIF}_j} \cdot \textrm{PIF}^{\text{Overall}} \]

where

\[ \textrm{PIF}^{\text{Overall}} = 1 - \prod\limits_k (1 - w_k \text{PIF}_k) \]

with

\[ w_k = 1 - \text{commonality}_k \]

The adjusted fractions can be calculated with the weighted_adjusted_paf function:

weighted_adjusted_paf(paf_lead, paf_absts, weights = c(0.2, 0.3))
#> $Lead
#> 
#> ── Population Attributable Fraction: [Lead_adj] ──
#> 
#> PAF = 3.083% [95% CI: 0.964% to 5.202%]
#> standard_deviation(paf %) = 1.081
#> 
#> $Asbestus
#> 
#> ── Population Attributable Fraction: [Asbestus_adj] ──
#> 
#> PAF = 2.136% [95% CI: 1.294% to 2.978%]
#> standard_deviation(paf %) = 0.429

which returns a named list of the adjusted fractions.
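
The adjusted point estimates can be reproduced from the formulas above. This base R sketch assumes that the weights argument is used directly as the w_k in the overall fraction; levin is a hypothetical helper implementing Levin's formula, not a package function.

```r
# Levin's formula for a dichotomous exposure.
levin <- function(p, beta) {
  rr <- exp(beta)
  p * (rr - 1) / (1 + p * (rr - 1))
}

pafs <- c(Lead = levin(0.41, 0.31), Asbestus = levin(0.61, 0.15))
w    <- c(0.2, 0.3)   # ASSUMPTION: passed weights are the w_k

# Overall fraction, then each risk's share of it.
paf_overall  <- 1 - prod(1 - w * pafs)
paf_adjusted <- pafs / sum(pafs) * paf_overall

round(100 * paf_adjusted, 3)
```

Under that assumption the adjusted fractions come out at about 3.08% for lead and 2.14% for asbestos, in line with the printed values, and they sum to the overall fraction by construction.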

Additional information

Read the examples vignette

References

Dhana, Klodian, Todd Beck, Pankaja Desai, Robert S Wilson, Denis A Evans, and Kumar B Rajan. 2023. “Prevalence of Alzheimer’s Dementia in the 50 US States and 3142 Counties: A Population Estimate Using the 2020 Bridged-Race Postcensal from the National Center for Health Statistics.” Alzheimer’s & Dementia 19: e074430.

Lee, Mark, Eric Whitsel, Christy Avery, Timothy M Hughes, Michael E Griswold, Sanaz Sedaghat, Rebecca F Gottesman, Thomas H Mosley, Gerardo Heiss, and Pamela L Lutsey. 2022. “Variation in Population Attributable Fraction of Dementia Associated with Potentially Modifiable Risk Factors by Race and Ethnicity in the US.” JAMA Network Open 5 (7): e2219672–72.

Mukadam, Naaheed, Andrew Sommerlad, Jonathan Huntley, and Gill Livingston. 2019. “Population Attributable Fractions for Risk Factors for Dementia in Low-Income and Middle-Income Countries: An Analysis Using Cross-Sectional Survey Data.” The Lancet Global Health 7 (5): e596–603.
