Getting started with baselinr

The problem

Every quasi-experimental impact study in education has to answer the same question before anyone looks at outcomes: were the treatment and comparison groups similar enough at baseline? The What Works Clearinghouse (WWC) sets the de facto standard for this in education research:

a covariate with an absolute standardized mean difference (Hedges’ g, or the Cox index for a binary covariate) of 0.05 or less satisfies baseline equivalence on its own;
greater than 0.05 and up to 0.25, equivalence holds only if the covariate is statistically adjusted for in the impact model;
above 0.25, the covariate cannot establish equivalence, with or without adjustment.

baselinr computes those effect sizes and categories so the baseline table is not something you assemble by hand for every report.
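
The three thresholds above reduce to a short lookup. The following is a minimal base R sketch of that rule, not baselinr's own code: the function name classify_wwc_sketch is hypothetical, and taking the absolute value first is an assumption based on how the WWC thresholds are stated. The category labels are the ones that appear in the package output.

```r
# Hypothetical helper, not part of baselinr: maps effect sizes to the
# three WWC baseline-equivalence categories.
classify_wwc_sketch <- function(effect_size) {
  magnitude <- abs(effect_size)  # thresholds apply regardless of sign
  ifelse(
    magnitude <= 0.05, "satisfied",
    ifelse(magnitude <= 0.25, "satisfied_with_adjustment", "not_satisfied")
  )
}

classify_wwc_sketch(c(0.03, -0.12, 0.80))
```

A value of exactly 0.25 falls in the middle category here, since only differences above 0.25 rule out equivalence.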

A worked example

study <- data.frame(
  treat = c(1, 1, 1, 0, 0, 0),
  pretest = c(5, 6, 7, 4, 5, 6), # continuous -> Hedges' g
  female = c(1, 0, 1, 0, 0, 1) # binary     -> Cox index
)

baseline_equivalence(study, treatment = "treat")
#>   covariate       type n_treatment n_comparison mean_treatment mean_comparison
#> 1   pretest continuous           3            3      6.0000000       5.0000000
#> 2    female     binary           3            3      0.6666667       0.3333333
#>   sd_treatment sd_comparison effect_size  wwc_category
#> 1    1.0000000     1.0000000   0.8000000 not_satisfied
#> 2    0.5773503     0.5773503   0.8401784 not_satisfied

By default, every numeric, logical, and factor column other than the treatment indicator is treated as a covariate. A covariate with exactly two unique values is treated as binary and summarized with the Cox index; other numeric covariates use Hedges’ g. Pass covariates = to control the set explicitly.
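
To make the default rule concrete, here is a rough base R sketch of how columns could be picked and typed under the description above. It is an illustration only: guess_covariate_types is a hypothetical name, not a baselinr function, and the handling of non-numeric columns with more than two values is left as NA because the text does not specify it.

```r
study <- data.frame(
  treat   = c(1, 1, 1, 0, 0, 0),
  pretest = c(5, 6, 7, 4, 5, 6),
  female  = c(1, 0, 1, 0, 0, 1)
)

# Hypothetical helper, not part of baselinr.
guess_covariate_types <- function(data, treatment) {
  candidates <- setdiff(names(data), treatment)
  # Keep numeric, logical, and factor columns only
  usable <- vapply(
    data[candidates],
    function(col) is.numeric(col) || is.logical(col) || is.factor(col),
    logical(1)
  )
  candidates <- candidates[usable]
  vapply(
    data[candidates],
    function(col) {
      n_values <- length(unique(col[!is.na(col)]))
      # Exactly two distinct values -> binary; other numeric -> continuous.
      # Anything else is left as NA (not specified in the text above).
      if (n_values == 2) "binary" else if (is.numeric(col)) "continuous" else NA_character_
    },
    character(1)
  )
}

guess_covariate_types(study, treatment = "treat")
```

For the study data this labels pretest as continuous and female as binary, matching the type column in the table above.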

The building blocks

baseline_equivalence() is built from exported helpers you can also call directly.

# Standardized mean difference (Hedges' g) for a continuous covariate
hedges_g(study$pretest, study$treat)
#> [1] 0.8

# Cox index for a binary covariate
cox_index(study$female, study$treat)
#> [1] 0.8401784

# Classify any effect size(s) into the WWC categories
wwc_classify(c(0.03, 0.12, 0.80))
#> [1] "satisfied"                 "satisfied_with_adjustment"
#> [3] "not_satisfied"
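
The two effect sizes printed above can be reproduced by hand in base R, which helps when checking a table against the WWC formulas. This is a sketch under stated assumptions rather than the package's implementation: Hedges' g uses the pooled standard deviation and the small-sample correction 1 - 3 / (4N - 9), and the Cox index is taken as the difference in log-odds divided by 1.65. The printed Cox value is matched here without a small-sample correction; whether baselinr omits it in general is not confirmed by this example.

```r
treat   <- c(1, 1, 1, 0, 0, 0)
pretest <- c(5, 6, 7, 4, 5, 6)
female  <- c(1, 0, 1, 0, 0, 1)

# Hedges' g: pooled-SD standardized difference times the small-sample correction
n_t <- sum(treat == 1)
n_c <- sum(treat == 0)
pooled_sd <- sqrt(
  ((n_t - 1) * var(pretest[treat == 1]) + (n_c - 1) * var(pretest[treat == 0])) /
    (n_t + n_c - 2)
)
omega <- 1 - 3 / (4 * (n_t + n_c) - 9)  # small-sample correction (assumed form)
g <- omega * (mean(pretest[treat == 1]) - mean(pretest[treat == 0])) / pooled_sd
g

# Cox index: difference in log-odds of the two proportions, divided by 1.65
p_t <- mean(female[treat == 1])
p_c <- mean(female[treat == 0])
cox <- (qlogis(p_t) - qlogis(p_c)) / 1.65
cox
```

With three units per group the correction factor is 0.8, so the raw difference of one pooled standard deviation shrinks to the 0.8 shown by hedges_g().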

Visualise and format

A Love plot shows the standardized effect size of each covariate against the WWC thresholds (0.05 and 0.25), coloured by category:

love_plot(baseline_equivalence(study, treatment = "treat"))

For a report-ready table, gt_baseline() returns a formatted gt table:

gt_baseline(baseline_equivalence(study, treatment = "treat"))

Scope

Continuous covariates use Hedges’ g (with the WWC small-sample correction); binary covariates use the WWC Cox index. Collapse the table into an overall verdict with wwc_summary(), assess sample loss with attrition(), visualise with love_plot(), and format with gt_baseline(). See NEWS.md for the roadmap.
