Plot Grading and Testing with ggspec

Introduction

ggspec provides a comparison tier (equiv_*()) and a check/assertion tier (check_plot(), expect_equiv_plot()) for comparing two ggplot objects. These are designed to be framework-agnostic: they work in plain R scripts, testthat test suites, and learnr/gradethis grading pipelines.

Checking visual equivalence is particularly important in the age of AI-assisted coding: different large language models generate syntactically different code for the same visualisation task, for example geom_bar() on raw data vs geom_col() on pre-counted data, or labs(x = ...) vs scale_x_continuous(name = ...). ggspec provides a four-level hierarchy of comparison modes ("strict", "structural", "visual", "pedagogical") so that functionally identical plots are recognised as equivalent regardless of how they were written.

library(ggspec)
library(ggplot2)
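
To make the geom_bar()/geom_col() case from the introduction concrete, the following sketch uses only ggplot2 (no ggspec functions) to show that the two spellings draw the same bars. The object names p_bar, p_col and counts are illustrative and do not come from the package; the comparison via layer_data() is just one way to inspect what ggplot2 draws, not how ggspec itself compares plots.

```r
library(ggplot2)

# Version 1: geom_bar() counts the rows of the raw data itself.
p_bar <- ggplot(mpg, aes(class)) + geom_bar()

# Version 2: counts are computed first, then drawn with geom_col().
counts <- as.data.frame(table(class = mpg$class), responseName = "n")
p_col  <- ggplot(counts, aes(class, n)) + geom_col()

# layer_data() returns the data ggplot2 actually draws for a layer.
d_bar <- layer_data(p_bar, 1)
d_col <- layer_data(p_col, 1)

# Compare bar positions (x) and bar heights (y) of the two versions.
same_bars <- isTRUE(all.equal(as.numeric(d_bar$x), as.numeric(d_col$x))) &&
  isTRUE(all.equal(as.numeric(d_bar$y), as.numeric(d_col$y)))
same_bars
```

The layer specifications differ (different geom, different stat, different input data), which is why a purely structural comparison would reject one of them even though the drawn bars agree.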

Comparing two plots with `equiv_plot()`

equiv_plot() is the high-level entry point. It accepts two ggplot objects and a character vector of check names to run. It returns a ggspec_result object that holds a pass/fail flag, a human-readable message, and a structured diff.

ref <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  facet_wrap(~drv) +
  labs(title = "Reference plot")

obs_correct <- ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(colour = class)) +
  facet_wrap(~drv) +
  labs(title = "Reference plot")

obs_wrong <- ggplot(mpg, aes(displ, hwy)) +
  geom_smooth() +            # wrong geom
  facet_wrap(~cyl) +         # wrong facet variable
  labs(title = "Student plot")

# Passing case
result_ok <- equiv_plot(ref, obs_correct)
result_ok
#> [PASS mode=strict] 6/6 checks passed
#>   Detail:
#> # A tibble: 10 × 12
#>    check  source layer geom  stat   position aesthetic variable status label_ref
#>    <chr>  <chr>  <int> <chr> <chr>  <chr>    <chr>     <chr>    <chr>  <chr>    
#>  1 layers ref        0 <NA>  <NA>   <NA>     <NA>      <NA>     <NA>   <NA>     
#>  2 layers ref        1 point ident… identity <NA>      <NA>     <NA>   <NA>     
#>  3 layers obs        0 <NA>  <NA>   <NA>     <NA>      <NA>     <NA>   <NA>     
#>  4 layers obs        1 point ident… identity <NA>      <NA>     <NA>   <NA>     
#>  5 aes    global     0 <NA>  <NA>   <NA>     x         displ    match  <NA>     
#>  6 aes    global     0 <NA>  <NA>   <NA>     y         hwy      match  <NA>     
#>  7 aes    global     1 point <NA>   <NA>     x         displ    match  <NA>     
#>  8 aes    global     1 point <NA>   <NA>     y         hwy      match  <NA>     
#>  9 aes    local      1 point <NA>   <NA>     colour    class    match  <NA>     
#> 10 labels <NA>      NA <NA>  <NA>   <NA>     title     <NA>     <NA>   Referenc…
#> # ℹ 2 more variables: label_obs <chr>, match <lgl>
as.logical(result_ok)
#> [1] TRUE

# Failing case
result_fail <- equiv_plot(ref, obs_wrong)
result_fail
#> [FAIL mode=strict] 2/6 checks passed: Missing geom(s): point.; Aesthetic mapping issue(s): colour->class (layer 1).; Facet mismatch: cols: 'drv' vs 'cyl'; wrong label(s): 'title' (expected 'Reference plot', got 'Student plot')
#>   Detail:
#> # A tibble: 10 × 12
#>    check  source layer geom   stat  position aesthetic variable status label_ref
#>    <chr>  <chr>  <int> <chr>  <chr> <chr>    <chr>     <chr>    <chr>  <chr>    
#>  1 layers ref        0 <NA>   <NA>  <NA>     <NA>      <NA>     <NA>   <NA>     
#>  2 layers ref        1 point  iden… identity <NA>      <NA>     <NA>   <NA>     
#>  3 layers obs        0 <NA>   <NA>  <NA>     <NA>      <NA>     <NA>   <NA>     
#>  4 layers obs        1 smooth smoo… identity <NA>      <NA>     <NA>   <NA>     
#>  5 aes    local      1 point  <NA>  <NA>     colour    class    missi… <NA>     
#>  6 aes    global     0 <NA>   <NA>  <NA>     x         displ    match  <NA>     
#>  7 aes    global     0 <NA>   <NA>  <NA>     y         hwy      match  <NA>     
#>  8 aes    global     1 point  <NA>  <NA>     x         displ    match  <NA>     
#>  9 aes    global     1 point  <NA>  <NA>     y         hwy      match  <NA>     
#> 10 labels <NA>      NA <NA>   <NA>  <NA>     title     <NA>     <NA>   Referenc…
#> # ℹ 2 more variables: label_obs <chr>, match <lgl>

Running individual checks

Each equiv_*() function tests one dimension:

equiv_layers(ref, obs_wrong)
#> [FAIL] Missing geom(s): point.
#>   Hint: Add + geom_point() to the observed plot.
#>   Detail:
#> # A tibble: 4 × 5
#>   source layer geom   stat     position
#>   <chr>  <int> <chr>  <chr>    <chr>   
#> 1 ref        0 <NA>   <NA>     <NA>    
#> 2 ref        1 point  identity identity
#> 3 obs        0 <NA>   <NA>     <NA>    
#> 4 obs        1 smooth smooth   identity
equiv_facets(ref, obs_wrong)
#> [FAIL] Facet mismatch: cols: 'drv' vs 'cyl'
equiv_labels(ref, obs_wrong, aesthetics = "title")
#> [FAIL] wrong label(s): 'title' (expected 'Reference plot', got 'Student plot')
#>   Hint: Add labs(title = 'Reference plot') to the observed plot.
#>   Detail:
#> # A tibble: 1 × 4
#>   aesthetic label_ref      label_obs    match
#>   <chr>     <chr>          <chr>        <lgl>
#> 1 title     Reference plot Student plot FALSE

The `exact` argument

By default, equiv_layers() and equiv_aes() use subset matching: the observed plot must contain at least the layers/mappings of the reference. Set exact = TRUE to require an exact match.

obs_extra <- ref + geom_smooth()  # extra layer is fine by default
equiv_layers(ref, obs_extra)
#> [PASS] All expected geoms present.
#>   Detail:
#> # A tibble: 5 × 5
#>   source layer geom   stat     position
#>   <chr>  <int> <chr>  <chr>    <chr>   
#> 1 ref        0 <NA>   <NA>     <NA>    
#> 2 ref        1 point  identity identity
#> 3 obs        0 <NA>   <NA>     <NA>    
#> 4 obs        1 point  identity identity
#> 5 obs        2 smooth smooth   identity

equiv_layers(ref, obs_extra, exact = TRUE)  # fails: extra layer
#> [FAIL] Expected 1 layer(s) [point]; got 2 [point, smooth].
#>   Detail:
#> # A tibble: 5 × 5
#>   source layer geom   stat     position
#>   <chr>  <int> <chr>  <chr>    <chr>   
#> 1 ref        0 <NA>   <NA>     <NA>    
#> 2 ref        1 point  identity identity
#> 3 obs        0 <NA>   <NA>     <NA>    
#> 4 obs        1 point  identity identity
#> 5 obs        2 smooth smooth   identity
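
The difference between the two matching rules can be sketched on plain character vectors of geom names. This is an illustration in base R, not ggspec's implementation; the helper names subset_match() and exact_match() are hypothetical, and treating "exact" as "same geoms in the same order" is an assumption.

```r
# Geom names per layer, as they appear in the detail tables above.
ref_geoms <- c("point")
obs_geoms <- c("point", "smooth")

# Subset matching: every reference geom must occur in the observed plot.
subset_match <- function(ref, obs) all(ref %in% obs)

# Exact matching (assumed rule): same geoms, same count, same order.
exact_match <- function(ref, obs) identical(ref, obs)

subset_match(ref_geoms, obs_geoms)  # TRUE: the extra smooth layer is tolerated
exact_match(ref_geoms, obs_geoms)   # FALSE: 1 layer expected, 2 found
```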

Framework-agnostic checking with `check_plot()`

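check_plot() wraps equiv_plot() and calls fail_fn if the check fails. The default, fail_fn = stop, raises an ordinary R error, so it works anywhere. Note the argument order: check_plot() takes the observed plot first and the expected (reference) plot second, the reverse of the equiv_*() functions.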

# Passes silently
check_plot(obs_correct, ref, check = c("layers", "aes", "facets"))

# Fails with an informative error
check_plot(obs_wrong, ref, check = c("layers", "facets"))
#> Error in check_plot(obs_wrong, ref, check = c("layers", "facets")): 0/2 checks passed: Missing geom(s): point.; Facet mismatch: cols: 'drv' vs 'cyl'

Swapping in a learnr/gradethis fail function

In a learnr tutorial, set fail_fn and pass_fn to the grading framework’s own signalling functions (e.g. gradethis::fail and gradethis::pass):

# Inside a gradethis grade_this() block in a learnr tutorial:
check_plot(
  .result,
  expected = ref,
  check    = c("layers", "aes", "facets"),
  fail_fn  = your_grading_framework_fail_fn,
  pass_fn  = your_grading_framework_pass_fn
)

No grading framework is a hard dependency: fail_fn and pass_fn can be any functions with compatible signatures.
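
The callback pattern itself can be sketched in a few lines of base R. The helper report_check() below is hypothetical and is not part of ggspec; it assumes that a "compatible signature" means a function taking the message as its first argument, which the vignette does not spell out.

```r
# Hypothetical helper, not part of ggspec: reports a check outcome through
# caller-supplied functions instead of a fixed framework.
report_check <- function(passed, msg,
                         fail_fn = stop,
                         pass_fn = function(msg) invisible(TRUE)) {
  if (passed) pass_fn(msg) else fail_fn(msg)
}

# Plain script: the default fail_fn = stop raises an error,
# which is caught here only to capture its message.
res_error <- tryCatch(
  report_check(FALSE, "Missing geom(s): point."),
  error = function(e) conditionMessage(e)
)

# Custom callbacks: return a record instead of signalling a condition.
as_record <- function(status) function(msg) list(status = status, message = msg)
res_record <- report_check(FALSE, "Missing geom(s): point.",
                           fail_fn = as_record("fail"),
                           pass_fn = as_record("pass"))
res_record$status
```

Because the checker only ever calls the two functions it was given, the same check can stop a script, fail a test, or produce tutorial feedback without any change to the checking code.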

Using `expect_equiv_plot()` in `testthat`

testthat::test_that("student plot has correct layers and facets", {
  expect_equiv_plot(
    obs_correct,
    ref,
    check = c("layers", "aes", "facets")
  )
})

Inspecting the diff

Every equiv_*() result carries a $detail data frame for programmatic inspection:

result <- equiv_aes(ref, obs_wrong)
result$detail
#> # A tibble: 5 × 6
#>   layer geom  aesthetic variable source status 
#>   <int> <chr> <chr>     <chr>    <chr>  <chr>  
#> 1     1 point colour    class    local  missing
#> 2     0 <NA>  x         displ    global match  
#> 3     0 <NA>  y         hwy      global match  
#> 4     1 point x         displ    global match  
#> 5     1 point y         hwy      global match
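
As a sketch of what programmatic inspection might look like, the code below filters the detail table for non-matching rows and turns them into feedback lines. To keep the example self-contained, detail is a hand-copied stand-in for result$detail built from the output above; the feedback wording is illustrative only.

```r
# Stand-in for result$detail, copied by hand from the output above.
detail <- data.frame(
  layer     = c(1L, 0L, 0L, 1L, 1L),
  geom      = c("point", NA, NA, "point", "point"),
  aesthetic = c("colour", "x", "y", "x", "y"),
  variable  = c("class", "displ", "hwy", "displ", "hwy"),
  source    = c("local", "global", "global", "global", "global"),
  status    = c("missing", "match", "match", "match", "match")
)

# Keep only the rows that did not match.
problems <- detail[detail$status != "match", ]

# Turn each problem row into a short feedback line.
feedback <- sprintf("Layer %d: aesthetic '%s' -> '%s' is %s",
                    problems$layer, problems$aesthetic,
                    problems$variable, problems$status)
feedback
#> [1] "Layer 1: aesthetic 'colour' -> 'class' is missing"
```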

Comparing layer parameters

equiv_params() checks whether a specific layer’s non-aesthetic parameters match, e.g. checking that a student used se = FALSE on geom_smooth().

p_ref   <- ggplot(mpg, aes(displ, hwy)) + geom_smooth(method = "lm", se = FALSE)
p_wrong <- ggplot(mpg, aes(displ, hwy)) + geom_smooth(method = "lm", se = TRUE)

equiv_params(p_ref, p_wrong, layer = 1L, params = "se")
#> [FAIL] Layer 1 parameter mismatch: se.
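
To see what such a check has to look at, the sketch below reads se directly from each layer using ggplot2 only. Whether equiv_params() inspects stat_params, geom_params, or both is not stated in this vignette, so reading stat_params here is an assumption made for illustration.

```r
library(ggplot2)

p_ref   <- ggplot(mpg, aes(displ, hwy)) + geom_smooth(method = "lm", se = FALSE)
p_wrong <- ggplot(mpg, aes(displ, hwy)) + geom_smooth(method = "lm", se = TRUE)

# Each layer keeps the non-aesthetic arguments that were passed on to its stat.
se_ref   <- p_ref$layers[[1]]$stat_params$se
se_wrong <- p_wrong$layers[[1]]$stat_params$se

# The two layers share geom, stat and mappings; only this parameter differs.
identical(se_ref, se_wrong)
```

This is why a parameter check is separate from the layer and aesthetic checks: both plots use the same geom and the same mappings, so those checks alone cannot tell them apart.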

Canonicalisation-aware comparison with `compare_plots()`

equiv_plot() performs direct structural comparison. When two plots are semantically equivalent but written differently — different geoms for the same stat, reversed aesthetic axes, scale names vs labs() — use compare_plots(), which normalises both plots before comparing.

Modes

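# Assumed example plots (not defined in the original): bar chart, geom_col() variant, flipped variant
p_ref  <- ggplot(mpg, aes(class)) + geom_bar()
p_col  <- ggplot(as.data.frame(table(class = mpg$class)), aes(class, Freq)) + geom_col()
p_flip <- p_ref + coord_flip()

# "structural": normalises geom_col → geom_bar, sorts layer order
compare_plots(p_ref, p_col, mode = "structural", check = "layers")

# "visual": additionally absorbs coord_flip() and scale name → labs()
compare_plots(p_ref, p_flip, mode = "visual", check = c("layers", "aes", "coord"))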

The result is a ggspec_compare object extending ggspec_result, with extra fields $canon_p1, $canon_p2 (the canonicalised specs) and $mode.

Using a mode in `check_plot()`

Pass mode to check_plot() to apply canonicalisation in grading pipelines:

# Passes for a student who used geom_col() instead of geom_bar()
check_plot(student_plot, ref,
           check = "layers",
           mode  = "structural")

# In learnr (swap fail_fn/pass_fn for your grading framework):
check_plot(.result, ref,
           check   = c("layers", "aes", "coord"),
           mode    = "visual",
           fail_fn = your_grading_fail_fn,
           pass_fn = your_grading_pass_fn)

What each mode covers

| Mode | Normalisation rules applied |
|---|---|
| `"strict"` | None beyond what `spec_plot()` already does |
| `"structural"` | `geom_col` -> `geom_bar`; layer order sorted |
| `"visual"` | Structural + `coord_flip` absorbed; scale `name` -> `labs()` |
| `"pedagogical"` | Visual + histogram `bins`/`binwidth` flagged; `after_stat()` logged |
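
The table describes a cumulative hierarchy: each mode applies its own rules plus those of every mode before it. The base R sketch below encodes that reading of the table; rules_for() and the rule strings are hypothetical labels for illustration, not ggspec objects or its internal rule names.

```r
# Modes in increasing order of permissiveness, as listed in the table.
modes <- c("strict", "structural", "visual", "pedagogical")

# Rules each mode adds on top of the previous one (labels are illustrative).
added_rules <- list(
  strict      = character(0),
  structural  = c("geom_col -> geom_bar", "layer order sorted"),
  visual      = c("coord_flip absorbed", "scale name -> labs()"),
  pedagogical = c("histogram bins/binwidth flagged", "after_stat() logged")
)

# All rules in effect for a mode: its own plus those of every earlier mode.
rules_for <- function(mode) {
  mode <- match.arg(mode, modes)
  unlist(added_rules[seq_len(match(mode, modes))], use.names = FALSE)
}

rules_for("visual")
```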

The $changes tibble on a ggspec_canon object records every normalisation applied, making the comparison transparent:

c1 <- canon(p_flip, mode = "visual")
c1$changes   # shows the coord_flip rule and its x/y swap

For a full catalogue of which equivalence patterns require which mode, see vignette("equivalence-patterns").

Summary of available checks

| Function | What it checks |
|---|---|
| `equiv_layers()` | Geom and stat per layer |
| `equiv_aes()` | Aesthetic-to-variable mappings |
| `equiv_scales()` | Explicitly added scales |
| `equiv_facets()` | Facet type and variables |
| `equiv_labels()` | Title, axis, and aesthetic labels |
| `equiv_coord()` | Coordinate system type |
| `equiv_params()` | Non-aesthetic layer parameters |
| `equiv_data()` | Data hash per layer |
| `equiv_plot()` | All of the above in one call (direct) |
| `compare_plots()` | Canonicalise then `equiv_plot()` |

