ARL by Monte Carlo simulation

The Average Run Length (ARL) of a control chart is the expected number of samples until the chart signals. It is the single most important operating characteristic of a chart. The in-control ARL, \(\mathrm{ARL}_0\), tells you how often the chart cries wolf when nothing has changed; the out-of-control ARL, \(\mathrm{ARL}_1\), tells you how quickly the chart catches a real shift.

These two are in tension. Adding more rules to a chart sharpens detection (lower \(\mathrm{ARL}_1\)) but increases false alarms (lower \(\mathrm{ARL}_0\)). shewhart_arl() quantifies the trade-off.

Closed-form benchmarks

For the simplest setup (Nelson 1 only, normal data, 3-sigma limits) the ARL is known in closed form. Each in-control point falls outside the 3-sigma limits independently with probability \(p = 2 \cdot \Phi(-3) \approx 0.0027\), so the run length is geometric with mean \(1/p\):

\[ \mathrm{ARL}_0 = \frac{1}{0.0027} \approx 370.4. \]
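The same calculation extends to a shifted mean, since the signal probability is still a sum of two normal tail areas. The following is a minimal base-R sketch under the assumptions stated above (independent normal data with unit sigma, limits at plus or minus 3); the helper name nelson1_arl is hypothetical and not part of shewhartr.

```r
# Closed-form ARL for Nelson 1 on independent normal data (base R only).
# `shift` is the mean shift in units of sigma; `limit` is the control-limit
# multiple. Hypothetical helper for illustration, not a shewhartr function.
nelson1_arl <- function(shift = 0, limit = 3) {
  # P(X < -limit) + P(X > limit) for X ~ N(shift, 1)
  p_signal <- pnorm(-limit - shift) + pnorm(-limit + shift)
  1 / p_signal   # mean of a geometric run length
}

nelson1_arl(0)              # in-control ARL, about 370.4
nelson1_arl(c(0.5, 1, 2))   # out-of-control ARLs for three shifts
```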

Let’s verify by simulation:

library(shewhartr)

set.seed(2025)
shewhart_arl(
  shift = 0,
  rules = "nelson_1_beyond_3s",
  n_sim = 5000,
  max_run = 2000
)

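The estimate should land close to 370. A geometric run length with mean 370 also has a standard deviation of about 370, so with 5000 simulated runs the Monte Carlo standard error is roughly \(370 / \sqrt{5000} \approx 5\); estimates between about 360 and 380 are consistent with the closed form.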
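To see where that Monte Carlo error comes from, the simulation can be reproduced without the package. This is a rough base-R sketch, not the shewhartr implementation: it assumes independent standard normal data, fixed limits at plus or minus 3, and runs censored at max_run samples (the helper one_run_length is hypothetical).

```r
# Stand-alone Monte Carlo estimate of the Nelson 1 in-control ARL (base R only).
n_sim   <- 5000
max_run <- 2000

one_run_length <- function(max_run) {
  x <- rnorm(max_run)                        # one in-control sample path
  first_signal <- which(abs(x) > 3)[1]       # first point beyond 3 sigma
  if (is.na(first_signal)) max_run else first_signal  # censor if no signal
}

set.seed(2025)
run_lengths <- replicate(n_sim, one_run_length(max_run))

arl_hat <- mean(run_lengths)
arl_se  <- sd(run_lengths) / sqrt(n_sim)     # Monte Carlo standard error
c(estimate = arl_hat, std_error = arl_se, closed_form = 1 / (2 * pnorm(-3)))
```

Censoring at max_run = 2000 trims only the rare runs longer than 2000 samples, so it pulls the estimate down by a small amount relative to the standard error.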

Adding rules sharpens detection

What happens if we add Nelson 2 (9 points same side) on top?

set.seed(2025)
shewhart_arl(
  shift = 0,
  rules = c("nelson_1_beyond_3s", "nelson_2_nine_same"),
  n_sim = 5000
)

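The ARL_0 drops from ~370 to roughly 217 (the exact value under normality, obtained with the Markov-chain approach of Champ & Woodall, 1987), meaning false alarms become more common. Whether that’s worth it depends on what we get in detection power. Let’s plot ARL across a grid of shifts: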

library(dplyr)    # bind_rows(), mutate()
library(ggplot2)  # plotting

shifts <- seq(0, 3, by = 0.25)

set.seed(2025)
arl_n1 <- shewhart_arl(shifts, "nelson_1_beyond_3s", n_sim = 2000)
arl_n12 <- shewhart_arl(shifts,
                        c("nelson_1_beyond_3s", "nelson_2_nine_same"),
                        n_sim = 2000)

bind_rows(
  arl_n1  |> mutate(rules = "Nelson 1 only"),
  arl_n12 |> mutate(rules = "Nelson 1 + 2")
) |>
  ggplot(aes(shift, arl, colour = rules)) +
  geom_line(linewidth = 0.7) +
  geom_ribbon(aes(ymin = arl_lower, ymax = arl_upper, fill = rules),
              alpha = 0.12, colour = NA) +
  scale_y_log10() +
  scale_colour_manual(
    values = c(`Nelson 1 only` = unname(shewhart_palette("signal")["in_control"]),
               `Nelson 1 + 2`  = unname(shewhart_palette("family")["memory_based"]))) +
  scale_fill_manual(
    values = c(`Nelson 1 only` = unname(shewhart_palette("signal")["in_control"]),
               `Nelson 1 + 2`  = unname(shewhart_palette("family")["memory_based"]))) +
  labs(x = "Shift (sigma)", y = "ARL (log scale)",
       title = "Operating characteristics") +
  shewhart_theme()

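The gain is largest for small shifts (around 0.5-1 sigma), where Nelson 1 alone takes a long time to fire: its closed-form ARL is about 155 at 0.5 sigma and about 44 at 1 sigma. For shifts of 2 sigma or more the two rule sets behave almost identically, because Nelson 1 fires within a handful of samples (ARL about 6.3 at 2 sigma and about 2 at 3 sigma), before a run of nine can build up.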
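The simulated curves can be cross-checked against exact values. For independent normal data, the chart with Nelson 1 plus a "k in a row on one side" rule is a small Markov chain whose states track the side and length of the current run, in the spirit of Champ & Woodall (1987). The sketch below is base R only; the function runs_arl and its arguments are hypothetical names, and it assumes a known centre line and unit sigma.

```r
# Exact ARL for "Nelson 1 + k in a row on one side" on N(shift, 1) data.
# Hypothetical helper for illustration, not part of shewhartr.
runs_arl <- function(shift = 0, k = 9, limit = 3) {
  a <- pnorm(limit - shift) - pnorm(-shift)    # P(0 < X < limit)
  b <- pnorm(-shift) - pnorm(-limit - shift)   # P(-limit < X < 0)
  m <- k - 1                                   # transient run lengths 1..k-1 per side
  Q <- matrix(0, 2 * m, 2 * m)                 # states 1..m: above; m+1..2m: below
  for (r in seq_len(m)) {
    if (r < m) {
      Q[r, r + 1]         <- a                 # run above the centre line grows
      Q[m + r, m + r + 1] <- b                 # run below the centre line grows
    }
    Q[r, m + 1] <- b                           # run above broken by a point below
    Q[m + r, 1] <- a                           # run below broken by a point above
  }
  # Expected number of further samples until a signal, from each state
  steps <- solve(diag(2 * m) - Q, rep(1, 2 * m))
  1 + a * steps[1] + b * steps[m + 1]          # the first sample starts a run
}

shifts <- seq(0, 3, by = 0.5)
data.frame(
  shift       = shifts,
  nelson_1    = 1 / (pnorm(-3 - shifts) + pnorm(-3 + shifts)),
  nelson_1_2  = sapply(shifts, runs_arl)
)

runs_arl(0)          # about 216.7
runs_arl(0, k = 8)   # about 152.7, the 8-in-a-row variant
```

Any missing probability mass in each row of Q is the chance of signalling on the next sample, either by a point beyond the limits or by completing the run.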

A quantitative comparison: WE-7 vs Nelson 2

Version 0.1.x of the package (then named Shewhart, now shewhartr) used a Western Electric “7 in a row” rule for phase detection. The default in v1.0 is Nelson 2 (9 in a row). The trade-off:

set.seed(2025)
arl_we  <- shewhart_arl(0, "we_seven_same",       n_sim = 5000, max_run = 2000)
arl_n2  <- shewhart_arl(0, "nelson_2_nine_same",  n_sim = 5000, max_run = 2000)
arl_we
arl_n2

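For independent, symmetric data the expected wait for k consecutive points on one side of the centre line is \(2^k - 1\). The WE rule’s ARL_0 should therefore come out near 127 (a false phase change every ~127 in-control samples on average), and Nelson 2’s near 511 (one every ~511 samples). For most monitoring contexts that fourfold difference is meaningful. The WE rule is easier to teach, but the false-alarm cost is high.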
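As an independent check on the run-rule figures, the waiting time for k consecutive points on one side can be simulated directly from the signs of the data. This is a base-R sketch assuming independent, symmetric, continuous data and a known centre line; run_rule_length is a hypothetical helper, not the package's rule engine.

```r
# Run length of a "k in a row on one side" rule on i.i.d. symmetric data.
run_rule_length <- function(k, max_run = 4000) {
  above <- rnorm(max_run) > 0                 # side of the centre line per sample
  runs  <- rle(above)                         # lengths of consecutive same-side runs
  hit   <- which(runs$lengths >= k)[1]        # first run that reaches length k
  if (is.na(hit)) return(max_run)             # censored: no signal within max_run
  sum(runs$lengths[seq_len(hit - 1)]) + k     # index of the k-th point of that run
}

set.seed(2025)
sim <- sapply(c(we_7 = 7, nelson_2 = 9), function(k) {
  mean(replicate(4000, run_rule_length(k)))
})
rbind(simulated = sim, closed_form = 2^c(7, 9) - 1)
```

If the shewhart_arl() estimates differ noticeably from the closed-form row, the likely causes are a short max_run (censoring) or a different convention for where a run is counted as complete.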

shewhart_regression() accepts either via the phase_rule argument:

shewhart_regression(temperature_drift, value = temp_c, index = minute,
                    phase_rule = "we_seven_same")        # legacy default
shewhart_regression(temperature_drift, value = temp_c, index = minute,
                    phase_rule = "nelson_2_nine_same")    # new default

Beyond normal residuals

shewhart_arl() simulates from a normal distribution by default. Real residuals are often non-normal or dependent: heavy-tailed, skewed, or autocorrelated. In those cases the closed-form ARL_0 of 370 for Nelson 1 typically overstates how rarely false alarms occur, most clearly for heavy tails, where more than 0.27% of in-control points fall beyond 3 sigma. A bootstrap variant (simulating from your actual residuals) is on the roadmap; until then, a quick check is to call shewhart_diagnostics(fit) and inspect the Q-Q and ACF panels.
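The size of the effect for heavy tails can be gauged with a small closed-form calculation. The sketch below is illustrative only: it assumes independent residuals that follow a Student t distribution rescaled to unit variance, with the limits left at plus or minus 3 sigma. The helper nelson1_arl0_t is hypothetical, and real residuals may of course behave differently.

```r
# Nelson 1 in-control ARL when residuals are heavy-tailed (base R only).
nelson1_arl0_t <- function(df, limit = 3) {
  stopifnot(df > 2)                    # the variance is finite only for df > 2
  t_sd <- sqrt(df / (df - 2))          # sd of an unscaled t variate
  # P(|T / t_sd| > limit) = 2 * P(T < -limit * t_sd)
  1 / (2 * pt(-limit * t_sd, df))
}

dfs <- c(3, 5, 10, 30)
data.frame(
  df          = dfs,
  arl0_t      = sapply(dfs, nelson1_arl0_t),
  arl0_normal = 1 / (2 * pnorm(-3))
)
```

For each of these degrees of freedom the ARL_0 falls below the normal-theory 370, so false alarms arrive more often than the benchmark suggests.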

References

Champ, C. W., & Woodall, W. H. (1987). Exact Results for Shewhart Control Charts with Supplementary Runs Rules. Technometrics, 29(4), 393-399.
Wald, A. (1947). Sequential Analysis. Wiley.
Page, E. S. (1954). Continuous Inspection Schemes. Biometrika, 41(1-2), 100-115.
Lucas, J. M., & Saccucci, M. S. (1990). Exponentially Weighted Moving Average Control Schemes: Properties and Enhancements. Technometrics, 32(1), 1-12.
