---
title: "Breakpoint Selection with Spike-and-Slab Priors"
output: rmarkdown::html_vignette
vignette: >
  %\VignetteIndexEntry{Breakpoint Selection with Spike-and-Slab Priors}
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include=FALSE}
NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true")
knitr::opts_chunk$set(collapse = TRUE, comment = "#>", eval = NOT_CRAN)
```

## When to use spike-and-slab

When fitting multi-breakpoint models, a central question is: *how many
change-points does the data actually support?* `smoothbp_ss()` addresses
this with a **Kuo & Mallick (1998) spike-and-slab** prior on each slope-change
coefficient $\delta_k$.

Each $\delta_k$ is paired with a binary inclusion indicator $\gamma_k$:

- **$\gamma_k = 0$ (spike)**: $\delta_k$ is structurally zeroed — that
  breakpoint has no slope change and does not contribute to the mean.
- **$\gamma_k = 1$ (slab)**: $\delta_k$ follows a diffuse normal prior
  and participates in the model.

The posterior mean of each $\gamma_k$ is the **posterior inclusion
probability (PIP)** — the Bayesian probability that change-point $k$ is
real. A PIP near 1 means strong evidence for that breakpoint; a PIP near 0
means the data are consistent with no structural break there.

> **Tip**: If you already know where a breakpoint occurred (e.g., a policy change or an intervention), you don't need to estimate its location. Use the `fixed()` helper to test the intervention effect — see `vignette("intervention-analysis")`.

---

## Example 1: Identifying the number of real breakpoints

### Simulate data with one true breakpoint

```{r sim-one-bp}
library(smoothbp)
library(posterior)
library(ggplot2)
library(dplyr)
library(tidyr)

set.seed(42)
n  <- 200
tau <- seq(0, 10, length.out = n)

# True model: one breakpoint at omega = 5, delta = -1.0
om_true    <- 5
rho_true   <- 4
b0_true    <- 2
b1_true    <- 0.5
delta_true <- -1.0

di  <- tau - om_true
si  <- plogis(di * rho_true)
mu  <- b0_true + b1_true * di + delta_true * di * si
y   <- mu + rnorm(n, sd = 0.2)
dat <- data.frame(y = y, tau = tau)
```

### Fit a 3-breakpoint SS model

We let the model have 3 candidate breakpoints but expect only one to
receive a high PIP:

```{r fit-ss-one}
fit_ss <- smoothbp_ss(
  formula = y ~ tau,
  b0      = ~ 1,
  b1      = ~ 1,
  deltas  = list(~ 1, ~ 1, ~ 1),
  omega   = list(~ 1, ~ 1, ~ 1),
  rho     = list(~ 1, ~ 1, ~ 1),
  data    = dat,
  spike   = prior_spike_slab(pi = 0.1, learn_pi = TRUE),
  priors  = smoothbp_priors(
    omega = space_omega_priors(K = 3, tau_min = 0, tau_max = 10)
  ),
  chains = 2, iter = 2000, warmup = 1000, seed = 42
)
```

### Inspect posterior inclusion probabilities

```{r pip-one}
pip(fit_ss)
#>  gamma_b1_(Intercept) gamma_delta1_(Intercept) gamma_delta2_(Intercept) gamma_delta3_(Intercept)
#>                  1.00                     0.06                     0.98                     0.04

# Full posterior summary (just the gammas)
summarise_draws(fit_ss$draws) |>
  filter(grepl("^gamma_delta", variable))
```

The second candidate breakpoint has a PIP near 1, matching the data-generating
truth (omega = 5 maps to the second candidate position). The other two
breakpoints are effectively excluded.

---

## Example 2: Regularization prevents overfitting

We demonstrate that fitting a 3-BP model to 1-BP data does **not** overfit
when using spike-and-slab priors.

### Visualise posterior inclusion probabilities

```{r pip-plot}
# The plot method automatically computes and plots the PIPs along with their 95% HDI
plot(pip(fit_ss))
```

---

## Example 3: Visualising model fit

The spike-and-slab model allows us to perform model selection and estimation
simultaneously. We can overlay the posterior mean trajectory and its
95% credible interval on the true data-generating function:

```{r fit-plot}
# Get posterior predictions (mean and intervals)
pred <- fitted(fit_ss)

plot_df <- dat %>%
  mutate(
    mu_true = mu,  # 'mu' is from the simulation step
    mu_fit  = pred$fitted_mean,
    lo      = pred$fitted_Q2.5,
    hi      = pred$fitted_Q97.5
  )

ggplot(plot_df, aes(x = tau)) +
  geom_point(aes(y = y), alpha = 0.3, size = 1) +
  geom_ribbon(aes(ymin = lo, ymax = hi), fill = "#2c7bb6", alpha = 0.2) +
  geom_line(aes(y = mu_true, colour = "Truth"), linewidth = 1) +
  geom_line(aes(y = mu_fit,  colour = "smoothbp_ss"), linewidth = 0.8, linetype = "dashed") +
  scale_colour_manual(values = c("Truth" = "black", "smoothbp_ss" = "#d7191c")) +
  labs(
    title = "Model fit: Truth vs Posterior Predictions",
    subtitle = "The SS model correctly ignores the 2 redundant breakpoints and recovers the trajectory.",
    x = "Time (tau)", y = "Response (y)", colour = NULL
  ) +
  theme_minimal()
```

---

## Example 4: Learning the inclusion probability

When you have many candidate breakpoints, setting a fixed prior $\pi$ can
be influential. Setting `learn_pi = TRUE` places a Beta hyperprior on a
shared $\pi$ and lets the data inform the overall sparsity level.

```{r fit-learn-pi}
fit_lpi <- smoothbp_ss(
  formula = y ~ tau,
  b0      = ~ 1,
  b1      = ~ 1,
  deltas  = rep(list(~ 1), 5),   # 5 candidate breakpoints
  omega   = rep(list(~ 1), 5),
  rho     = rep(list(~ 1), 5),
  data    = dat,
  spike   = prior_spike_slab(
    learn_pi = TRUE,
    a        = 1,   # Beta(1, 4) prior on pi: mean = 0.2
    b        = 4
  ),
  chains = 2, iter = 3000, warmup = 1500, seed = 42
)

# Posterior for pi (sparsity level)
pi_draws <- as.numeric(as_draws_matrix(fit_lpi$draws)[, "pi"])
quantile(pi_draws, c(0.025, 0.5, 0.975))
#>  2.5%  50%  97.5%
#>  0.04 0.20   0.47
# Contracted toward sparsity: only ~1/5 breakpoints are real
```

---

## Practical guidance

### Choosing the slab width

The slab standard deviation controls the range of plausible non-zero slope
changes. Match the slab width to your expected effect scale:

| Expected slope change | Suggested slab |
|-----------------------|----------------|
| Small (< 0.5)        | `prior_normal(0, 0.5)` |
| Moderate (0.5–2)     | `prior_normal(0, 1)` |
| Large / uncertain     | `prior_normal(0, 2)` |

### Interpreting PIPs

| PIP | Interpretation |
|-----|----------------|
| > 0.95 | Strong evidence for that breakpoint |
| 0.75–0.95 | Moderate evidence |
| 0.25–0.75 | Inconclusive |
| < 0.25 | Little evidence — breakpoint not needed |

For formal model selection, the **median probability model** includes all
breakpoints with PIP > 0.5.

### Diagnosing the sampler

The gamma indicators should switch between 0 and 1 frequently during sampling.
Check with:

```{r diag-ss}
# Gamma trace plots — look for healthy switching
trace_plot(fit_ss, pars = grep("^gamma_delta", posterior::variables(fit_ss$draws), value = TRUE))
```

If a gamma is stuck at 1 across all iterations, consider:
- Widening the slab (the prior may be too narrow to allow exclusion).
- Checking the prior on omega (a badly-placed omega prior can pin the
  breakpoint in an implausible region).

If a gamma is stuck at 0, the breakpoint is genuinely not supported by the
data — this is the desired behaviour.

---

## Comparing `smoothbp_ss` against `smoothbp` with LOO

Spike-and-slab and LOO are complementary model-selection tools:

```{r ss-vs-loo}
# Fit explicit 0-BP and 1-BP models for context
fit0 <- smoothbp(y ~ tau, deltas = list(), omega = list(), rho = list(),
                 data = dat, chains = 2, iter = 2000, warmup = 1000)
fit1 <- smoothbp(y ~ tau, deltas = list(~ 1), omega = list(~ 1), rho = list(~ 1),
                 data = dat, chains = 2, iter = 2000, warmup = 1000)

# LOO comparison: fixed-dimension vs spike-and-slab
loo::loo_compare(loo(fit1), loo(fit_ss))
#>       elpd_diff se_diff
#> fit1    0.0       0.0
#> fit_ss -0.2       0.4   # Both models perform identically

# PIP from the SS model agrees:
pip(fit_ss)["gamma_delta2_(Intercept)",]
#> 0.98
```

Both approaches correctly identify that one breakpoint is present.

---

## References

- Kuo, L. and Mallick, B. (1998). Variable selection for regression models.
  *Sankhya B*, 60(1):65--81.
- Barbieri, M. M. and Berger, J. O. (2004). Optimal predictive model
  selection. *Annals of Statistics*, 32(3):870--897.
