---
title: "Empirical Bayes subject-wise fitting"
format:
  html:
    toc: true
    code-fold: false
    embed-resources: true
    self-contained-math: true
vignette: >
  %\VignetteIndexEntry{Empirical Bayes subject-wise fitting}
  %\VignetteEngine{quarto::html}
  %\VignetteEncoding{UTF-8}
---

```{r setup, include = FALSE}
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.width = 7,
  fig.height = 4.8,
  message = FALSE,
  warning = FALSE
)
```

This document demonstrates `ctEmpiricalBayesFit()` for a simple univariate
continuous time process. The data contain 50 subjects, each with 80 equally
spaced observations. The generating model varies across subjects in:

* `T0MEANS`: subject-specific initial latent level
* `DRIFT`: subject-specific autoregressive continuous time drift
* `DIFFUSION`: subject-specific process noise scale
* `MANIFESTMEANS`: subject-specific measurement intercept

## Simulate data

```{r simulate-data}
library(ctsem)
library(ggplot2)

set.seed(49)

n_subjects <- 20
n_obs <- 80

raw_param_names <- c("t0m", "drift", "diffusion", "mmean")
raw_cor <- matrix(c(
   1.00,  0.45, -0.25,  0.35,
   0.45,  1.00, -0.55,  0.25,
  -0.25, -0.55,  1.00, -0.20,
   0.35,  0.25, -0.20,  1.00
), nrow = 4, byrow = TRUE,
dimnames = list(raw_param_names, raw_param_names))
raw_sd <- c(t0m = .7, drift = .55, diffusion = .35, mmean = 2.45)
raw_mean <- c(t0m = 0, drift = -.4, diffusion = log(exp(.45) - 1), mmean = 0)
raw_cov <- diag(raw_sd) %*% raw_cor %*% diag(raw_sd)
raw_truth_mat <- MASS::mvrnorm(
  n = n_subjects,
  mu = raw_mean,
  Sigma = raw_cov)
colnames(raw_truth_mat) <- raw_param_names
raw_truth <- data.frame(id = seq_len(n_subjects), raw_truth_mat,
  check.names = FALSE)

softplus <- function(x) ifelse(x > 30, x, log1p(exp(x)))
truth <- data.frame(
  id = seq_len(n_subjects),
  t0m = raw_truth$t0m,
  drift = -softplus(-raw_truth$drift),
  diffusion = softplus(raw_truth$diffusion),
  mmean = raw_truth$mmean
)

datalist <- vector("list", n_subjects)
for(i in seq_len(n_subjects)){
  gm <- ctModel(
    silent = TRUE,
    Tpoints = n_obs,
    latentNames = "eta",
    manifestNames = "Y",
    LAMBDA = matrix(1),
    T0MEANS = matrix(truth$t0m[i], 1, 1),
    DRIFT = matrix(truth$drift[i], 1, 1),
    DIFFUSION = matrix(truth$diffusion[i], 1, 1),
    CINT = matrix(0),
    T0VAR = matrix(0),
    MANIFESTMEANS = matrix(truth$mmean[i], 1, 1),
    MANIFESTVAR = matrix(.35)
  )

  dat_i <- data.frame(ctGenerate(
    ctmodelobj = gm,
    n.subjects = 1,
    burnin = 0,
    dtmean = .1,
    logdtsd = 0,
    wide = FALSE))
  dat_i$id <- i
  datalist[[i]] <- dat_i
}

dat <- do.call(rbind, datalist)
dat <- dat[, c("id", "time", "Y")]
```

```{r plot-data}
ggplot(dat[dat$id <= 4, ], aes(time, Y, group = id, colour = factor(id))) +
  geom_line(linewidth = .35) + theme_bw()+
  labs(colour = "subject")
```

## Specify the fitted model

The same model object is used for the empirical Bayes and random-effects
workflows. Random effects are requested on the four generating parameters. The
empirical Bayes function internally removes those random effects for the
per-subject fits, because each subject fit contains only one subject.

```{r model}
fit_model <- ctModel(
  type = "ct",
  silent = TRUE,
  latentNames = "eta",
  manifestNames = "Y",
  LAMBDA = matrix(1),
  T0MEANS = matrix("t0m||TRUE", 1, 1),
  DRIFT = matrix("drift||TRUE", 1, 1),
  DIFFUSION = matrix("diffusion||TRUE", 1, 1),
  MANIFESTMEANS = matrix("mmean||TRUE", 1, 1),
  MANIFESTVAR = matrix("merror||FALSE", 1, 1)
)
```

## Empirical Bayes subject-wise fits

`ctEmpiricalBayesFit()` uses two subject-wise passes by default. First, it fits
the model once per subject using the model prior. The raw estimates from these
first-pass fits define the marginal empirical Bayes prior. Second, it rewrites
the single-subject model transforms so raw `normal(0, 1)` parameters imply that
empirical location and scale, then fits every subject again. The final `$fits`
element contains the final EB-prior fits; `$initialfits` contains the
model-prior fits, and `$passfits` contains every pass.

The transform adjustment is on the unconstrained raw scale. If the original
model transforms a raw parameter $q$ to the model parameter $\theta$ as

$$
\theta = f(q),
$$

then the EB-adjusted model uses a new local raw parameter $z \sim N(0, 1)$ and
sets

$$
\theta = f(\mu_q + \sigma_q z),
$$

where $\mu_q$ and $\sigma_q$ are the empirical mean and standard deviation of
the subject-level raw estimates. For example, a positive parameter with
`log1p_exp(param)` becomes `log1p_exp(mu + sigma * param)`.

If `Npasses > 2`, each later pass maps the local EB raw estimates back to the
original raw scale before recomputing $\mu_q$ and $\sigma_q$. This gives an
iterated marginal EB prior, but it still does not encode the empirical
covariance among raw parameters.

By default, the EB prior construction is robust to extreme first-pass raw
estimates: raw values are winsorized using quantile and MAD bounds, then the
prior location and scale are computed with the median and MAD. This protects the
second pass from a single subject fit with a very unusual optimum. Set
`ebRobust = FALSE`, or adjust `ebOutlierQuantiles`, `ebOutlierMAD`, and
`ebWinsorize`, if you want different behavior.

For optimized fits, the first pass uses `optimcontrol$estonly = TRUE`, because
only point estimates are needed to build the EB prior. The stochastic optimizer
is disabled by default for both passes; pass `optimcontrol =
list(stochastic = TRUE)` if you want to override this.

```{r eb-fit}
cores=2

eb_fit <- ctEmpiricalBayesFit(
  datalong = dat,
  model = fit_model,
  priors = TRUE,
  optimize = TRUE,
  cores = cores,
  Npasses = 2
)

eb_summary <- summary(eb_fit, use = "rawest", sdscale = "unit")

eb_summary$initialpopmeans
eb_summary$outliers$initial
eb_summary$popmeans
eb_summary$correlations$final
```

The summary is on the transformed parameter scale, matching the scale users
usually inspect in `summary(ctFit(...))`. The EB covariance is summarized for
inspection, but it is not encoded in the adjusted model. `eb_fit$adjustedmodel`
is the random-effect-free single-subject model used for the final pass, with
transforms based on marginal EB means and standard deviations only. The full
subject fits and raw estimate matrices remain on `eb_fit`; `summary(eb_fit)`
only returns compact summaries.

```{r eb-adjusted-model}
eb_fit$adjustedmodel$pars[
  eb_fit$adjustedmodel$pars$param %in% c("t0m", "drift", "diffusion", "mmean"),
  c("matrix", "param", "transform", "indvarying", "sdscale")]
```

## Standard random-effects fit

For comparison, fit the same model as a conventional hierarchical
random-effects model. This fit is independent of the empirical Bayes fits.

```{r re-fit}
re_fit <- ctFit(
  datalong = dat,
  model = fit_model,
  priors = TRUE,
  cores = cores
)
```

## Compare subject-level recovery

The empirical Bayes subject estimates below are separate per-subject estimates
from the second pass, using the empirical prior learned from the first pass.
The random-effects estimates are posterior subject parameters from the
hierarchical fit.

```{r comparison-helpers}
extract_subject_point <- function(fit){
  cp <- ctSummaryMatrices(fit)
  c(
    t0m = cp$T0MEANS["eta", 1],
    drift = cp$DRIFT["eta", "eta"],
    diffusion = cp$DIFFUSION["eta", "eta"],
    mmean = cp$MANIFESTMEANS["Y", 1]
  )
}

initial_subject <- do.call(rbind, lapply(eb_fit$initialfits,
  extract_subject_point))
eb_subject <- do.call(rbind, lapply(eb_fit$fits, extract_subject_point))

re_subject <- ctSubjectPars(re_fit, pointest = TRUE)[1, ,
  c("t0m", "drift", "diffusion", "mmean")]

truth_mat <- as.matrix(truth[, c("t0m", "drift", "diffusion", "mmean")])

recovery_summary <- function(est, truth){
  data.frame(
    param = colnames(truth),
    correlation = diag(stats::cor(est, truth)),
    rmse = sqrt(colMeans((est - truth)^2)),
    estimate_sd = apply(est, 2, sd),
    true_sd = apply(truth, 2, sd),
    row.names = NULL
  )
}
```

## Initial and final EB subject recovery

The first EB pass uses the model prior separately for each subject. The final
EB pass uses the empirical marginal raw prior estimated from the preceding
pass. This comparison isolates the effect of the EB prior update before adding
the random-effects model to the comparison.

```{r initial-final-eb-recovery}
eb_pass_recovery <- rbind(
  cbind(method = "Initial model-prior fits",
    recovery_summary(initial_subject, truth_mat)),
  cbind(method = "Final EB-prior fits",
    recovery_summary(eb_subject, truth_mat))
)

knitr::kable(eb_pass_recovery, digits = 3)
```

```{r initial-final-eb-plot}
eb_pass_plot_data <- rbind(
  data.frame(method = "Initial model-prior fits", id = truth$id,
    param = rep(colnames(truth_mat), each = n_subjects),
    true = as.vector(truth_mat),
    estimate = as.vector(initial_subject)),
  data.frame(method = "Final EB-prior fits", id = truth$id,
    param = rep(colnames(truth_mat), each = n_subjects),
    true = as.vector(truth_mat),
    estimate = as.vector(eb_subject))
)

ggplot(eb_pass_plot_data, aes(true, estimate, colour = method)) +
  geom_abline(slope = 1, intercept = 0, linewidth = .3) +
  geom_point(alpha = .55, size = 1.4) +
  facet_wrap(~ param, scales = "free") +
  labs(x = "Generating value", y = "Estimated subject value", colour = NULL)
```

## Correlation Recovery

There are two distinct correlation targets. The population raw random-effects
correlation is on the pre-transformation scale, so it should be compared with
the generating raw parameter correlations and with raw estimates. Subject-level
parameter correlations are on the actual transformed parameter scale, so they
should be compared with correlations among the generated `truth` values.

```{r correlation-recovery}
lower_cor_table <- function(reference, estimates){
  lower <- lower.tri(reference)
  pair_index <- which(lower, arr.ind = TRUE)
  out <- data.frame(
    pair = paste(rownames(reference)[pair_index[, 1]],
      colnames(reference)[pair_index[, 2]], sep = "__"),
    truth = as.vector(reference[lower]),
    check.names = FALSE)
  for(nm in names(estimates)){
    out[[nm]] <- as.vector(estimates[[nm]][lower])
  }
  out
}

true_raw_cor <- stats::cor(as.matrix(raw_truth[, raw_param_names]))
initial_eb_raw_cor <- stats::cor(eb_fit$initialraw[, raw_param_names])
final_eb_raw_cor <- stats::cor(
  eb_fit$passoriginalraw[[length(eb_fit$passoriginalraw)]][, raw_param_names])

re_raw_names <- ctsem:::getparnames(re_fit, reonly = TRUE)
re_rawpopcorr <- re_fit$stanfit$transformedparsfull$rawpopcorr[1, , ]
dimnames(re_rawpopcorr) <- list(re_raw_names, re_raw_names)
re_rawpopcorr <- re_rawpopcorr[raw_param_names, raw_param_names]

raw_cor_recovery <- lower_cor_table(true_raw_cor, list(
  "Initial EB raw estimates" = initial_eb_raw_cor,
  "Final EB raw estimates" = final_eb_raw_cor,
  "Random-effects population raw" = re_rawpopcorr
))

knitr::kable(raw_cor_recovery, digits = 3)
```

```{r subject-correlation-recovery}
true_subject_cor <- stats::cor(truth_mat)
initial_subject_cor <- stats::cor(initial_subject)
eb_subject_cor <- stats::cor(eb_subject)
re_subject_cor <- stats::cor(re_subject)

subject_cor_recovery <- lower_cor_table(true_subject_cor, list(
  "Initial EB subject values" = initial_subject_cor,
  "Final EB subject values" = eb_subject_cor,
  "Random-effects subject values" = re_subject_cor
))

knitr::kable(subject_cor_recovery, digits = 3)
```

## Final EB and random-effects subject recovery

This comparison uses the final EB subject fits and the hierarchical
random-effects subject estimates on the transformed parameter scale.

```{r final-eb-random-effects-recovery}
recovery <- rbind(
  cbind(method = "EB subject fits", recovery_summary(eb_subject, truth_mat)),
  cbind(method = "Random effects", recovery_summary(re_subject, truth_mat))
)

knitr::kable(recovery, digits = 3)
```

Correlation and RMSE answer different questions. Correlation is mostly about
whether subjects are ranked correctly, while RMSE also penalizes bias and
miscalibrated spread. The EB fits can therefore show better correlation if the
empirical prior stabilizes noisy individual estimates, but worse RMSE if the
subject estimates are shifted or overshrunk relative to the generating values.
The random-effects fit can have lower RMSE because it estimates the population
distribution and covariance jointly with the subject deviations.

```{r comparison-plot}
plot_data <- rbind(
  data.frame(method = "EB subject fits", id = truth$id,
    param = rep(colnames(truth_mat), each = n_subjects),
    true = as.vector(truth_mat),
    estimate = as.vector(eb_subject)),
  data.frame(method = "Random effects", id = truth$id,
    param = rep(colnames(truth_mat), each = n_subjects),
    true = as.vector(truth_mat),
    estimate = as.vector(re_subject))
)

ggplot(plot_data, aes(true, estimate, colour = method)) +
  geom_abline(slope = 1, intercept = 0, linewidth = .3) +
  geom_point(alpha = .55, size = 1.4) +
  facet_wrap(~ param, scales = "free") +
  labs(x = "Generating value", y = "Estimated subject value", colour = NULL)
```