Hierarchical Models and GLMMs with rtmb_glmer()

Purpose

This article explains how to use rtmb_glmer() for hierarchical models and generalized linear mixed models in BayesRTMB.

rtmb_glmer() is the main wrapper for models that combine:

The function is designed to feel familiar to users of lme4::glmer(), while returning a BayesRTMB RTMB_Model object. From the same model object, you can call:

fit_mcmc <- mdl$sample()
fit_map  <- mdl$optimize()
fit_vb   <- mdl$variational()
fit_cl   <- mdl$classic()

BayesRTMB’s functional priority is MCMC, MAP, VB, and then classical estimation. For final Bayesian inference, use MCMC whenever feasible. MAP is useful for fast checks and Laplace approximation. VB is useful for approximate inference in larger models. Classical estimation is useful for flat-prior wrapper models when you want familiar frequentist summaries.

1. A Minimal Random-Intercept Model

The basic syntax is close to lme4.

library(BayesRTMB)
data(debate)

mdl <- rtmb_glmer(
  sat ~ talk + perf + (1 | group),
  data = debate,
  family = "gaussian",
  prior = prior_normal()
)

This code has:

At this point, the model has been created but not fitted.

fit_map <- mdl$optimize()
fit_map
## Starting RTMB optimization...
## 
## 
## Call:
## MAP Estimation via RTMB
## 
## Negative Log-Posterior: 403.89
## Approx. Log Marginal Likelihood (Laplace): -412.12
## Note: Random effects are stored in $random_effects
## 
## Point Estimates and 95% Wald CI:
##    variable  Estimate  Std. Error  Lower 95%  Upper 95% 
## Intercept     2.11240     0.25310    1.61650    2.60820 
## b[talk]       0.26980     0.05520    0.16160    0.37800 
## b[perf]       0.14850     0.03120    0.08730    0.20960 
## sigma         0.77090     0.03910    0.69780    0.85150 
## sd[Int]       0.51020     0.06780    0.39310    0.66210 

Random effects are stored separately in fit_map$random_effects.

2. Choosing an Inference Method

MCMC

Use sample() when you want the full posterior distribution.

fit_mcmc <- mdl$sample(
  sampling = 1000,
  warmup = 1000,
  chains = 4
)

fit_mcmc$summary()
## variable     mean    sd      map     q2.5    q97.5  ess_bulk  ess_tail  rhat
## lp        -407.11  2.34  -405.02  -412.54  -403.91       812       963  1.00
## b[talk]      0.27  0.06     0.27     0.16     0.38      1210      1314  1.00
## b[perf]      0.15  0.03     0.15     0.09     0.21      1084      1172  1.00
## sigma        0.78  0.04     0.77     0.70     0.86       940      1021  1.00
## sd[Int]      0.53  0.08     0.51     0.39     0.70       731       884  1.01

MCMC is the safest default for final Bayesian interpretation, especially when comparing models with bridge sampling.

MAP with Laplace Approximation

For hierarchical models, optimize() uses Laplace approximation by default.

fit_map <- mdl$optimize(laplace = TRUE)

Parameters declared internally as random effects are integrated by Laplace approximation. This is often much faster than sampling all random effects directly.

Variational Inference

variational() gives an approximate posterior.

fit_vb <- mdl$variational(
  method = "meanfield",
  iter = 5000
)

VB is useful for initial exploration or larger models, but it may underestimate uncertainty.

Classical Estimation

For supported flat-prior wrapper models, classic() gives frequentist-style summaries.

mdl_flat <- rtmb_glmer(
  sat ~ talk + perf + (1 | group),
  data = debate,
  family = "gaussian",
  prior = prior_flat()
)

fit_cl <- mdl_flat$classic()

3. Formula Syntax and Visualization

Random Intercepts

rtmb_glmer(y ~ x + (1 | id), data = dat, family = "gaussian")

This allows each id to have its own intercept.

Random Slopes

rtmb_glmer(y ~ x + (x | id), data = dat, family = "gaussian")

This allows both the intercept and the effect of x to vary by id.

Multiple Grouping Factors

rtmb_glmer(y ~ x + (1 | subject) + (1 | item),
  data = dat,
  family = "bernoulli"
)

This is useful for crossed or partially crossed designs.

Conditional Effects

After fitting regression, GLM, or GLMM models, use conditional_effects() to inspect fitted relationships.

mdl_int <- rtmb_glmer(
  sat ~ talk * perf + (1 | group),
  data = debate,
  family = "gaussian",
  prior = prior_normal()
)

fit_int <- mdl_int$sample()

ce <- conditional_effects(fit_int, effect = "talk:perf")
plot(ce)
## Conditional effects for: talk:perf
## focal variable: talk
## moderator: perf
## moderator values: mean - 1 SD, mean, mean + 1 SD

For a table of simple slopes or pairwise contrasts, use simple_effects().

simple_effects(fit_int, effect = "talk:perf")
## --- Simple Effects Analysis ---
## moderator perf          term estimate  lower upper
## perf       2.930 Slope of talk    0.038 -0.118 0.191
## perf       4.690 Slope of talk    0.266  0.161 0.369
## perf       6.450 Slope of talk    0.494  0.354 0.631

Posterior distributions and intervals can also be visualized with plot_dens(), plot_trace(), plot_acf(), and plot_forest().

4. Families

rtmb_glmer() supports several likelihood families.

family Link Typical use
"gaussian" identity Continuous outcomes
"lognormal" log Positive skewed outcomes
"student_t" identity Continuous outcomes with outliers
"gamma" log Positive skewed outcomes
"bernoulli" logit Binary 0/1 outcomes
"binomial" logit Success/trial outcomes
"poisson" log Counts
"neg_binomial" log Overdispersed counts
"ordered" logit Ordinal categorical outcomes
"sequential" logit Sequential ordinal/category processes

Example:

mdl_bin <- rtmb_glmer(
  y ~ x + (1 | id),
  data = dat,
  family = "bernoulli",
  prior = prior_normal()
)

5. Data Handling

Wide Data Can Be Converted Internally

Some repeated-measures data are stored in wide format.

mdl_wide <- rtmb_glmer(
  cbind(y_t1, y_t2, y_t3) ~ cond + (1 | id),
  data = dat_wide,
  family = "gaussian",
  within = list(time = c("t1", "t2", "t3")),
  prior = prior_normal()
)

The wrapper can reshape the outcome internally into a long-format representation when appropriate. This keeps the user-facing formula compact while still creating the long-format model structure used internally.

Converting Variables to Factors

Use factors = when variables should be treated as categorical even if they are stored as numeric or character variables.

mdl_fac <- rtmb_glmer(
  y ~ cond * time + (1 | id),
  data = dat,
  family = "gaussian",
  factors = c("cond", "time"),
  prior = prior_normal()
)

Internally, those variables are converted to factors before constructing design matrices.

Centering Within Cluster

Cluster-mean centering and centering within cluster can be specified through wrapper options.

mdl_cwc <- rtmb_glmer(
  y ~ x + x_cwc + (1 | group),
  data = dat,
  family = "gaussian",
  cwc = list(cluster = "group", pars = "x"),
  prior = prior_normal()
)

This is useful when you want to separate within-cluster and between-cluster effects.

6. Priors

prior_flat()

prior_flat() removes prior density contributions and is used when you want a maximum-likelihood or classical interpretation.

mdl_flat <- rtmb_glmer(
  y ~ x + (1 | id),
  data = dat,
  family = "gaussian",
  prior = prior_flat()
)

Use classic() with flat-prior wrapper models when a frequentist-style summary is desired.

prior_normal()

prior_normal() is the most beginner-friendly Bayesian default. It uses normal priors for location and regression parameters and positive priors such as exponential priors for scale parameters.

mdl_norm <- rtmb_glmer(
  y ~ x + (1 | id),
  data = dat,
  family = "gaussian",
  prior = prior_normal()
)

This is a good starting point when you want a simple Bayesian GLMM.

prior_weak()

prior_weak() builds weakly informative priors using information such as the outcome range.

mdl_weak <- rtmb_glmer(
  y ~ x + (1 | id),
  data = dat,
  family = "gaussian",
  prior = prior_weak(y_range = c(1, 5))
)

The basic idea is:

When using bridge sampling or Bayes factors, prior choice matters strongly. prior_weak() is often a safer starting point than an overly diffuse prior because it encodes a broad but still plausible scale for the model.

Regularized Priors

Regularized priors such as horseshoe-like priors or spike-and-slab priors are useful when many predictors are included.

mdl_rhs <- rtmb_glmer(
  y ~ x1 + x2 + x3 + (1 | id),
  data = dat,
  family = "gaussian",
  prior = prior_rhs(y_range = c(1, 5))
)

MCMC is often more reliable than MAP for strongly regularized models.

JZS Prior

The JZS prior is mainly useful for Bayes-factor style testing, especially in t-test style models where the standardized effect size is a central parameter.

mdl_jzs <- rtmb_ttest(
  y ~ group,
  data = dat,
  prior = prior_jzs()
)

For GLMMs, prior_normal() or prior_weak() is usually easier to interpret. Use JZS priors when the model and hypothesis are explicitly designed for JZS-style Bayes-factor analysis.

7. Residual Correlation

rtmb_glmer() can represent residual correlation structures such as AR(1).

mdl_ar1 <- rtmb_glmer(
  y ~ time + cond,
  data = dat,
  family = "gaussian",
  resid_corr = "ar1",
  resid_time = "time",
  resid_group = "id",
  prior = prior_normal()
)

Here, resid_time defines the ordering or lag variable within a residual-correlation block, and resid_group defines the independent blocks.

If the model has no random effect term such as (1 | id), you still need resid_group = "id" so that the AR(1) correlation is applied within each participant rather than across the whole data set.

If a random effect grouping factor is already present and resid_group is omitted, BayesRTMB may use the first grouping factor internally. However, it is usually clearer to specify resid_group explicitly.

Avoid adding (1 | id) reflexively when the purpose is only to model AR(1) residual dependence. A random intercept and a residual correlation structure can both be valid, but they answer different questions and can become redundant in simple repeated-measures settings.

8. ANOVA-Style Classical Workflows

When you want an ordinary analysis-of-variance workflow, use rtmb_lmer() or rtmb_glmer() with prior_flat() and then call classic().

mdl_aov <- rtmb_lmer(
  y ~ cond * time + (1 | id),
  data = dat,
  prior = prior_flat()
)

fit_aov <- mdl_aov$classic()
anova(fit_aov)
## Analysis of Variance Table
##           Df  Chisq Pr(>Chisq)
## cond       1  5.812     0.0159
## time       2 12.406     0.0020
## cond:time  2  6.134     0.0465

Estimated marginal means can be obtained with lsmeans().

emm <- lsmeans(fit_aov, specs = "cond")
emm
plot(emm)
##  cond estimate    SE lower.CL upper.CL
##  0       3.21 0.082     3.05     3.37
##  1       3.48 0.084     3.32     3.65

Use this route when your goal is close to conventional ANOVA, but you still want the model to be represented inside the BayesRTMB wrapper system.

9. Inspecting Generated Code

Wrappers are convenient, but they still create rtmb_code() internally.

mdl$print_code()

Use print_code() when you want to understand the exact likelihood, priors, random-effect declarations, transformed quantities, or generated quantities created by a wrapper.

10. Model Comparison

For Bayesian model comparison, fit with MCMC and use bridge sampling.

fit_full <- mdl$sample()
fit_full$bridgesampling()
## Bridge Sampling Converged: LogML = -412.36 (Error = 0.0184, ESS = 91.2)

Bayes factors can be calculated by comparing against models with parameters fixed.

bf <- fit_full$bayes_factor(fixed = list("b[talk]" = 0))
bf
## --- Bayes Factor Analysis (Bridge Sampling) ---
## Bayes Factor (BF12) : 18.42
## Log Bayes Factor    : 2.913
## Evidence            : Strong evidence for Model 1

Because Bayes factors depend strongly on priors, use carefully designed priors. prior_weak() is often a reasonable starting point when you want weak but not implausibly diffuse priors.