Introduction to BayesRTMB

What is BayesRTMB?

BayesRTMB is a package for specifying and estimating statistical models in R, using RTMB as the computational engine.

In BayesRTMB, standard analyses can start from wrapper functions such as rtmb_lm() and rtmb_glmer(). When needed, you can also use rtmb_code() to write custom models with a Stan-like feel.

The basic idea of BayesRTMB is that MCMC, MAP estimation, variational inference, and frequentist analysis can all be called from a single model object.

fit_mcmc <- mdl$sample()
fit_map <- mdl$optimize()
fit_vb <- mdl$variational()
fit_freq <- mdl$classic()

This page introduces where BayesRTMB fits and the overall picture that is useful to know when first using the package. More detailed model-writing rules and individual analysis examples are covered in the other vignettes.

What BayesRTMB Can Do

The main features of BayesRTMB are as follows.

Two Entry Points

BayesRTMB has two broad entry points.

The first is a set of wrapper functions for standard analyses. Usually, this is the easiest place to start.

data(debate)

mdl <- rtmb_lm(sat ~ talk + perf, data = debate)
fit <- mdl$optimize()
fit

The second is rtmb_code(), which is used when you want to write the model yourself. This is useful when a model is difficult to express with wrappers or when you want to try a new model.

dat <- list(
  Y = debate$sat,
  X = data.matrix(debate[c("talk", "perf")])
)

code <- rtmb_code(
  setup = {
    N <- length(Y)
    P <- ncol(X)
  },
  parameters = {
    alpha = Dim()
    beta = Dim(P)
    sigma = Dim(lower = 0)
  },
  transform = {
    mu <- alpha + X %*% beta
  },
  model = {
    Y ~ normal(mu, sigma)
    alpha ~ normal(0, 10)
    beta ~ normal(0, 10)
    sigma ~ exponential(1)
  }
)

mdl <- rtmb_model(dat, code)

In the model block, you can use sampling syntax like this.

Y ~ normal(mu, sigma)
beta ~ normal(0, 10)
sigma ~ exponential(1)

Internally, this is converted into code that adds log densities.

Starting from Wrapper Functions

First, let us look at an example using a wrapper function. Here we use the debate data and consider a linear regression model in which satisfaction sat is explained by talk and perf.

data(debate)

mdl_lm <- rtmb_lm(
  sat ~ talk + perf,
  data = debate
)

At this point, estimation has not yet been run. mdl_lm is an RTMB_Model object that stores the model definition and data.

To run MAP estimation, use optimize().

fit_map <- mdl_lm$optimize()
fit_map
## Starting RTMB optimization...
## 
## 
## Call:
## MAP Estimation via RTMB
## 
## Negative Log-Posterior: 395.58
## Approx. Log Marginal Likelihood (Laplace): -404.61
## 
## Point Estimates and 95% Wald CI:
##  variable  Estimate  Std. Error  Lower 95%  Upper 95% 
## Intercept   1.83366     0.21105    1.42000    2.24732 
## b[talk]     0.28694     0.05291    0.18323    0.39064 
## b[perf]     0.15632     0.02987    0.09777    0.21486 
## sigma       0.90453     0.03693    0.83497    0.97988 

If you want to estimate the model as a frequentist linear regression, use classic(). classic() is available for wrapper models constructed with a flat prior.

fit_freq <- mdl_lm$classic()
fit_freq
## Starting RTMB optimization...
## 
## 
## Call:
## Classical estimation via lm 
## 
## Log-Likelihood: -402.220, AIC: 812.440, BIC: 827.255
## 
## Point Estimates and Confidence Intervals:
##           Estimate Std. Error Lower 95% Upper 95%  df  t value     Pr    
## Intercept  1.83366    0.21212   1.41621   2.25110 297  8.64454 <.0001 ***
## b[talk]    0.28694    0.05318   0.18229   0.39159 297  5.39580 <.0001 ***
## b[perf]    0.15632    0.03002   0.09724   0.21539 297  5.20703 <.0001 ***
## sigma      0.90909    0.03730   0.83856   0.98554 297     NA

You can inspect the model code generated by a wrapper function with print_code().

mdl_lm$print_code()
## === RTMB Model Code ===
## 
## rtmb_code(
##   setup = {
##     mf <- model.frame(formula, df)
##     Y <- model.response(mf)
##     X_full <- model.matrix(formula, mf)
##     X <- X_full[, colnames(X_full) != "(Intercept)", drop = FALSE]
##     N <- length(Y)
##     K <- ncol(X)
##   }, 
##   parameters = {
##     Intercept <- Dim(1)
##     b <- Dim(K)
##     sigma <- Dim(num_sigma_groups, lower = 0)
##   }, 
##   model = {
##     # Transform
##     eta <- Intercept + X %*% b
##     # Likelihood (Data)
##     Y ~ normal(eta, sigma)
##     # Priors
##   }
## )

print_code() displays blocks such as setup, parameters, transform, and model so that users can reproduce the same model with rtmb_model().

Writing Your Own Model

When you want to write a more flexible model, use rtmb_code().

The main blocks of rtmb_code() are as follows.

For example, a regression model with simple normal and exponential priors close to prior_normal() can be written as follows.

dat <- list(
  Y = as.numeric(debate$sat),
  X = data.matrix(debate[c("talk", "perf", "skill")])
)

code_reg <- rtmb_code(
  setup = {
    N <- length(Y)
    P <- ncol(X)
  },
  parameters = {
    alpha = Dim()
    beta = Dim(P)
    sigma = Dim(lower = 0)
  },
  transform = {
    mu <- alpha + X %*% beta
  },
  model = {
    Y ~ normal(mu, sigma)
    alpha ~ normal(0, 10)
    beta ~ normal(0, 2.5)
    sigma ~ exponential(1)
  },
  generate = {
    Y_rep <- rnorm(length(Y), mu, sigma)
  }
)

mdl_reg <- rtmb_model(dat, code_reg)

Writing preprocessing in setup makes the parameters and model blocks easier to read. It also makes the processing needed to reproduce the model visible when print_code() is displayed.

Choosing an Inference Method

BayesRTMB is built primarily around Bayesian estimation. For that reason, this section presents the methods in the order MCMC, MAP estimation, variational inference, and frequentist analysis.

Method Main purpose
$sample() Run MCMC sampling with NUTS
$optimize() Run fast MAP estimation or maximum likelihood estimation
$variational() Approximate the posterior distribution with ADVI
$classic() Treat a flat-prior wrapper model as a frequentist analysis

MCMC

sample() runs MCMC sampling with NUTS. Use it when you want to inspect the full posterior distribution or when MAP estimation alone is not enough to represent uncertainty.

set.seed(1)

fit_mcmc <- mdl_lm$sample()

fit_mcmc$summary()
##  variable     mean    sd      map     q2.5    q97.5  ess_bulk  ess_tail  rhat 
## lp         -397.79  1.54  -396.75  -401.63  -395.94       859      1236  1.01
## Intercept     1.82  0.22     1.78     1.38     2.26       629      1023  1.01
## b[talk]       0.29  0.05     0.28     0.18     0.40       665      1205  1.00
## b[perf]       0.16  0.03     0.15     0.10     0.22       671       990  1.01
## sigma         0.91  0.04     0.91     0.84     1.00       934      1254  1.01

MAP Estimation

optimize() performs point estimation by maximizing the posterior distribution or likelihood. It is useful for checking a model and for initial exploration of complex models.

fit_map <- mdl_lm$optimize()
fit_map$summary()
## Starting RTMB optimization...
## 
## 
## Call:
## MAP Estimation via RTMB
## 
## Negative Log-Posterior: 395.58
## Approx. Log Marginal Likelihood (Laplace): -404.61
## 
## Point Estimates and 95% Wald CI:
##  variable  Estimate  Std. Error  Lower 95%  Upper 95% 
## Intercept   1.83366     0.21105    1.42000    2.24732 
## b[talk]     0.28694     0.05291    0.18323    0.39064 
## b[perf]     0.15632     0.02987    0.09777    0.21486 
## sigma       0.90453     0.03693    0.83497    0.97988 

With se_method = "sampling", standard errors and intervals for generated quantities can be calculated by propagating uncertainty through approximate normal samples.

fit_map <- mdl_lm$optimize(se_method = "sampling")

Variational Inference

variational() approximates the posterior distribution using ADVI. It is useful when you want rough results faster than MCMC. However, it can underestimate posterior uncertainty, so caution is needed when using it for final uncertainty assessment.

fit_vb <- mdl_lm$variational(
  method = "meanfield",
  iter = 3000,
  num_estimate = 4
)

Frequentist Analysis

classic() returns frequentist estimation results for flat-prior models constructed by wrapper functions.

fit_freq <- mdl_lm$classic()
fit_freq$summary()
## Starting RTMB optimization...
## 
## 
## Call:
## Classical estimation via lm 
## 
## Log-Likelihood: -402.220, AIC: 812.440, BIC: 827.255
## 
## Point Estimates and Confidence Intervals:
##           Estimate Std. Error Lower 95% Upper 95%  df  t value     Pr    
## Intercept  1.83366    0.21212   1.41621   2.25110 297  8.64454 <.0001 ***
## b[talk]    0.28694    0.05318   0.18229   0.39159 297  5.39580 <.0001 ***
## b[perf]    0.15632    0.03002   0.09724   0.21539 297  5.20703 <.0001 ***
## sigma      0.90909    0.03730   0.83856   0.98554 297     NA

For lm- and glm-type models, test statistics are displayed in a form close to the usual notation. For mixed models, degrees-of-freedom corrections and Laplace approximation are used when needed.

Random Effects and Laplace Approximation

In hierarchical and mixed models, parameters can be declared with random = TRUE.

Y <- debate$sat
group <- as.integer(factor(debate$group))
G <- length(unique(group))

dat_icc <- list(Y = Y, group = group, G = G)

code_icc <- rtmb_code(
  parameters = {
    mu = Dim()
    sigma = Dim(lower = 0)
    tau = Dim(lower = 0)
    r = Dim(G, random = TRUE)
  },
  model = {
    Y ~ normal(mu + r[group] * tau, sigma)
    r ~ normal(0, 1)
    tau ~ exponential(1)
    sigma ~ exponential(1)
  },
  generate = {
    icc <- tau^2 / (tau^2 + sigma^2)
  }
)

mdl_icc <- rtmb_model(dat_icc, code_icc)
fit_icc <- mdl_icc$optimize(laplace = TRUE)

In MAP estimation, laplace = TRUE is the default. Parameters declared with random = TRUE are marginalized by Laplace approximation.

With wrapper functions, a mixed model can be written as follows.

mdl_hlm <- rtmb_glmer(
  sat ~ talk + perf + (1 | group),
  data = debate,
  family = "gaussian"
)

fit_hlm <- mdl_hlm$optimize()

Model Comparison (Bridge Sampling / WAIC)

Marginal Likelihood and Bayes Factor by Bridge Sampling

For MCMC results, bridge sampling can be used to compute the marginal likelihood.

set.seed(1)

fit_mcmc$bridgesampling()

For example, the output is displayed as follows.

## Bridge Sampling Converged: LogML = -404.591 (Error = 0.0032, ESS = 604.0)

The returned value is a numeric log marginal likelihood with error and ess attributes.

Use bayes_factor() when you want to compute a Bayes factor.

bf <- fit_mcmc$bayes_factor(fixed = list("b[talk]" = 0))
bf
## --- Bayes Factor Analysis (Bridge Sampling) ---
## Bayes Factor (BF12) : 142963.4
## Log Bayes Factor    : 11.8703 (Approx. Error = 0.0040)
## Evidence            : Decisive evidence for Model 1 
## Comparison model    : Parameters fixed at list("b[talk]" = 0) 

Model Comparison by WAIC

In BayesRTMB, if you store the pointwise log likelihood for each observation in a variable named log_lik within the generate block of rtmb_code(), you can calculate the WAIC (Widely Applicable Information Criterion) using the $WAIC() method after estimation.

For example, define log_lik in the generate block of a custom model as follows:

  generate = {
    # Compute the pointwise log likelihood and assign it to log_lik (specify sum = FALSE)
    log_lik <- normal_lpdf(Y, mu, sigma, sum = FALSE)
  }

Note: When using wrapper functions (e.g., rtmb_lm()), you can specify WAIC = TRUE in the arguments to automatically generate log_lik internally.

mdl_lm <- rtmb_lm(sat ~ talk + perf, data = debate, WAIC = TRUE)
fit_mcmc <- mdl_lm$sample()

# Compute WAIC
fit_mcmc$WAIC()

The output will be displayed as follows:

## WAIC
##   WAIC (elpd scale): -400.07398
##   -2 WAIC: 800.14796
##   p_waic: 4.70921
##   lppd: -395.36477
##   nobs: 300, draws: 4000

The $WAIC() method is available for results from MCMC (MCMC_Fit), variational inference (VB_Fit), and MAP estimation (MAP_Fit) with se_method = "sampling".

Next Steps

Once you have a general picture of BayesRTMB, the next step is to read the articles that match your purpose.

  1. Quick Start
    Walk through a basic model and learn the flow of rtmb_code() and rtmb_model().

  2. Wrapper Functions
    Learn how to run standard analyses such as regression, generalized linear models, mixed models, and t tests with wrapper functions.

  3. Hierarchical Models and GLMMs
    Learn how to use rtmb_glmer() for hierarchical models, GLMMs, residual correlation, ordinal models, priors, and visualization.

  4. Writing Model Codes
    Learn how to write your own models using the setup, parameters, transform, model, and generate blocks.

  5. RTMB Internals and Inference Algorithms
    Learn about internal processing such as Laplace approximation, constrained parameters, MCMC, and variational inference.