The bayesDiagnostics package provides a
comprehensive suite of diagnostic tools for Bayesian models fitted with
brms, rstan, and other popular Bayesian
modeling packages. This vignette demonstrates all major functions
organized by diagnostic category.
library(bayesDiagnostics)
library(brms)
#> Loading required package: Rcpp
#> Loading 'brms' package (version 2.23.0). Useful instructions
#> can be found by typing help('brms'). A more detailed introduction
#> to the package is available through vignette('brms_overview').
#>
#> Attaching package: 'brms'
#> The following object is masked from 'package:stats':
#>
#> ar
library(ggplot2)

The package contains 18 main functions organized into 5 categories: convergence diagnostics, posterior predictive checks (PPC), prior diagnostics, model comparison, and utilities.
The mcmc_diagnostics_summary() function provides a
comprehensive overview of MCMC convergence, including R-hat, ESS, and
NUTS-specific diagnostics.
# Fit a simple model
data <- data.frame(
y = rnorm(100, mean = 5, sd = 2),
x1 = rnorm(100),
x2 = rnorm(100)
)
fit <- brm(y ~ x1 + x2, data = data, chains = 4, iter = 2000, refresh = 0)
# Run comprehensive diagnostics
diag <- mcmc_diagnostics_summary(fit, rhat_threshold = 1.01, ess_threshold = 400)
print(diag)
# Check results
diag$converged # TRUE/FALSE
diag$summary_table # Full diagnostic table
diag$divergences # Number of divergent transitions

What it checks:

- R-hat values (should be < 1.01)
- Bulk ESS and Tail ESS (should be > 400)
- Divergent transitions (should be 0)
- Tree depth saturation (NUTS-specific)
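For reference, the same quantities can be computed directly with the posterior package and the NUTS helpers that brms exposes. This is a minimal sketch outside of bayesDiagnostics, shown only to make the criteria concrete:

# Manual check with posterior + brms (reference only; not part of bayesDiagnostics)
library(posterior)
summ <- summarise_draws(as_draws_df(fit), "rhat", "ess_bulk", "ess_tail")
all(summ$rhat < 1.01, na.rm = TRUE)            # R-hat criterion
all(summ$ess_bulk > 400, na.rm = TRUE)         # bulk ESS criterion
np <- nuts_params(fit)                         # NUTS sampler diagnostics
sum(np$Value[np$Parameter == "divergent__"])   # number of divergent transitions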
The effective_sample_size_diagnostics() function
provides detailed ESS analysis with visual diagnostics.
# Detailed ESS analysis
ess_diag <- effective_sample_size_diagnostics(
model = fit,
parameters = c("b_x1", "b_x2", "sigma"),
min_ess = 400,
by_chain = TRUE,
plot = TRUE
)
print(ess_diag)
plot(ess_diag)
# Access specific components
ess_diag$ess_summary # Summary statistics
ess_diag$bulk_ess # Bulk ESS per parameter
ess_diag$tail_ess # Tail ESS per parameter
ess_diag$by_chain_ess # Per-chain ESS breakdown
ess_diag$problematic_params # Parameters with low ESS
ess_diag$recommendations # Actionable suggestions

Interpretation:

- Bulk ESS: Measures precision of posterior mean/median estimates
- Tail ESS: Measures precision of credible interval bounds
- Per-chain ESS: Identifies which specific chains have issues
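As a point of reference, per-chain bulk ESS for a single parameter can be computed with the posterior package; this is an illustrative sketch, not the package's internal code:

# Per-chain bulk ESS for one parameter via posterior (illustration only)
library(posterior)
mat <- extract_variable_matrix(as_draws_array(fit), "b_x1")  # iterations x chains
apply(mat, 2, ess_bulk)   # bulk ESS chain by chain
ess_bulk(mat)             # combined bulk ESS across chains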
For hierarchical/multilevel models, use
hierarchical_convergence() to check group-level and
population-level parameters separately.
# Fit hierarchical model
data_hier <- data.frame(
y = rnorm(200),
x = rnorm(200),
group = rep(1:10, each = 20)
)
fit_hier <- brm(
y ~ x + (1 + x | group),
data = data_hier,
chains = 4,
refresh = 0
)
# Check hierarchical convergence
hier_conv <- hierarchical_convergence(
model = fit_hier,
group_vars = "group",
plot = TRUE
)
print(hier_conv)
plot(hier_conv)

What it provides:

- Group-level vs. population-level convergence breakdown
- Variance component diagnostics
- Shrinkage assessment
- Visual comparison of group-level parameters
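The split between population-level and group-level parameters follows brms's naming conventions, which you can also inspect by hand. A small orientation sketch, independent of bayesDiagnostics:

# Group convergence summaries by brms parameter naming conventions (reference sketch)
library(posterior)
summ <- summarise_draws(as_draws_df(fit_hier), "rhat", "ess_bulk")
summ[grepl("^b_", summ$variable), ]         # population-level effects
summ[grepl("^sd_|^cor_", summ$variable), ]  # variance components and correlations
summ[grepl("^r_group", summ$variable), ]    # group-level deviations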
The posterior_predictive_check() function is the
workhorse for model validation.
# Basic PPC with multiple test statistics
ppc_result <- posterior_predictive_check(
model = fit,
observed_data = data$y,
n_samples = 1000,
test_statistics = c("mean", "sd", "median", "min", "max", "skewness", "kurtosis"),
plot = TRUE
)
print(ppc_result)
plot(ppc_result)
# Access results
ppc_result$observed_stats # Observed test statistics
ppc_result$replicated_stats # Posterior predictive statistics
ppc_result$p_values # Bayesian p-values (should be ~0.5)

Interpretation Guide:

- p-value ≈ 0.5: Good fit (model captures the statistic well)
- p-value < 0.05 or > 0.95: Model misspecification
- p-value near 0: Model underestimates the statistic
- p-value near 1: Model overestimates the statistic
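To make the p-value definition concrete, here is how a single Bayesian p-value can be computed by hand with posterior_predict(); this is a reference sketch, the package computes these for you:

# Bayesian p-value for the standard deviation, computed by hand (reference only)
yrep  <- posterior_predict(fit, ndraws = 1000)  # draws x observations
t_obs <- sd(data$y)
t_rep <- apply(yrep, 1, sd)                     # statistic for each replicated dataset
mean(t_rep >= t_obs)                            # P(T(yrep) >= T(y)); ~0.5 indicates good fit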
For quick diagnostic checks across multiple statistics, use
automated_ppc():
# Automated checks with flagging
auto_ppc <- automated_ppc(
model = fit,
observed_data = data$y,
n_samples = 1000,
p_value_threshold = 0.05
)
print(auto_ppc)
# Check for issues
auto_ppc$diagnostics # Table with all statistics and flags
auto_ppc$flags # Text warnings for problematic statistics

When to use:

- Quick model validation
- Screening for gross misspecification
- Automated reporting workflows
Create publication-quality PPC visualizations:
# Density overlay
p1 <- graphical_ppc(fit, data$y, type = "density", n_draws = 50)
print(p1)
# Prediction intervals
p2 <- graphical_ppc(fit, data$y, type = "intervals", n_draws = 50)
print(p2)
# Ribbon plot (useful for ordered/time-series data)
p3 <- graphical_ppc(fit, data$y, type = "ribbon", n_draws = 50)
print(p3)

The ppc_crossvalidation() function performs
Leave-One-Out Probability Integral Transform (LOO-PIT) checks:
# LOO-PIT check
loo_ppc <- ppc_crossvalidation(
model = fit,
observed_y = data$y,
n_draws = NULL # Use all draws for accuracy
)
print(loo_ppc)
plot(loo_ppc)
# Inspect results
loo_ppc$pit_values # Should be ~Uniform(0,1) if well-calibrated
loo_ppc$loo_result # LOO object
loo_ppc$weighted # Whether weighted PIT was used

Interpretation:

- Uniform PIT distribution: Model is well-calibrated
- U-shaped: Under-dispersed (too narrow) predictions
- Inverse-U (hump-shaped): Over-dispersed (too wide) predictions
- Skewed: Systematic bias in predictions
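The same check can be reproduced with the loo and bayesplot packages; the sketch below is shown only as a cross-reference for the underlying computation:

# LOO-PIT via loo + bayesplot (cross-reference; not part of bayesDiagnostics)
library(bayesplot)
yrep <- posterior_predict(fit)
psis <- loo(fit, save_psis = TRUE)$psis_object
ppc_loo_pit_qq(y = data$y, yrep = yrep, lw = weights(psis))  # QQ plot against Uniform(0,1)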
For custom test statistics, use the flexible
bayesian_p_values() utility:
# Generate posterior predictive samples
yrep <- posterior_predict(fit, ndraws = 1000)
# Define custom test statistic
custom_stat <- function(x) max(x) - min(x) # Range
# Calculate Bayesian p-value
result <- bayesian_p_values(
yrep = yrep,
y = data$y,
statistic = custom_stat
)
result$observed_value # Observed range
result$replicated_values # Posterior predictive ranges
result$p_value # P(replicated ≥ observed)

The prior_elicitation_helper() translates expert
knowledge into statistical priors:
# Define expert beliefs
expert_input <- list(
parameter_name = "treatment_effect",
plausible_range = c(-0.5, 1.5), # Plausible values
most_likely_value = 0.3, # Best guess
confidence = 0.8 # How confident (0-1)
)
# Get prior recommendations
prior_rec <- prior_elicitation_helper(
expert_beliefs = expert_input,
parameter_type = "continuous",
method = "quantile",
data_sample = rnorm(100, 0.3, 0.5), # Optional: existing data
visualize = TRUE
)
print(prior_rec)
# Use recommended prior
prior_rec$recommended_prior # brms::prior object
prior_rec$alternatives # Alternative priors for sensitivity
prior_rec$sensitivity_note # Guidance

Parameter types:

- "continuous": Normal, Student-t, log-normal
- "discrete": Poisson, negative binomial
- "proportion": Beta distribution
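For intuition, a quantile-based elicitation for a continuous parameter can be sketched as follows. This treats the plausible range as a central interval at the stated confidence and is only an illustration of the idea, not the package's exact algorithm:

# Conceptual sketch of quantile-based elicitation (illustration only)
lower <- -0.5; upper <- 1.5   # plausible range
mu    <- 0.3                  # best guess used as the prior mean
conf  <- 0.8                  # treat the range as a central 80% interval
sigma <- (upper - lower) / (2 * qnorm(0.5 + conf / 2))
brms::prior_string(sprintf("normal(%.2f, %.2f)", mu, sigma), class = "b")  # ~ normal(0.30, 0.78)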
The prior_sensitivity() function assesses robustness to
prior choice:
# Define alternative priors
prior_grid <- list(
weak = set_prior("normal(0, 10)", class = "b"),
moderate = set_prior("normal(0, 2)", class = "b"),
strong = set_prior("normal(0, 0.5)", class = "b")
)
# Run sensitivity analysis
sens_result <- prior_sensitivity(
model = fit,
parameters = c("b_x1", "b_x2"),
prior_grid = prior_grid,
comparison_metric = "KL", # or "Wasserstein", "overlap"
plot = TRUE,
n_draws = 2000
)
print(sens_result)
plot(sens_result)
# Check sensitivity
sens_result$sensitivity_metrics # How much posteriors differ

Metrics:

- KL divergence: Information-theoretic difference
- Wasserstein distance: L1 distance between distributions
- Overlap coefficient: Proportion of overlapping density

Interpretation:

- Low sensitivity: Conclusions robust to prior choice ✓
- High sensitivity: Data is weak; prior strongly influences results
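As an illustration of the overlap metric (not the package's internal implementation), an overlap coefficient between two posterior samples can be estimated from kernel density estimates:

# Overlap coefficient between two posterior samples (illustrative sketch)
overlap_coef <- function(draws_a, draws_b) {
  rng <- range(c(draws_a, draws_b))
  da  <- density(draws_a, from = rng[1], to = rng[2], n = 512)
  db  <- density(draws_b, from = rng[1], to = rng[2], n = 512)
  sum(pmin(da$y, db$y)) * diff(da$x[1:2])  # integrate the pointwise minimum density
}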
For comprehensive multi-dimensional sensitivity assessment:
robust_result <- prior_robustness(
model = fit,
prior_specifications = prior_grid,
parameters = c("b_x1", "b_x2"),
perturbation_direction = "expand", # or "contract", "shift"
dimensions = c(0.5, 1, 2, 4), # Scaling factors
comparison_metric = "KL",
plot = TRUE
)
print(robust_result)
# Check robustness
robust_result$robustness_index # Composite score (higher = more robust)
robust_result$concerning_parameters # Parameters with low robustness
robust_result$recommendations # What to do

The model_comparison_suite() compares multiple models
using information criteria:
# Fit competing models
fit1 <- brm(y ~ x1, data = data, refresh = 0)
fit2 <- brm(y ~ x1 + x2, data = data, refresh = 0)
fit3 <- brm(y ~ x1 * x2, data = data, refresh = 0)
# Compare models
comparison <- model_comparison_suite(
Model_1 = fit1,
Model_2 = fit2,
Model_3 = fit3,
criterion = "all", # LOO, WAIC, Bayes R²
plot = TRUE,
detailed = TRUE
)
print(comparison)
plot(comparison)
# Results
comparison$comparison_table # All metrics
comparison$ic_differences # ΔLOO, ΔWAIC, model weights
comparison$plots # Visualizations

Metrics explained:

- LOO-ELPD: Leave-one-out expected log predictive density (higher is better)
- WAIC: Widely Applicable Information Criterion (lower is better)
- Bayes R²: Bayesian coefficient of determination
- Model weights: Probability each model is best
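For cross-checking, the LOO part of this comparison can be reproduced with the loo functions that brms re-exports; this is a reference sketch outside of bayesDiagnostics:

# Cross-check with loo directly (reference only)
loo1 <- loo(fit1); loo2 <- loo(fit2); loo3 <- loo(fit3)
loo_compare(loo1, loo2, loo3)              # ELPD differences with standard errors
loo_model_weights(list(loo1, loo2, loo3))  # stacking weights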
For direct model comparison via Bayes Factors:
# Compare two models
bf_result <- bayes_factor_comparison(
Model_A = fit1,
Model_B = fit2,
method = "bridge_sampling", # or "waic"
repetitions = 5,
silent = TRUE
)
print(bf_result)
# Interpretation
bf_result$bayes_factor # BF_{1,2}
bf_result$log_bf # Log Bayes Factor
bf_result$interpretation # Text interpretation
# For 3+ models: pairwise comparisons
bf_multi <- bayes_factor_comparison(fit1, fit2, fit3, method = "bridge_sampling")
bf_multi$pairwise_comparisons

Bayes Factor Interpretation (Kass & Raftery, 1995):

- BF < 1: Evidence for Model 2
- 1-3: Weak evidence for Model 1
- 3-10: Moderate evidence
- 10-30: Strong evidence
- > 30: Very strong evidence
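For comparison, brms itself exposes bridge-sampling Bayes Factors through bayes_factor(); note that the models must be fit with save_pars = save_pars(all = TRUE) for bridge sampling to work. A reference sketch, not part of bayesDiagnostics:

# Bridge-sampling Bayes Factor with brms directly (reference only; requires save_pars)
fit1_bs <- brm(y ~ x1,      data = data, save_pars = save_pars(all = TRUE), refresh = 0)
fit2_bs <- brm(y ~ x1 + x2, data = data, save_pars = save_pars(all = TRUE), refresh = 0)
bayes_factor(fit2_bs, fit1_bs)   # BF of the additive model over the x1-only model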
The predictive_performance() function evaluates
out-of-sample predictions:
# In-sample performance
perf_in <- predictive_performance(
model = fit,
newdata = NULL, # NULL = use training data
observed_y = data$y,
metrics = "all", # RMSE, MAE, Coverage, CRPS
credible_level = 0.95,
n_draws = NULL
)
print(perf_in)
plot(perf_in)
# Out-of-sample performance
test_data <- data.frame(x1 = rnorm(50), x2 = rnorm(50), y = rnorm(50))
perf_out <- predictive_performance(
model = fit,
newdata = test_data,
observed_y = test_data$y,
metrics = "all"
)
# Compare metrics
perf_in$point_metrics # RMSE, MAE, Correlation
perf_in$interval_metrics # Coverage, Interval Width
perf_in$proper_scores # CRPS (lower is better)
perf_in$prediction_summary # Per-observation predictions

Key metrics:

- RMSE/MAE: Prediction error (lower is better)
- Coverage: % of observations within credible intervals (should ≈ 0.95)
- CRPS: Continuous Ranked Probability Score (proper scoring rule)
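To see what the point and interval metrics measure, here is a hand-rolled version of RMSE and 95% coverage from posterior predictive draws (illustrative only):

# RMSE and 95% interval coverage computed by hand (illustration only)
pred <- posterior_predict(fit, newdata = test_data)    # draws x observations
est  <- colMeans(pred)                                 # point predictions
sqrt(mean((test_data$y - est)^2))                      # RMSE
ci <- apply(pred, 2, quantile, probs = c(0.025, 0.975))
mean(test_data$y >= ci[1, ] & test_data$y <= ci[2, ])  # coverage (should be ~0.95)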
The extract_posterior_unified() provides consistent
posterior extraction across packages:
# Extract as data frame
draws_df <- extract_posterior_unified(
model = fit,
parameters = c("b_x1", "b_x2", "sigma"),
format = "draws_df",
ndraws = 1000,
include_warmup = FALSE,
chains = NULL # All chains
)
# Extract as matrix
draws_matrix <- extract_posterior_unified(fit, format = "draws_matrix")
# Extract as array (iterations × chains × parameters)
draws_array <- extract_posterior_unified(fit, format = "draws_array")
# Extract as named list
draws_list <- extract_posterior_unified(fit, format = "list")

Supported model types:

- brmsfit (brms)
- stanfit (rstan)
- stanreg (rstanarm)
- CmdStanMCMC (cmdstanr)
- mcmc.list (coda)
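For context, the posterior package offers a similar cross-package interface. The sketch below assumes a fitted brms model, and the same calls work for most of the object types listed above:

# Similar unified extraction via the posterior package (reference sketch)
library(posterior)
draws <- as_draws_df(fit)   # also works for stanfit, CmdStanMCMC, mcmc.list objects
subset_draws(draws, variable = c("b_x1", "b_x2", "sigma"))
as_draws_matrix(draws)      # matrix view of the same draws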
The diagnostic_report() function creates comprehensive
HTML/PDF reports:
# Generate comprehensive report
diagnostic_report(
model = fit,
output_file = "bayesian_model_diagnostics.pdf",
output_format = "pdf", # or "html", "docx"
include_sections = c(
"model_summary",
"convergence",
"posterior_summary",
"recommendations"
),
rhat_threshold = 1.01,
ess_threshold = 0.1,
open_report = TRUE # Open automatically
)

Sections included:

1. Model Summary: Formula, family, data info
2. Convergence Diagnostics: R-hat, ESS, divergences
3. Posterior Summary: Parameter estimates with credible intervals
4. Recommendations: Actionable suggestions for model improvement
Here’s a complete Bayesian workflow using all major functions:
# ===== STEP 1: FIT MODEL =====
library(bayesDiagnostics)
library(brms)
# Simulate data
set.seed(123)
N <- 200
x1 <- rnorm(N)
x2 <- rnorm(N)
data <- data.frame(
y = 3 + 2 * x1 - 0.5 * x2 + rnorm(N, sd = 1.5), # y depends on the predictors
x1 = x1,
x2 = x2
)
# Fit Bayesian model
fit <- brm(
y ~ x1 + x2,
data = data,
prior = c(
prior(normal(0, 5), class = "b"),
prior(student_t(3, 0, 2.5), class = "sigma")
),
chains = 4,
iter = 2000,
warmup = 1000,
refresh = 0
)
# ===== STEP 2: CONVERGENCE DIAGNOSTICS =====
# Quick check
diag <- mcmc_diagnostics_summary(fit)
print(diag)
# Detailed ESS analysis
ess_diag <- effective_sample_size_diagnostics(fit, plot = TRUE)
plot(ess_diag)
# ===== STEP 3: POSTERIOR PREDICTIVE CHECKS =====
# Comprehensive PPC
ppc <- posterior_predictive_check(fit, observed_data = data$y, plot = TRUE)
print(ppc)
# Automated screening
auto_ppc <- automated_ppc(fit, data$y)
print(auto_ppc)
# LOO cross-validation
loo_ppc <- ppc_crossvalidation(fit, data$y)
plot(loo_ppc)
# ===== STEP 4: PRIOR SENSITIVITY =====
# Define alternative priors
prior_grid <- list(
weak = set_prior("normal(0, 10)", class = "b"),
moderate = set_prior("normal(0, 5)", class = "b"),
strong = set_prior("normal(0, 1)", class = "b")
)
# Test sensitivity
sens <- prior_sensitivity(
fit,
parameters = c("b_x1", "b_x2"),
prior_grid = prior_grid,
plot = TRUE
)
print(sens)
# ===== STEP 5: MODEL COMPARISON =====
# Fit alternative models
fit_x1 <- brm(y ~ x1, data = data, refresh = 0)
fit_x1x2 <- fit
fit_interaction <- brm(y ~ x1 * x2, data = data, refresh = 0)
# Compare
comp <- model_comparison_suite(
Linear = fit_x1,
Additive = fit_x1x2,
Interaction = fit_interaction,
criterion = "all",
plot = TRUE
)
print(comp)
# ===== STEP 6: PREDICTIVE PERFORMANCE =====
# Hold-out validation
test_idx <- sample(1:N, 40)
test_data <- data[test_idx, ]
train_data <- data[-test_idx, ]
fit_train <- update(fit, newdata = train_data, refresh = 0)
perf <- predictive_performance(
fit_train,
newdata = test_data,
observed_y = test_data$y,
metrics = "all"
)
print(perf)
plot(perf)
# ===== STEP 7: GENERATE REPORT =====
diagnostic_report(
fit,
output_file = "full_diagnostics.html",
output_format = "html",
include_sections = c("model_summary", "convergence",
"posterior_summary", "recommendations")
)

If the diagnostics flag divergent transitions, increase adapt_delta or reparameterize the model before relying on the results.

| Function | Category | Purpose |
|---|---|---|
| mcmc_diagnostics_summary() | Convergence | Quick MCMC diagnostic overview |
| effective_sample_size_diagnostics() | Convergence | Detailed ESS analysis with plots |
| hierarchical_convergence() | Convergence | Hierarchical model convergence |
| posterior_predictive_check() | PPC | Comprehensive PPC with test statistics |
| automated_ppc() | PPC | Automated screening across statistics |
| graphical_ppc() | PPC | Publication-quality PPC plots |
| ppc_crossvalidation() | PPC | LOO-PIT cross-validation checks |
| bayesian_p_values() | PPC | Custom test statistic p-values |
| prior_elicitation_helper() | Prior | Translate expert knowledge to priors |
| prior_sensitivity() | Prior | Assess robustness to prior choice |
| prior_robustness() | Prior | Multi-dimensional sensitivity analysis |
| model_comparison_suite() | Comparison | Compare models via LOO/WAIC/R² |
| bayes_factor_comparison() | Comparison | Bayes Factor model comparison |
| predictive_performance() | Comparison | Evaluate predictive accuracy |
| extract_posterior_unified() | Utility | Unified posterior extraction |
| diagnostic_report() | Utility | Generate comprehensive reports |