MCMCvis is an R package used to visualize, manipulate, and summarize MCMC output. MCMC output may be derived from Bayesian models fit with JAGS, Stan, or other MCMC samplers.
The package contains four functions:
- MCMCsummary - summarize MCMC output for particular parameters of interest
- MCMCtrace - create trace and density plots of MCMC chains for particular parameters of interest
- MCMCchains - easily extract posterior chains from MCMC output for particular parameters of interest
- MCMCplot - create caterpillar plots from MCMC output for particular parameters of interest
MCMCvis was designed to perform key functions for MCMC analysis using minimal code, in order to free up time/brainpower for interpretation of analysis results. Functions support simple and straightforward subsetting of model parameters within the calls, and produce presentable and ‘publication-ready’ output.
MCMCsummary is used to output summary information from MCMC output. This function accepts stanfit objects (created with the rstan package), mcmc.list objects (created with the rjags or coda packages), R2jags output (created with the R2jags package), and matrices of MCMC output (one chain per column). The function automatically detects the object type and proceeds accordingly. Two decimal places are reported by default. This can be changed using the digits argument.
library(MCMCvis)
data(MCMC_data)
MCMCsummary(MCMC_data)
##             mean    sd   2.5%    50%  97.5% Rhat
## alpha[1]  -10.45  5.32 -20.82 -10.48   0.01    1
## alpha[2]  -11.92  0.50 -12.91 -11.93 -10.94    1
## alpha[3]  -10.36  5.06 -20.28 -10.40  -0.34    1
## alpha[4]  -11.91  4.49 -20.77 -11.90  -3.16    1
## alpha[5]  -12.78  7.43 -27.23 -12.81   1.85    1
## alpha[6]  -11.65  6.42 -24.27 -11.61   0.81    1
## alpha[7]  -11.31  7.03 -25.08 -11.29   2.43    1
## alpha[8]   -9.18  4.95 -18.88  -9.16   0.64    1
## alpha[9]  -11.29  1.72 -14.66 -11.28  -7.89    1
## alpha[10]  -9.43  9.55 -28.15  -9.44   9.41    1
## beta[1]   -13.83  5.53 -24.67 -13.78  -2.96    1
## beta[2]    -5.60  0.14  -5.88  -5.60  -5.32    1
## beta[3]   -16.82  1.71 -20.15 -16.82 -13.45    1
## beta[4]   -19.55  2.61 -24.59 -19.55 -14.41    1
## beta[5]     8.68  5.26  -1.66   8.69  19.04    1
## beta[6]     2.86  7.68 -12.04   2.83  17.93    1
## beta[7]     2.06  7.78 -13.11   2.01  17.27    1
## beta[8]   -15.95  3.62 -23.08 -15.93  -8.93    1
## beta[9]     8.44  4.64  -0.69   8.44  17.58    1
## beta[10]   16.69  4.36   8.24  16.66  25.40    1
## gamma[1]   -5.18  3.22 -11.51  -5.15   0.99    1
## gamma[2]    7.48  6.13  -4.53   7.47  19.36    1
## gamma[3]   -4.16  4.70 -13.45  -4.11   4.98    1
## gamma[4]   15.80 12.20  -8.37  15.81  39.44    1
## gamma[5]   -1.98  6.90 -15.41  -1.99  11.60    1
## gamma[6]   19.94  2.02  16.00  19.93  23.92    1
## gamma[7]   -5.81  0.23  -6.26  -5.81  -5.36    1
## gamma[8]   -3.53  6.74 -16.75  -3.54   9.80    1
## gamma[9]   -0.93  5.67 -12.06  -0.92  10.26    1
## gamma[10]  13.10  0.09  12.91  13.10  13.28    1
Specific parameters can be specified to subset summary information. Square brackets are ignored by default. For instance, all alpha parameters can be summarized using params = 'alpha'.
MCMCsummary(MCMC_data,
            params = 'alpha')
##             mean    sd   2.5%    50%  97.5% Rhat
## alpha[1]  -10.45  5.32 -20.82 -10.48   0.01    1
## alpha[2]  -11.92  0.50 -12.91 -11.93 -10.94    1
## alpha[3]  -10.36  5.06 -20.28 -10.40  -0.34    1
## alpha[4]  -11.91  4.49 -20.77 -11.90  -3.16    1
## alpha[5]  -12.78  7.43 -27.23 -12.81   1.85    1
## alpha[6]  -11.65  6.42 -24.27 -11.61   0.81    1
## alpha[7]  -11.31  7.03 -25.08 -11.29   2.43    1
## alpha[8]   -9.18  4.95 -18.88  -9.16   0.64    1
## alpha[9]  -11.29  1.72 -14.66 -11.28  -7.89    1
## alpha[10]  -9.43  9.55 -28.15  -9.44   9.41    1
Individual parameters can also be specified. For example, one alpha (of many) may be specified. In this case, the square brackets should not be ignored, so that only the alpha[1] parameter is specified. Use the argument ISB = FALSE to specify particular parameters that contain brackets. ISB is short for ‘Ignore Square Brackets’.
MCMCsummary(MCMC_data,
            params = 'alpha[1]',
            ISB = FALSE)
##             mean    sd   2.5%    50%  97.5% Rhat
## alpha[1]  -10.45  5.32 -20.82 -10.48   0.01    1
The excl argument can be used to exclude any parameters. This can be used in conjunction with the params argument. This is particularly useful when specifying ISB = FALSE. For instance, if all alpha parameters are desired except for alpha[1], params = 'alpha', excl = 'alpha[1]', ISB = FALSE can be used. When ISB = TRUE, an exact match of the specified parameter is required (excluding the square brackets). When ISB = FALSE, partial names will be matched. Leaving the default (ISB = TRUE) is generally recommended for simplicity. These arguments can be used in any of the functions in the package.
MCMCsummary(MCMC_data,
            params = 'alpha',
            excl = 'alpha[1]',
            ISB = FALSE)
##             mean    sd   2.5%    50%  97.5% Rhat
## alpha[2]  -11.92  0.50 -12.91 -11.93 -10.94    1
## alpha[3]  -10.36  5.06 -20.28 -10.40  -0.34    1
## alpha[4]  -11.91  4.49 -20.77 -11.90  -3.16    1
## alpha[5]  -12.78  7.43 -27.23 -12.81   1.85    1
## alpha[6]  -11.65  6.42 -24.27 -11.61   0.81    1
## alpha[7]  -11.31  7.03 -25.08 -11.29   2.43    1
## alpha[8]   -9.18  4.95 -18.88  -9.16   0.64    1
## alpha[9]  -11.29  1.72 -14.66 -11.28  -7.89    1
## alpha[10]  -9.43  9.55 -28.15  -9.44   9.41    1
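The matching rules described above can be illustrated with a small base-R sketch. The helper select_params is hypothetical and is not part of MCMCvis; it only mimics the described behaviour (exact match on the bracket-free name when ISB = TRUE, literal partial match when ISB = FALSE) and may differ from how the package does the matching internally.

```r
# Hypothetical helper (NOT an MCMCvis function): pick parameter names
# following the params / excl / ISB rules described in the text.
select_params <- function(all_names, params, excl = NULL, ISB = TRUE) {
  # TRUE for each name matched by at least one pattern
  match_any <- function(patterns) {
    hits <- rep(FALSE, length(all_names))
    for (p in patterns) {
      if (ISB) {
        # ignore the bracketed index, then require an exact match
        hits <- hits | (sub("\\[.*\\]$", "", all_names) == p)
      } else {
        # keep the brackets and match the pattern literally as a substring
        hits <- hits | grepl(p, all_names, fixed = TRUE)
      }
    }
    hits
  }
  all_names[match_any(params) & !match_any(excl)]
}

# Made-up parameter names in the same style as the example data
nms <- c(paste0("alpha[", 1:10, "]"), paste0("beta[", 1:10, "]"))

select_params(nms, params = "alpha")                    # all ten alphas
select_params(nms, params = "alpha[1]", ISB = FALSE)    # only alpha[1]
select_params(nms, params = "alpha", excl = "alpha[1]", ISB = FALSE)
```

Note that with a literal match, 'alpha[1]' does not pick up 'alpha[10]', because the closing bracket is part of the pattern.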
Setting the Rhat and n.eff arguments to FALSE avoids calculating the Rhat statistic and number of effective samples, respectively (the defaults for Rhat and n.eff are TRUE and FALSE, respectively). Specifying FALSE can greatly increase function speed with very large mcmc.list objects. Values for Rhat near 1 suggest convergence (Brooks and Gelman 1998). Kruschke (2014) recommends n.eff > 10,000 for reasonably stable posterior estimates.
MCMCsummary(MCMC_data,
            params = 'alpha',
            Rhat = TRUE,
            n.eff = TRUE)
##             mean    sd   2.5%    50%  97.5% Rhat n.eff
## alpha[1]  -10.45  5.32 -20.82 -10.48   0.01    1 17745
## alpha[2]  -11.92  0.50 -12.91 -11.93 -10.94    1 18000
## alpha[3]  -10.36  5.06 -20.28 -10.40  -0.34    1 18358
## alpha[4]  -11.91  4.49 -20.77 -11.90  -3.16    1 18638
## alpha[5]  -12.78  7.43 -27.23 -12.81   1.85    1 18141
## alpha[6]  -11.65  6.42 -24.27 -11.61   0.81    1 18000
## alpha[7]  -11.31  7.03 -25.08 -11.29   2.43    1 17778
## alpha[8]   -9.18  4.95 -18.88  -9.16   0.64    1 18000
## alpha[9]  -11.29  1.72 -14.66 -11.28  -7.89    1 17996
## alpha[10]  -9.43  9.55 -28.15  -9.44   9.41    1 17975
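To give some intuition for why Rhat values near 1 suggest convergence, the sketch below computes a basic potential scale reduction factor in base R on simulated chains. The function basic_rhat is a hypothetical, simplified version of the statistic (it compares between-chain and within-chain variance without further corrections) and is not necessarily the calculation MCMCsummary performs; the data are simulated, not MCMC_data.

```r
set.seed(1)
n_iter <- 1000

# Three well-mixed chains from the same distribution (one column per chain)
mixed <- matrix(rnorm(n_iter * 3, mean = -10, sd = 5), ncol = 3)

# Three chains stuck around different values
stuck <- sapply(c(-12, -10, -8), function(m) rnorm(n_iter, mean = m, sd = 1))

# Simplified potential scale reduction factor (illustrative only)
basic_rhat <- function(chains) {
  n <- nrow(chains)
  W <- mean(apply(chains, 2, var))       # mean within-chain variance
  B <- n * var(colMeans(chains))         # between-chain variance
  var_plus <- (n - 1) / n * W + B / n    # pooled estimate of posterior variance
  sqrt(var_plus / W)
}

basic_rhat(mixed)   # close to 1: chains agree with each other
basic_rhat(stuck)   # well above 1: chains disagree
```

When the chains sample the same distribution, the between-chain term adds almost nothing and the ratio stays near 1; when chains sit in different regions, the pooled variance is much larger than the within-chain variance.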
The func argument can be used to return metrics of interest not already returned by default for MCMCsummary. Input is a function to be performed on the posterior of each specified parameter. Values returned by the function will be displayed as a column in the summary output (or multiple columns if the function returns more than one value). In this way, functions from other packages can be used to derive metrics of interest from posterior output.
MCMCsummary(MCMC_data,
            params = 'alpha',
            Rhat = TRUE,
            n.eff = TRUE,
            func = function(x) quantile(x, probs = c(0.01, 0.99)),
            func_name = c('1%', '99%'))
##             mean    sd   2.5%    50%  97.5% Rhat n.eff     1%    99%
## alpha[1]  -10.45  5.32 -20.82 -10.48   0.01    1 17745 -22.89   1.73
## alpha[2]  -11.92  0.50 -12.91 -11.93 -10.94    1 18000 -13.09 -10.75
## alpha[3]  -10.36  5.06 -20.28 -10.40  -0.34    1 18358 -21.99   1.49
## alpha[4]  -11.91  4.49 -20.77 -11.90  -3.16    1 18638 -22.56  -1.48
## alpha[5]  -12.78  7.43 -27.23 -12.81   1.85    1 18141 -29.98   4.91
## alpha[6]  -11.65  6.42 -24.27 -11.61   0.81    1 18000 -26.74   2.98
## alpha[7]  -11.31  7.03 -25.08 -11.29   2.43    1 17778 -27.61   5.14
## alpha[8]   -9.18  4.95 -18.88  -9.16   0.64    1 18000 -20.67   2.48
## alpha[9]  -11.29  1.72 -14.66 -11.28  -7.89    1 17996 -15.25  -7.28
## alpha[10]  -9.43  9.55 -28.15  -9.44   9.41    1 17975 -31.83  12.89
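The idea of a per-parameter function that yields one or more summary columns can be tried out in base R before handing it to MCMCsummary. In this sketch, draws is a simulated stand-in for posterior samples (one column per parameter) and tail_probs is a hypothetical function name; neither comes from the package.

```r
set.seed(42)

# Simulated stand-in for posterior draws: one column per parameter
draws <- cbind(`alpha[1]` = rnorm(2000, mean = -10, sd = 5),
               `alpha[2]` = rnorm(2000, mean = -12, sd = 0.5))

# Hypothetical summary function returning two values per parameter:
# the posterior probability of being below and above zero
tail_probs <- function(x) c(mean(x < 0), mean(x > 0))

# Apply to each column, then transpose so that rows are parameters
# and each returned value becomes its own column
out <- t(apply(draws, 2, tail_probs))
colnames(out) <- c("P(<0)", "P(>0)")
round(out, 2)
```

Following the pattern of the call above, a function like this could presumably be supplied as func = tail_probs together with func_name = c('P(<0)', 'P(>0)').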
MCMCtrace is used to create trace and density plots for MCMC output. This is useful for diagnostic purposes. Particular parameters can also be specified, as with MCMCsummary. Output is written to PDF by default to enable more efficient review of posteriors; this also reduces computation time. PDF output is particularly recommended for large numbers of parameters. pdf = FALSE can be used to prevent output to PDF.
MCMCtrace(MCMC_data,
          params = c('beta[1]', 'beta[2]', 'beta[3]'),
          ISB = FALSE,
          pdf = FALSE)
Trace plots alone can be plotted with type = 'trace', and density plots alone with type = 'density'. The default is type = 'both', which outputs both trace and density plots. Individual chains for the density plot can be output using the ind argument.
MCMCtrace(MCMC_data,
          params = 'beta',
          type = 'density',
          ind = TRUE,
          pdf = FALSE)
The PDF document is written to the current working directory by default, but another directory can be specified.
MCMCtrace(MCMC_data,
          pdf = TRUE,
          filename = 'MYpdf',
          wd = 'DIRECTORY HERE')
iter denotes how many iterations should be plotted in the trace and density plots. The default is 5000, meaning that the last 5000 iterations of each chain are plotted. Remember, this is the final posterior chain, not including the burn-in (specified when the model was run). If fewer than 5000 iterations were run, the full number of iterations will be plotted.
MCMCtrace(MCMC_data,
          params = c('beta[1]', 'beta[2]', 'beta[3]'),
          ISB = FALSE,
          iter = 1800,
          ind = TRUE,
          pdf = FALSE)
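The effect of iter on a single chain can be pictured as keeping only its final values. The base-R sketch below uses a simulated chain and a hypothetical helper, last_iters; it illustrates the rule described above and is not the package's own code.

```r
set.seed(7)

# Simulated stand-in for one chain of 3000 post-burn-in iterations
chain <- rnorm(3000)

# Hypothetical helper: keep the final `iter` values of a chain,
# or the whole chain if it is shorter than `iter`
last_iters <- function(x, iter = 5000) {
  tail(x, n = min(iter, length(x)))
}

length(last_iters(chain))               # 3000: chain is shorter than the default
length(last_iters(chain, iter = 1800))  # 1800: only the last 1800 iterations
```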
MCMCchains is used to extract MCMC chains from MCMC objects. Chains can then be manipulated directly. Particular parameters can be specified as with other functions.
ex <- MCMCchains(MCMC_data,
                 params = 'beta')
# extract mean values for each parameter
apply(ex, 2, mean)
##    beta[1]    beta[2]    beta[3]    beta[4]    beta[5]    beta[6]
## -13.826007  -5.601035 -16.820354 -19.545768   8.678547   2.856033
##    beta[7]    beta[8]    beta[9]   beta[10]
##   2.055310 -15.953090   8.439410  16.685924
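Once chains are available as a matrix, derived quantities can be computed draw by draw. The sketch below assumes a layout of one row per iteration and one named column per parameter, as the apply() call above suggests. The matrix ex_sim is simulated for illustration and is not the output of MCMCchains.

```r
set.seed(123)

# Simulated stand-in for an extracted chain matrix:
# rows are iterations, columns are parameters
ex_sim <- cbind(`beta[1]` = rnorm(4000, mean = -13.8, sd = 5.5),
                `beta[2]` = rnorm(4000, mean = -5.6, sd = 0.14))

# Posterior of a derived quantity: the difference between two parameters,
# computed for every iteration
diff_12 <- ex_sim[, "beta[1]"] - ex_sim[, "beta[2]"]

# Median and 95% credible interval of the difference
quantile(diff_12, probs = c(0.025, 0.5, 0.975))

# Posterior probability that beta[1] is smaller than beta[2]
mean(diff_12 < 0)
```

Working on the draws directly, rather than on summaries, carries the full posterior uncertainty through to the derived quantity.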
MCMCplot is used to create caterpillar plots from MCMC output. Points represent posterior medians. Parameters whose 50% credible intervals overlap 0 are indicated by ‘open’ circles. Parameters whose 50% credible intervals do not overlap 0 but whose 95% credible intervals do overlap 0 are indicated by ‘closed’ grey circles. Parameters whose 95% credible intervals do not overlap 0 are indicated by ‘closed’ black circles. Thick lines represent 50% credible intervals, while thin lines represent 95% credible intervals.
As with the other functions in the package, particular parameters of interest can be specified.
MCMCplot(MCMC_data,
         params = 'beta')
ref_ovl = FALSE can be used to disable the overlap-based styling of the median dots; all median dots will then be represented as ‘closed’ black circles. A vertical reference line at 0 is plotted by default. The position of this reference line can be modified with the ref argument. ref = NULL removes the reference line altogether.
MCMCplot(MCMC_data,
         params = 'beta',
         ref_ovl = FALSE,
         ref = NULL)
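The default dot styling (open, closed grey, closed black) follows from whether the 50% and 95% credible intervals contain 0. The base-R sketch below reproduces that decision rule on simulated draws, assuming equal-tailed quantile intervals. The names overlaps_zero and classify are hypothetical, and the package may compute its intervals differently.

```r
set.seed(99)

# Simulated posterior draws for three made-up parameters
draws <- cbind(a = rnorm(10000, mean = 0, sd = 1),   # centred on 0
               b = rnorm(10000, mean = 1, sd = 1),   # moderately away from 0
               c = rnorm(10000, mean = 5, sd = 1))   # far from 0

# TRUE if the equal-tailed interval with the given coverage contains 0
overlaps_zero <- function(x, coverage) {
  ci <- quantile(x, probs = c((1 - coverage) / 2, 1 - (1 - coverage) / 2))
  ci[1] <= 0 && ci[2] >= 0
}

# Map each parameter to the dot style described in the text
classify <- function(x) {
  if (overlaps_zero(x, 0.50)) {
    "open"
  } else if (overlaps_zero(x, 0.95)) {
    "closed grey"
  } else {
    "closed black"
  }
}

apply(draws, 2, classify)
```

Here a falls in the ‘open’ category, b in ‘closed grey’, and c in ‘closed black’.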
Parameters can be ranked by posterior median estimates using the rank argument. xlab can be used to create an alternative label for the x-axis.
MCMCplot(MCMC_data,
         params = 'beta',
         rank = TRUE,
         xlab = 'ESTIMATE')
The orientation of the plot can also be changed using the horiz argument. ylab is then used to specify an alternative label on the ‘estimate axis’.
MCMCplot(MCMC_data,
         params = 'beta',
         rank = TRUE,
         horiz = FALSE,
         ylab = 'ESTIMATE')
Graphical parameters for x- and y-axis limits, row labels, title, median dot size, CI line thickness, axis and tick thickness, text size, and margins can be specified.
MCMCplot(MCMC_data,
         params = 'beta',
         xlim = c(-60, 25),
         xlab = 'My x-axis label',
         main = 'MCMCvis plot',
         labels = c('First param', 'Second param', 'Third param',
                    'Fourth param', 'Fifth param', 'Sixth param',
                    'Seventh param', 'Eighth param', 'Ninth param',
                    'Tenth param'),
         labels_sz = 1.5,
         med_sz = 2,
         thick_sz = 7,
         thin_sz = 3,
         ax_sz = 4,
         main_text_sz = 2)
Brooks, S. P., and A. Gelman. 1998. General methods for monitoring convergence of iterative simulations. Journal of Computational and Graphical Statistics 7:434.
Kruschke, J. 2014. Doing Bayesian data analysis: A tutorial with R, JAGS, and Stan. Academic Press.
For more information see ?MCMCplot