This report documents the results of a simulation-based calibration (SBC) run for RBesT. The calibration data is regenerated whenever relevant changes are made to the gMAP function. The calibration runs cover typical use cases of gMAP: the three likelihoods (binomial, Gaussian and Poisson), a sparse (\(2\) trials) and a dense (\(10\) trials) data situation, and a very conservative and a less conservative prior choice for the between-trial heterogeneity parameter.
The calibration data presented here was generated at the time and with the RBesT git version listed below:
## Created: 2019-04-03 11:42:46 UTC
## git hash: 0e3388f5b14e585ced738cdb192b2488916673e7
## MD5: 55e17f06123c4320519508161ce81cf7
The MD5 hash of the calibration data file presented here must match the above listed MD5:
## calibration.rds
## "55e17f06123c4320519508161ce81cf7"
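The checksum comparison can also be scripted. The following is a minimal sketch using `tools::md5sum` from base R; the helper name `verify_md5` is hypothetical and not part of RBesT.

```r
# Hypothetical helper (not part of RBesT): compare the MD5 checksum of a
# file with an expected value, ignoring letter case.
verify_md5 <- function(path, expected) {
  stopifnot(file.exists(path))
  # tools::md5sum returns a named character vector; drop the file name
  actual <- unname(tools::md5sum(path))
  identical(tolower(actual), tolower(expected))
}

# Usage with the file name and hash listed above:
# verify_md5("calibration.rds", "55e17f06123c4320519508161ce81cf7")
```

The function returns `TRUE` only if the two hashes agree, so it can be wrapped in `stopifnot()` at the top of an evaluation script.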
Passing simulation-based calibration (SBC) is a necessary condition that any Bayesian analysis with proper priors must meet. The details are presented in Talts et al. (see https://arxiv.org/abs/1804.06788).
Self-consistency of any Bayesian analysis with a proper prior:
\[ p(\theta) = \iint \mbox{d}\tilde{y} \, \mbox{d}\tilde{\theta} \, p(\theta|\tilde{y}) \, p(\tilde{y}|\tilde{\theta}) \, p(\tilde{\theta}) \] \[ \Leftrightarrow p(\theta) = \iint \mbox{d}\tilde{y} \, \mbox{d}\tilde{\theta} \, p(\theta,\tilde{y},\tilde{\theta}) \]
SBC procedure:
Repeat \(s=1, ..., S\) times:
Sample from the prior \[\tilde{\theta} \sim p(\theta)\]
Sample fake data \[\tilde{y} \sim p(y|\tilde{\theta})\]
Obtain \(L\) posterior samples \[\{\theta_1, ..., \theta_L\} \sim p(\theta|\tilde{y})\]
Calculate the rank \(r_s\) of the prior draw \(\tilde{\theta}\) with respect to the posterior sample \(\{\theta_1, ..., \theta_L\} \sim p(\theta|\tilde{y})\). The rank falls into the range \([0,L]\), giving \(L+1\) possible ranks, and is calculated as \[r_s = \sum_{l=1}^L \mathbb{I}[ \theta_l < \tilde{\theta}]\]
If the analysis is calibrated, the \(S\) ranks are uniformly distributed over the \(L+1\) possible values, so the count in each rank bin follows a binomial distribution with \(S\) trials and probability \[p(r \in \mbox{Any Bin}) =\frac{1}{L+1}.\]
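The procedure above can be sketched in base R for a toy model. This is an illustration only and not the gMAP model: it assumes a conjugate normal model, \(\theta \sim \mbox{Normal}(0, 1)\) and \(y_i|\theta \sim \mbox{Normal}(\theta, 1)\) for \(i=1,...,n\), so that exact posterior draws are available without MCMC. The function name `sbc_rank` and the values of `S`, `L` and `n` are assumptions chosen for the example.

```r
set.seed(1)
S <- 1000  # number of SBC replications (illustrative)
L <- 63    # posterior draws per replication (illustrative)
n <- 5     # observations per fake data set (illustrative)

# One SBC replication for the toy conjugate normal model.
sbc_rank <- function(L, n) {
  theta_tilde <- rnorm(1, 0, 1)          # sample from the prior
  y_tilde <- rnorm(n, theta_tilde, 1)    # sample fake data
  # exact posterior for a Normal(0, 1) prior and unit-variance likelihood
  post_mean <- sum(y_tilde) / (n + 1)
  post_sd <- sqrt(1 / (n + 1))
  theta_post <- rnorm(L, post_mean, post_sd)
  sum(theta_post < theta_tilde)          # rank in 0, ..., L
}

ranks <- replicate(S, sbc_rank(L, n))
```

Because the posterior draws are exact here, a histogram of `ranks` should look flat up to sampling noise; a systematic slope or a U shape would point to a miscalibrated sampler.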
Likelihood:
Hierarchical prior:
\[ g(\theta_j)|\mu,\tau \sim \mbox{Normal}(\mu, \tau^2)\]
\[\mu \sim \mbox{Normal}(m_\mu, s^2_\mu)\] \[\tau \sim \mbox{Normal}^+(0, s^2_\tau)\]
The fake data simulation function returns the sum of the responses for binomial and Poisson data, while for normal data the mean is used as the summary. Please refer to the R programs sbc_tools.R and make_reference_rankhist.R for the implementation details.
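As a rough illustration of the prior draw and fake data step for the binomial case, the sketch below samples \(\mu\) and \(\tau\) from the hierarchical prior, draws the trial-specific parameters and returns the sum of responses per trial. It is not the code in sbc_tools.R. The function name `simulate_fake_binomial`, the logit link for \(g\), the default values of `m_mu` and `s_mu`, and the trial sizes in the usage line are all assumptions made for this example.

```r
# Hypothetical sketch: draw from the hierarchical prior and simulate
# binomial fake data, summarised as the number of responders per trial.
simulate_fake_binomial <- function(n_per_trial, m_mu = 0, s_mu = 2, s_tau = 1) {
  J <- length(n_per_trial)
  mu <- rnorm(1, m_mu, s_mu)        # mu ~ Normal(m_mu, s_mu^2)
  tau <- abs(rnorm(1, 0, s_tau))    # tau ~ half-normal with scale s_tau
  eta <- rnorm(J, mu, tau)          # g(theta_j) | mu, tau ~ Normal(mu, tau^2)
  theta <- plogis(eta)              # assumed: g is the logit link
  r <- rbinom(J, size = n_per_trial, prob = theta)  # sum of responses
  list(mu = mu, tau = tau, theta = theta, r = r, n = n_per_trial)
}

# Sparse situation with 2 trials; the trial sizes are made up.
set.seed(123)
fake <- simulate_fake_binomial(c(40, 50), s_tau = 0.5)
```

The returned `mu` and `tau` are the prior draws whose ranks are later computed against the posterior sample obtained from fitting `r` and `n`.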
The reference runs are created with \(L=1023\) posterior draws for each replication and a total of \(S=10^4\) replications are run per case. For the evaluation here the results are reduced to \(B=L'+1=64\) bins to ensure a sufficiently large sample size per bin.
problem | likelihood | sd_tau | statistic | df | p.value |
---|---|---|---|---|---|
dense | binomial | 0.5 | 53.248 | 63 | 0.805 |
dense | binomial | 1 | 51.046 | 63 | 0.860 |
dense | gaussian | 0.5 | 77.350 | 63 | 0.105 |
dense | gaussian | 1 | 49.395 | 63 | 0.895 |
dense | poisson | 0.5 | 45.197 | 63 | 0.956 |
dense | poisson | 1 | 58.342 | 63 | 0.643 |
sparse | binomial | 0.5 | 45.312 | 63 | 0.955 |
sparse | binomial | 1 | 54.848 | 63 | 0.758 |
sparse | gaussian | 0.5 | 60.352 | 63 | 0.571 |
sparse | gaussian | 1 | 55.770 | 63 | 0.729 |
sparse | poisson | 0.5 | 61.914 | 63 | 0.515 |
sparse | poisson | 1 | 70.195 | 63 | 0.249 |
problem | likelihood | sd_tau | statistic | df | p.value |
---|---|---|---|---|---|
dense | binomial | 0.5 | 64.038 | 63 | 0.440 |
dense | binomial | 1 | 65.715 | 63 | 0.383 |
dense | gaussian | 0.5 | 67.661 | 63 | 0.321 |
dense | gaussian | 1 | 57.830 | 63 | 0.661 |
dense | poisson | 0.5 | 72.358 | 63 | 0.196 |
dense | poisson | 1 | 47.373 | 63 | 0.929 |
sparse | binomial | 0.5 | 72.205 | 63 | 0.200 |
sparse | binomial | 1 | 68.979 | 63 | 0.282 |
sparse | gaussian | 0.5 | 56.026 | 63 | 0.721 |
sparse | gaussian | 1 | 56.435 | 63 | 0.708 |
sparse | poisson | 0.5 | 58.573 | 63 | 0.635 |
sparse | poisson | 1 | 64.858 | 63 | 0.412 |
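The statistic, df and p.value columns above are consistent with a Pearson \(\chi^2\) test for uniformity of the binned ranks, where \(B=64\) bins give \(63\) degrees of freedom. Assuming that is how the ranks are evaluated, the computation can be sketched in base R as follows. The function name `sbc_chisq` is hypothetical, and the ranks used here are drawn uniformly for illustration rather than taken from the calibration data.

```r
# Hypothetical sketch: bin SBC ranks (values 0, ..., L) into B equally
# wide bins and test the bin counts for uniformity.
sbc_chisq <- function(ranks, L, B) {
  stopifnot((L + 1) %% B == 0)        # bins must contain equally many ranks
  width <- (L + 1) / B                # ranks per bin, 16 for L = 1023, B = 64
  bins <- factor(ranks %/% width, levels = 0:(B - 1))
  counts <- as.vector(table(bins))
  expected <- length(ranks) / B       # 156.25 for S = 10^4 and B = 64
  statistic <- sum((counts - expected)^2 / expected)
  df <- B - 1
  data.frame(statistic = statistic, df = df,
             p.value = pchisq(statistic, df, lower.tail = FALSE))
}

# Illustration with perfectly calibrated (uniform) ranks
set.seed(42)
ranks <- sample(0:1023, size = 1e4, replace = TRUE)
res <- sbc_chisq(ranks, L = 1023, B = 64)
```

A small p.value would indicate that the ranks deviate from uniformity more than sampling variation explains. With 12 cases per table, an occasional p.value near 0.1 is expected even for a well calibrated sampler.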
## R version 3.5.1 (2018-07-02)
## Platform: x86_64-pc-linux-gnu (64-bit)
## Running under: Ubuntu 16.04.5 LTS
##
## Matrix products: default
## BLAS: /usr/lib/libblas/libblas.so.3.6.0
## LAPACK: /usr/lib/lapack/liblapack.so.3.6.0
##
## locale:
## [1] C
##
## attached base packages:
## [1] tools stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] bindrcpp_0.2.2 rstan_2.18.2 StanHeaders_2.18.0-1
## [4] RBesT_1.3-8 testthat_2.0.1 Rcpp_1.0.0
## [7] usethis_1.4.0 devtools_2.0.1 ggplot2_3.1.0
## [10] broom_0.5.1 tidyr_0.8.2 dplyr_0.7.8
## [13] assertthat_0.2.0 knitr_1.21 rmarkdown_1.11
##
## loaded via a namespace (and not attached):
## [1] mvtnorm_1.0-8 lattice_0.20-38 prettyunits_1.0.2
## [4] ps_1.2.1 rprojroot_1.3-2 digest_0.6.18
## [7] R6_2.3.0 plyr_1.8.4 ggridges_0.5.1
## [10] backports_1.1.3 stats4_3.5.1 evaluate_0.12
## [13] highr_0.7 pillar_1.3.0 rlang_0.3.0.1
## [16] lazyeval_0.2.1 rstudioapi_0.8 callr_3.1.0
## [19] checkmate_1.8.5 labeling_0.3 desc_1.2.0
## [22] stringr_1.3.1 loo_2.0.0 munsell_0.5.0
## [25] compiler_3.5.1 xfun_0.4 pkgconfig_2.0.2
## [28] pkgbuild_1.0.2 htmltools_0.3.6 tidyselect_0.2.5
## [31] tibble_1.4.2 gridExtra_2.3 codetools_0.2-15
## [34] matrixStats_0.54.0 crayon_1.3.4 withr_2.1.2
## [37] grid_3.5.1 nlme_3.1-137 gtable_0.2.0
## [40] magrittr_1.5 scales_1.0.0 cli_1.0.1
## [43] stringi_1.2.4 reshape2_1.4.3 fs_1.2.6
## [46] remotes_2.0.2 generics_0.0.2 Formula_1.2-3
## [49] glue_1.3.0 purrr_0.2.5 processx_3.2.1
## [52] pkgload_1.0.2 parallel_3.5.1 yaml_2.2.0
## [55] inline_0.3.15 colorspace_1.3-2 sessioninfo_1.1.1
## [58] bayesplot_1.6.0 memoise_1.1.0 bindr_0.1.1