The StanMoMo package includes two methods for model selection and model averaging based on leave-future-out validation, called stacking and pseudo-BMA. First, we briefly discuss why standard Bayesian model averaging and leave-one-out cross-validation should be avoided for mortality forecasting. Second, we explain with some mathematical detail how stacking and pseudo-BMA can be used for mortality forecasting based on leave-future-out validation.
Instead of choosing one model, model averaging stems from the idea that a combination of candidate models from a model list \(\mathcal{M}=(M_1,\dots,M_K)\) may perform better than any single model. The standard Bayesian approach, called Bayesian model averaging (BMA), consists of weighting each model by its posterior model evidence. This approach should be avoided for mortality forecasting for several reasons. Among them, BMA is very sensitive to prior choices and tends to select only one model asymptotically (see e.g. Fernandez et al. (2001)). Moreover, like the Bayesian Information Criterion (BIC), BMA measures how well the model fits the past, not how well it predicts the future.
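For intuition only, here is a minimal R sketch (not part of StanMoMo) with hypothetical log model evidences; it illustrates how exponentiating the evidences concentrates almost all weight on a single model:

# Hypothetical log marginal evidences log p(y | M_k) for three candidate models
log_evidence <- c(M1 = -12500, M2 = -12430, M3 = -12510)

# BMA weights are proportional to the model evidence; subtract the maximum
# before exponentiating for numerical stability
w_bma <- exp(log_evidence - max(log_evidence))
w_bma <- w_bma / sum(w_bma)
round(w_bma, 4)
# M2 receives essentially all the weight: moderate differences in log evidence
# translate into an almost degenerate weight vector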
As an alternative approach, several authors have considered model selection and averaging based on predictive performance on hold-out data. For instance, Yao et al. (2018) proposed Bayesian model averaging approaches based on leave-one-out cross-validation (LOO-CV). While this method is valid as a measure of fit, LOO-CV is problematic for forecasting.
Indeed, leaving out only one observation at a time allows information from the future to influence predictions of the past (i.e., data from times \(t + 1, t + 2, \dots\) would inform predictions for time \(t\)). Instead, it is more appropriate to use leave-future-out validation. In our context of mortality forecasting, instead of leaving one point out, we leave the last \(M\) years of data out and evaluate the prediction performance on these \(M\) years. This approach, proposed in Barigou et al. (2023), is further discussed in the next section.
In order to select the weights by validation, the data is split into a training set and a validation set: the first \(N\) years of data, denoted \(y_{1:N}\), form the training set, and the last \(M\) years form the validation set on which the forecasts are scored.
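As a conceptual illustration (the object names below are ours, not part of the package), such a split for annual data could look as follows:

# Split 48 years of data (1970-2017) with M = 10 validation years
years <- 1970:2017
M <- 10

train_years <- head(years, length(years) - M)  # 1970-2007: used to fit the model
valid_years <- tail(years, M)                  # 2008-2017: used to score the forecasts

In StanMoMo, this split is performed internally through the validation argument of the fitting functions, as illustrated later in this section.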
After fitting the mortality model to \(y_{1:N}\), we can obtain an empirical distribution of the future death rates \(\mu_{x,t}\) for \(t=t_{N+1},\dots,t_{N+M}\) based on the MCMC samples obtained via Stan. The stacking approach searches for the averaged model which maximizes the likelihood of the observed number of deaths on the validation set. In simple words, if the validation length is \(M=10\), stacking finds the best forecasting averaged model for the last 10 years of data. The stacking maximization problem can be expressed as \[ \max _{w \in \mathcal{S}_{1}^{K}} \sum_{x=x_1}^{x_n} \sum_{j=t_{N+1}}^{t_{N+M}} \log \sum_{k=1}^{K} w_{k} p\left(d_{x,j} \mid y_{1: N},M_k\right), \] where \(\mathcal{S}_{1}^{K}=\left\{w \in[0,1]^{K}: \sum_{k=1}^{K} w_{k}=1\right\}\) is the unit simplex and \(p\left(d_{x,j} \mid y_{1: N},M_k\right)\) is the predictive density of the observed deaths \(d_{x,j}\) under model \(M_k\).
By construction, the averaged distribution maximizes the log likelihood of the observed number of deaths in the validation set among all distributions in the convex hull: \[\mathcal{C}=\left\{\sum_{k=1}^{K} w_{k} \times p\left(\cdot \mid M_{k}\right): \sum_{k} w_{k}=1, w_{k} \geq 0\right\}\]
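To make the optimization concrete, here is a minimal R sketch, independent of how StanMoMo computes the weights internally. It assumes a hypothetical matrix lpd of pointwise log predictive densities (one row per validation cell, one column per model) and maximizes the stacking objective over the simplex via a softmax parameterization:

# lpd: hypothetical matrix of log p(d_{x,j} | y_{1:N}, M_k) over validation cells
set.seed(1)
lpd <- matrix(rnorm(410 * 3, mean = -5), ncol = 3)

# Negative stacking objective: -sum_cells log( sum_k w_k p_k ), with weights
# parameterized through a softmax so that they stay on the simplex
neg_stacking <- function(theta, lpd) {
  w <- exp(c(theta, 0))
  w <- w / sum(w)
  -sum(log(exp(lpd) %*% w))
}

opt <- optim(rep(0, ncol(lpd) - 1), neg_stacking, lpd = lpd)
w_stack <- exp(c(opt$par, 0))
w_stack <- w_stack / sum(w_stack)
round(w_stack, 3)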
As an alternative approach, we consider an AIC-type weighting based on leave-future-out validation, called pseudo-BMA. This method is based on the expected log predictive density (elpd), an aggregate measure of predictive fit, which in our mortality model is given by \[ \operatorname{elpd}^{k}= \sum_{x=x_1}^{x_n} \sum_{j=t_{N+1}}^{t_{N+M}} \log p\left(d_{x,j} \mid y_{1: N},M_k\right). \] Typically, \(\operatorname{elpd}^{k}\) will be higher if model \(M_k\) better predicts the observed deaths in the validation set. The pseudo-BMA weights are then given by \[ w_{k}=\frac{\exp \left(\operatorname{elpd}^{k}\right)}{\sum_{k'=1}^{K} \exp \left(\operatorname{elpd}^{k'}\right)}. \] Hence, pseudo-BMA assigns each model a weight proportional to the exponential of its elpd. As discussed in Barigou et al. (2023), while this method provides a simple weighting mechanism, stacking tends to outperform pseudo-BMA.
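A minimal sketch of this weighting, reusing the hypothetical lpd matrix from the stacking example above; the shift by the maximum is only for numerical stability:

# elpd per model: sum of the log predictive densities over all validation cells
elpd <- colSums(lpd)

# Pseudo-BMA weights: exponentiate and normalize; subtracting the maximum
# avoids numerical underflow when the elpd values are large and negative
w_pbma <- exp(elpd - max(elpd))
w_pbma <- w_pbma / sum(w_pbma)
round(w_pbma, 3)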
The package includes the mortality_weights function, which performs Bayesian model averaging via stacking and pseudo-BMA. First, the user fits the different models with an extra argument validation, which represents the number of years of validation. In a second step, the user calls the mortality_weights function with the list of model fits as argument.
For instance, if we want to apply the averaging methods to the LC, RH and APC models fitted to French male data with 10 years of validation, for ages 50-90 and years 1970-2017, the weights can be obtained via
# Load the package and extract deaths and exposures for the chosen ages and years
library(StanMoMo)
ages.fit <- 50:90
years.fit <- 1970:2017
deathFR <- FRMaleData$Dxt[formatC(ages.fit), formatC(years.fit)]
exposureFR <- FRMaleData$Ext[formatC(ages.fit), formatC(years.fit)]

# Fit the three mortality models with 10 validation years each
fitLC <- lc_stan(death = deathFR, exposure = exposureFR, forecast = 10,
                 validation = 10, family = "poisson", cores = 4)
fitRH <- rh_stan(death = deathFR, exposure = exposureFR, forecast = 10,
                 validation = 10, family = "poisson", cores = 4)
fitAPC <- apc_stan(death = deathFR, exposure = exposureFR, forecast = 10,
                   validation = 10, family = "poisson", cores = 4)

# Compute the model weights by stacking and pseudo-BMA
model_weights <- mortality_weights(list(fitLC, fitRH, fitAPC))
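The resulting weights assigned to each model can then be inspected by printing the returned object (the exact layout of the printed output may depend on the package version):

# Display the stacking and pseudo-BMA weights for the three models
print(model_weights)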
Fernandez, C., Ley, E., & Steel, M. F. (2001). Benchmark priors for Bayesian model averaging. Journal of Econometrics, 100(2), 381-427.
Yao, Y., Vehtari, A., Simpson, D., & Gelman, A. (2018). Using stacking to average Bayesian predictive distributions (with discussion). Bayesian Analysis, 13(3), 917-1007.
Barigou, K., Goffard, P. O., Loisel, S., & Salhi, Y. (2023). Bayesian model averaging for mortality forecasting using leave-future-out validation. International Journal of Forecasting, 39(2), 674-690.