Let’s investigate the accuracy of opm’s parameter estimates on 200 simulated datasets.
First, let’s define the parameters used by the data-generating process:
rho <- .5
beta <- .5
sig2 <- 1
The following function generates a synthetic dataset of desired dimensions (N cases and T time points) and distribution parameters (\(\rho = \texttt{rho}\), \(\beta = \texttt{beta}\), and \(\sigma^2 = \texttt{sig2}\)):
generate <- function(N, T, rho, beta, sig2) {
f <- runif(N, -2, 2)
K <- length(beta)
beta <- matrix(beta, K, 1)
x <- array(.75*f, dim=c(N, K, T)) + rnorm(N*K*T, sd = 1)
y <- matrix(0, N, T)
for (t in 1:T) {
yy <- if (t>1) y[,t-1] else 0
y[,t] <- rho * yy + f + x[,,t] %*% beta + rnorm(N, sd = sqrt(sig2))
}
data.frame(i = rep(seq(N), T),
t = rep(seq(T), each = N),
as.data.frame(matrix(aperm(x, c(1, 3, 2)), N*T, K,
dimnames = list(NULL, paste0('x', seq(K))))),
y = c(y))
}
Now we generate a dataset with N=1000 cases and T=3 time points and fit the model to it 200 times:
library(OrthoPanels)
library(knitr)
N <- 1000
T <- 3
reps <- 200
set.seed(123)
opms <- replicate(n = reps,
opm(y~x1,
data = generate(N = N, T = T,
rho = rho,
beta = beta,
sig2 = sig2),
n.samp = 1000),
simplify = FALSE)
Let’s check the sampled parameters:
true_param <- c(rho = rho, sig2 = sig2, beta = beta)
est_param <- sapply(opms, coef)
resid <- sweep(est_param, 1, true_param)
rmse <- sqrt(rowMeans(resid^2))
kable(rbind(`True` = true_param,
`Est` = rowMeans(est_param),
`Bias` = rowMeans(resid),
`RMSE` = rmse))
rho | sig2 | beta | |
---|---|---|---|
True | 0.5000000 | 1.0000000 | 0.5000000 |
Est | 0.5043025 | 1.0090909 | 0.5025456 |
Bias | 0.0043025 | 0.0090909 | 0.0025456 |
RMSE | 0.0556957 | 0.0704548 | 0.0352854 |
Density plot for each parameter, with true value marked with a vertical line:
plot(density(sapply(opms, coef)[1,]),
main = 'Density of median of posterior samples of rho')
abline(v = rho, col='darkred')
plot(density(sapply(opms, coef)[2,]),
main = 'Density of median of posterior samples of sig2')
abline(v = sig2, col='darkred')
plot(density(sapply(opms, coef)[3,]),
main = 'Density of median of posterior samples of beta')
abline(v = beta, col='darkred')
The proportion of time the 95% credible interval includes the true value of the parameter:
cis <- sapply(lapply(opms, confint),
function(ci) {
ci[,'2.5%'] <= c(rho, sig2, beta) &
ci[,'97.5%'] >= c(rho, sig2, beta)
})
kable(rowSums(cis) / reps)
rho | 0.895 |
sig2 | 0.920 |
beta | 0.940 |