mira objects
model_parameters() can be used in combination with the mice package to deal with missing data, in particular to summarize regression models fitted to multiply imputed datasets. It computes pooled summaries of multiply imputed repeated regression analyses, i.e. of objects of class mira. Thus, model_parameters() for mira objects is comparable to the pool() function from mice, but it focuses only on the final summary of parameters and does not include the diagnostic statistics per estimate.
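The code that produced the output below is not shown in this text. A minimal sketch of what such a call could look like follows. It assumes the nhanes2 data shipped with mice, which contains the variables age, bmi, hyp and chl that appear in the output. Imputation is random, so the numbers from a fresh run will differ from those printed here.

```r
library(mice)
library(parameters)

# Assumption: nhanes2 (shipped with mice) is the example data; it holds
# age, bmi, hyp and chl, matching the variables in the output.
imp <- mice(nhanes2)

# with() fits the model once per imputed dataset and returns a mira object
fit <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))

# Pooled summary of the repeated analyses
model_parameters(fit)
```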
#>
#> iter imp variable
#> 1 1 bmi hyp chl
#> 1 2 bmi hyp chl
#> 1 3 bmi hyp chl
#> 1 4 bmi hyp chl
#> 1 5 bmi hyp chl
#> 2 1 bmi hyp chl
#> 2 2 bmi hyp chl
#> 2 3 bmi hyp chl
#> 2 4 bmi hyp chl
#> 2 5 bmi hyp chl
#> 3 1 bmi hyp chl
#> 3 2 bmi hyp chl
#> 3 3 bmi hyp chl
#> 3 4 bmi hyp chl
#> 3 5 bmi hyp chl
#> 4 1 bmi hyp chl
#> 4 2 bmi hyp chl
#> 4 3 bmi hyp chl
#> 4 4 bmi hyp chl
#> 4 5 bmi hyp chl
#> 5 1 bmi hyp chl
#> 5 2 bmi hyp chl
#> 5 3 bmi hyp chl
#> 5 4 bmi hyp chl
#> 5 5 bmi hyp chl
#> Parameter   | Coefficient |   SE |          95% CI |     t |    df |      p
#> ---------------------------------------------------------------------------
#> (Intercept) |       19.33 | 4.26 | [ 10.39, 28.27] |  4.54 | 18.26 | < .001
#> age [40-59] |       -6.32 | 1.95 | [-10.40, -2.23] | -3.25 | 18.26 | 0.004
#> age [60-99] |       -7.61 | 2.79 | [-13.47, -1.75] | -2.72 | 18.26 | 0.014
#> hyp [yes]   |        0.52 | 2.53 | [ -4.79,  5.83] |  0.21 | 18.26 | 0.840
#> chl         |        0.06 | 0.02 | [  0.01,  0.11] |  2.36 | 18.26 | 0.029
Not all packages work with with.mids() from package mice. For such modeling packages, multiply imputed repeated analyses cannot be run through with(), so the imputed data cannot be used with these models directly. We give an example for the GLMMadaptive package here.
First, we generate a dataset with missing values. We take the cbpp data from lme4 and randomly set some values of one predictor to missing. Then we impute the data, using mice() from package mice.
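library(mice)
library(lme4)
library(GLMMadaptive)
# cbpp ships with lme4; set 10 randomly chosen values of period to NA
data(cbpp)
cbpp$period[sample(seq_len(nrow(cbpp)), size = 10)] <- NA
# Impute the missing values; mice() creates 5 imputed datasets by default
imputed_data <- mice(cbpp)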
#>
#> iter imp variable
#> 1 1 period
#> 1 2 period
#> 1 3 period
#> 1 4 period
#> 1 5 period
#> 2 1 period
#> 2 2 period
#> 2 3 period
#> 2 4 period
#> 2 5 period
#> 3 1 period
#> 3 2 period
#> 3 3 period
#> 3 4 period
#> 3 5 period
#> 4 1 period
#> 4 2 period
#> 4 3 period
#> 4 4 period
#> 4 5 period
#> 5 1 period
#> 5 2 period
#> 5 3 period
#> 5 4 period
#> 5 5 period
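Before fitting any models, it can help to confirm that the imputation did what was intended. The following is a minimal sketch, not part of the original example. It repeats the setup so that it runs on its own, and suppresses the iteration log with printFlag = FALSE. The names n_missing_before and n_missing_after are chosen here for illustration only.

```r
library(mice)

# Same setup as above, repeated so the sketch is self-contained
data(cbpp, package = "lme4")
cbpp$period[sample(seq_len(nrow(cbpp)), size = 10)] <- NA
imputed_data <- mice(cbpp, printFlag = FALSE)

# Number of imputed datasets stored in the mids object
imputed_data$m

# Missing values in period: original data vs. first completed dataset
n_missing_before <- sum(is.na(cbpp$period))
n_missing_after <- sum(is.na(complete(imputed_data, 1)$period))
c(before = n_missing_before, after = n_missing_after)
```

complete(imputed_data, i) returns the i-th completed dataset as a regular data frame, which is what the workaround further down relies on.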
Using with() to fit the model to each imputed dataset fails.
fit <- with(data = imputed_data, expr = GLMMadaptive::mixed_model(
  cbind(incidence, size - incidence) ~ period,
  random = ~ 1 | herd,
  family = binomial
))
#> Error in as.data.frame(data) :
#> argument "data" is missing, with no default
However, we can use a workaround by fitting the regression model to each imputed dataset manually, using complete() from package mice. Please note that it is important to mimic the structure of mira objects! The returned model objects have to be saved in a list element named analyses. The enclosing object must be of class mira and list, and can then be used with model_parameters().
# Fit the model once per imputed dataset
analyses <- as.list(seq_len(imputed_data$m))
for (i in seq_along(analyses)) {
  data.i <- complete(imputed_data, i)
  analyses[[i]] <- mixed_model(
    cbind(incidence, size - incidence) ~ period,
    random = ~ 1 | herd,
    data = data.i,
    family = binomial
  )
}
# Mimic a mira object: the fitted models go into an element named "analyses"
object <- list(analyses = analyses)
class(object) <- c("mira", "matrix", "list")
model_parameters(object)
#> Parameter   | Coefficient |   SE |         95% CI |     z |      p
#> ------------------------------------------------------------------
#> (Intercept) |       -1.61 | 0.27 | [-2.13, -1.08] | -6.00 | < .001
#> period [2]  |       -0.42 | 0.32 | [-1.05,  0.21] | -1.32 | 0.187
#> period [3]  |       -0.86 | 0.37 | [-1.58, -0.14] | -2.35 | 0.019
#> period [4]  |       -1.09 | 0.47 | [-2.00, -0.17] | -2.33 | 0.020
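If the workaround is needed for several models, the loop and the class assignment can be wrapped in a small function. The sketch below is one way to do that. fit_each_imputation() and its argument fit_fun are hypothetical names introduced here, not functions from mice or parameters. To keep the example light, it is illustrated with lm() on the nhanes2 data shipped with mice rather than with GLMMadaptive.

```r
library(mice)

# Hypothetical helper (not part of mice or parameters): fit a model to every
# completed dataset and wrap the fits so that they mimic a mira object.
fit_each_imputation <- function(imp, fit_fun) {
  analyses <- lapply(seq_len(imp$m), function(i) {
    fit_fun(mice::complete(imp, i))
  })
  object <- list(analyses = analyses)
  class(object) <- c("mira", "matrix", "list")
  object
}

# Usage, illustrated with lm() on the nhanes2 data shipped with mice
imp <- mice(nhanes2, printFlag = FALSE)
fits <- fit_each_imputation(
  imp,
  function(d) lm(bmi ~ age + hyp + chl, data = d)
)
parameters::model_parameters(fits)
```

For the GLMMadaptive example, fit_fun would be a function that calls mixed_model() with data = d and the same formula, random and family arguments as in the loop above.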
mipo objects
It is also possible to compute summaries of pooled objects of class mipo.
#>
#> iter imp variable
#> 1 1 bmi hyp chl
#> 1 2 bmi hyp chl
#> 1 3 bmi hyp chl
#> 1 4 bmi hyp chl
#> 1 5 bmi hyp chl
#> 2 1 bmi hyp chl
#> 2 2 bmi hyp chl
#> 2 3 bmi hyp chl
#> 2 4 bmi hyp chl
#> 2 5 bmi hyp chl
#> 3 1 bmi hyp chl
#> 3 2 bmi hyp chl
#> 3 3 bmi hyp chl
#> 3 4 bmi hyp chl
#> 3 5 bmi hyp chl
#> 4 1 bmi hyp chl
#> 4 2 bmi hyp chl
#> 4 3 bmi hyp chl
#> 4 4 bmi hyp chl
#> 4 5 bmi hyp chl
#> 5 1 bmi hyp chl
#> 5 2 bmi hyp chl
#> 5 3 bmi hyp chl
#> 5 4 bmi hyp chl
#> 5 5 bmi hyp chl
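# imp is the imputed data (mids object) created by the mice() call logged above
fit <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))
# pool() combines the repeated analyses into a mipo object
pooled <- pool(fit)
model_parameters(pooled)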
#> Parameter   | Coefficient |   SE |          95% CI |     t |    df |      p
#> ---------------------------------------------------------------------------
#> (Intercept) |       17.94 | 3.76 | [  9.98, 25.90] |  4.77 | 16.44 | < .001
#> age [40-59] |       -5.09 | 2.20 | [-10.08, -0.10] | -2.31 |  8.94 | 0.046
#> age [60-99] |       -7.28 | 3.94 | [-19.69,  5.13] | -1.85 |  3.05 | 0.160
#> hyp [yes]   |        1.56 | 2.90 | [ -6.36,  9.49] |  0.54 | 15.54 | 0.617
#> chl         |        0.06 | 0.02 | [  0.01,  0.10] |  2.75 |  4.15 | 0.015