How do I calculate estimated marginal means using a multiply imputed dataset?
Calculating estimated marginal means with a single dataset using the effects package works fine. However, I don't know how to calculate marginal effects using multiply imputed datasets.
Here is what I've tried:
First, I fit a logistic regression and calculate marginal effects using the allEffects function from the effects package with the Cowles dataset from the carData package.
library(mice)
library(effects)
fit <- glm(volunteer ~ sex + extraversion, data = Cowles, family = binomial)
res <- setNames(data.frame(coef(fit), confint(fit)), c("estimate", "ll", "ul"))
res_eff <- as.data.frame(allEffects(fit))
This gives the following marginal effects for sex:
$sex
sex fit se lower upper
1 female 0.4458008 0.01793426 0.4109776 0.4811644
2 male 0.3859734 0.01938009 0.3487490 0.4245806
Now I introduce missing values into the explanatory variables of the Cowles dataset...
Cowles_NA <- Cowles
set.seed(42)
Cowles_NA[, -4] <- do.call(cbind.data.frame,
  lapply(Cowles_NA[, -4], function(x) {
    n <- nrow(Cowles_NA)
    x[sample(seq_len(n), floor(n / 10))] <- NA
    x
  })
)
... impute two datasets using the mice package...
Cowles_mi <- mice(Cowles_NA, m = 2, seed = 42)
... fit logistic regression models and pool the estimates.
fit_mi <- with(Cowles_mi, glm(volunteer ~ sex + extraversion, family = binomial))
res_mi <- summary(pool(fit_mi), conf.int = TRUE)[, c("estimate", "2.5 %", "97.5 %" )]
This works fine. However, when I try to calculate the marginal effects with fit_mi, I get an error message.
allEffects(fit_mi)
"Error in terms.default(model) : no terms component nor attribute"
My idea was then to calculate the marginal effects manually, which works fine for the category female of the variable sex.
res_eff_f_manual <- res["(Intercept)", "estimate"] + res["extraversion", "estimate"] * mean(Cowles$extraversion)
(res_eff_f_manual <- exp(res_eff_f_manual) / (exp(res_eff_f_manual) + 1))
[1] 0.4458008
(res_eff_f_package <- res_eff$sex[res_eff$sex == "female", "fit"])
[1] 0.4458008
However, when I do the same thing for the confidence interval, I get a CI that is far too wide (the lower limit is far too low).
res_eff_f_ll_manual <- res["(Intercept)", "ll"] + res["extraversion", "ll"] * mean(Cowles$extraversion)
(res_eff_f_ll_manual <- exp(res_eff_f_ll_manual) / (exp(res_eff_f_ll_manual) + 1))
[1] 0.2812161
res_eff$sex[res_eff$sex == "female", "lower"]
[1] 0.4109776
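I suspect the mismatch arises because plugging the lower limits of the individual coefficients into the formula ignores the covariance between the intercept and the slope estimates. A minimal sketch of the usual alternative for the complete-data model: compute the standard error of the linear predictor from vcov(fit), build the interval on the logit scale, and back-transform. The design vector x and the use of a normal quantile are my assumptions here, not something taken from the effects documentation.

```r
library(carData)  # Cowles data

fit <- glm(volunteer ~ sex + extraversion, data = Cowles, family = binomial)

# Design vector: sex = female (reference level), extraversion at its mean.
# Order matches coef(fit): (Intercept), sexmale, extraversion.
x <- c(1, 0, mean(Cowles$extraversion))

eta    <- sum(x * coef(fit))                    # linear predictor on the logit scale
se_eta <- sqrt(drop(t(x) %*% vcov(fit) %*% x))  # SE that accounts for covariances
z      <- qnorm(0.975)                          # assumption: normal-based 95% interval

# Back-transform the point estimate and both limits to the probability scale
ci <- plogis(c(fit = eta, lower = eta - z * se_eta, upper = eta + z * se_eta))
ci
```

If this reasoning is right, the resulting limits should be close to the lower and upper values reported by allEffects for female, but I don't know how to carry this over to the pooled results.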
Are there any packages that allow calculating estimated marginal means from imputed data (if possible, produced with mice), or is there another way to calculate the marginal effects with multiply imputed data?