pool.compare: Compare two nested models fitted to imputed data

Description

Compares two nested models after m repeated complete-data analyses.

Usage

pool.compare(fit1, fit0, method = c("wald", "likelihood"), data = NULL)

Arguments

fit1

An object of class 'mira', produced by with.mids().

fit0

An object of class 'mira', produced by with.mids(). The model in fit0 must be nested within the model in fit1.

method

Either "wald" or "likelihood" specifying the type of comparison. The default is "wald".

data

No longer used.

Value

A list containing the following components:

call: The call to the pool.compare function.

call11: The call that created fit1.

call12: The call that created the imputations used by fit1.

call01: The call that created fit0.

call02: The call that created the imputations used by fit0.

method: The method used to compare the two models: 'wald' or 'likelihood'.

nmis: The number of missing entries for each variable.

m: The number of imputations.

qhat1: A matrix containing the estimated coefficients of the m repeated complete-data analyses from fit1.

qhat0: A matrix containing the estimated coefficients of the m repeated complete-data analyses from fit0.

ubar1: The mean of the variances of fit1, formula (3.1.3), Rubin (1987).

ubar0: The mean of the variances of fit0, formula (3.1.3), Rubin (1987).

qbar1: The pooled estimate of fit1, formula (3.1.2), Rubin (1987).

qbar0: The pooled estimate of fit0, formula (3.1.2), Rubin (1987).

Dm: The test statistic.

rm: The relative increase in variance due to nonresponse, formula (3.1.7), Rubin (1987).

df1: The numerator degrees of freedom. Under the null hypothesis, Dm is assumed to follow an F distribution with (df1, df2) degrees of freedom.

df2: The denominator degrees of freedom of that F distribution.

pvalue: The p-value for testing whether the model fit1 is statistically different from the smaller model fit0.
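
The pooled quantities referred to above (formulas (3.1.2), (3.1.3) and (3.1.7) of Rubin, 1987) can be illustrated with a small base-R sketch. The numbers are made up for illustration and are not output from mice. The sketch uses the scalar, per-coefficient form of the formulas and ignores covariances between coefficients, so it is not necessarily how pool.compare computes its components internally.

```r
# Toy data (hypothetical): rows are m = 3 imputations, columns are 2 coefficients.
qhat <- rbind(c(1.0, 2.0),
              c(1.2, 1.8),
              c(0.8, 2.2))

# Hypothetical complete-data variances of each coefficient, one row per imputation.
u <- rbind(c(0.10, 0.20),
           c(0.12, 0.18),
           c(0.08, 0.22))

m <- nrow(qhat)

qbar <- colMeans(qhat)        # pooled estimate, formula (3.1.2)
ubar <- colMeans(u)           # mean within-imputation variance, formula (3.1.3)
b    <- apply(qhat, 2, var)   # between-imputation variance
r    <- (1 + 1 / m) * b / ubar  # relative increase in variance, formula (3.1.7)

print(qbar)
print(ubar)
print(r)
```

In a real analysis, the corresponding values are available as components of the returned list, for example pc$qbar1, pc$ubar1 and pc$rm.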

Details

The function is based on Meng and Rubin (1992). The Wald method is described in section 2.2 of that article and the likelihood method in section 3. The Wald method is suitable for comparing linear models fitted with, e.g., lm() in with.mids(). The likelihood method should be used for logistic regression models fitted with glm() in with.mids().

The function assumes that fit1 is the larger model and that the model in fit0 is fully contained in fit1. With method = 'wald', the function tests the null hypothesis that the extra parameters in fit1 are all zero.
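
For orientation, the Wald-type statistic for multiply imputed data can be sketched as below. The notation is introduced here for illustration and is an assumption: k is the number of extra parameters in fit1, \bar{q} is their pooled estimate, and \bar{U} and B are the within- and between-imputation covariance matrices of those parameters. The exact computation inside pool.compare may differ in detail.

```latex
r_m = \left(1 + \frac{1}{m}\right)
      \frac{\operatorname{tr}\!\left(B\,\bar{U}^{-1}\right)}{k},
\qquad
D_m = \frac{\bar{q}^{\top}\,\bar{U}^{-1}\,\bar{q}}{k\,(1 + r_m)}
```

Under the null hypothesis that the extra parameters are zero, D_m is compared with an F distribution whose numerator degrees of freedom equal k. A large r_m, meaning much between-imputation variability, shrinks the statistic.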

References

Li, K.H., Meng, X.L., Raghunathan, T.E. and Rubin, D. B. (1991). Significance levels from repeated p-values with multiply-imputed data. Statistica Sinica, 1, 65-92.

Meng, X.L. and Rubin, D.B. (1992). Performing likelihood ratio tests with multiply-imputed data sets. Biometrika, 79, 103-111.

Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Surveys. New York: John Wiley and Sons.

van Buuren, S. and Groothuis-Oudshoorn, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. https://www.jstatsoft.org/v45/i03/

Examples

# NOT RUN {
library(mice)

### To compare two linear models:
imp <- mice(nhanes2, seed = 51009, print = FALSE)
mi1 <- with(data = imp, expr = lm(bmi ~ age + hyp + chl))
mi0 <- with(data = imp, expr = lm(bmi ~ age + hyp))
pc  <- pool.compare(mi1, mi0)
pc$pvalue

### To compare two generalized linear models (logistic regression):
imp  <- mice(boys, maxit = 2, print = FALSE)
fit1 <- with(imp, glm(gen > levels(gen)[1] ~ hgt + hc + reg, family = binomial))
fit0 <- with(imp, glm(gen > levels(gen)[1] ~ hgt + hc, family = binomial))
pool.compare(fit1, fit0, method = 'likelihood')$pvalue

# using factors
fit1 <- with(imp, glm(as.factor(gen > levels(gen)[1]) ~ hgt + hc + reg, family = binomial))
fit0 <- with(imp, glm(as.factor(gen > levels(gen)[1]) ~ hgt + hc, family = binomial))
pool.compare(fit1, fit0, method = 'likelihood')$pvalue
# }
