pool.compare: Compare two nested models fitted to imputed data

Description

Compares two nested models after m repeated complete data analyses.

Usage

pool.compare(fit1, fit0, data = NULL, method = "Wald")

Arguments

fit1

An object of class 'mira', produced by with.mids().

fit0

An object of class 'mira', produced by with.mids(). The model in fit0 should be a submodel of fit1. Moreover, the variables of the submodel should come first in the larger model, in the same order as in the submodel.

data

The original mids object. It must be supplied when method='likelihood'. The default is NULL, which is sufficient for method='Wald'.

method

A string naming the method used to compare the two models. Two kinds of comparison are available so far: 'Wald' and 'likelihood'.

Value

A list containing the following components.

Component call is the call to the pool.compare function. Component call11 is the call that created fit1, and component call12 is the call that created the corresponding imputations. Component call01 is the call that created fit0, and component call02 is the call that created the corresponding imputations.

Component method is the method used to compare the two models: 'Wald' or 'likelihood'. Component nmis is the number of missing entries for each variable. Component m is the number of imputations.

Components qhat1 and qhat0 are matrices containing the estimated coefficients of the m repeated complete data analyses from fit1 and fit0, respectively. Components ubar1 and ubar0 are the means of the variances of fit1 and fit0, formula (3.1.3), Rubin (1987). Components qbar1 and qbar0 are the pooled estimates of fit1 and fit0, formula (3.1.2), Rubin (1987).

Component Dm is the test statistic. Component rm is the relative increase in variance due to nonresponse, formula (3.1.7), Rubin (1987). Components df1 and df2 are the degrees of freedom: under the null hypothesis, Dm is assumed to have an F distribution with (df1, df2) degrees of freedom. Component pvalue is the p-value of the test of whether the larger model differs significantly from the smaller submodel.
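For orientation, the pooled quantities cited from Rubin (1987) can be written out as below. This is a sketch for a scalar estimand; the symbols \hat{Q}_l and U_l (the estimate and its variance from the l-th completed data set) and B_m (the between-imputation variance) are notation assumed here and are not defined on this page. pool.compare works with the vector and matrix analogues, where qbar corresponds to \bar{Q}_m, ubar to \bar{U}_m and rm to r_m.

```latex
\bar{Q}_m = \frac{1}{m}\sum_{l=1}^{m} \hat{Q}_l, \qquad
\bar{U}_m = \frac{1}{m}\sum_{l=1}^{m} U_l, \qquad
B_m = \frac{1}{m-1}\sum_{l=1}^{m} \bigl(\hat{Q}_l - \bar{Q}_m\bigr)^2, \qquad
r_m = \frac{(1 + m^{-1})\, B_m}{\bar{U}_m}
```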

Details

The function is based on the article of Meng and Rubin (1992). The Wald method can be found in section 2.2 and the likelihood method in section 3. The Wald method can be used to compare linear models obtained with, e.g., lm() in with.mids(). The likelihood method should be used for logistic regression models obtained with glm() in with.mids(). It is assumed that fit1 contains the larger model and that the model in fit0 is fully contained in fit1. With method='Wald', the null hypothesis tested is that the extra parameters are all zero.
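To make the Wald comparison more concrete, the following base R sketch computes a multi-parameter Wald statistic from m sets of estimates of the extra parameters. It is an illustration under stated assumptions, not the package's code: the function name wald_mi, the simulated inputs, and the particular degrees-of-freedom approximation are all assumptions introduced here, chosen to mirror the components Dm, rm, df1, df2 and pvalue described under Value.

```r
# Hypothetical inputs: m = 5 imputations, k = 2 extra parameters.
set.seed(1)
m <- 5
k <- 2
# qhat: m x k matrix, one row of extra-parameter estimates per imputation
qhat <- matrix(rnorm(m * k, mean = 0.5, sd = 0.1), nrow = m, ncol = k)
# u: list of m within-imputation covariance matrices (k x k), made up here
u <- replicate(m, diag(0.04, k), simplify = FALSE)

# wald_mi is a hypothetical helper, not part of mice
wald_mi <- function(qhat, u) {
  m <- nrow(qhat)
  k <- ncol(qhat)
  qbar <- colMeans(qhat)          # pooled estimate of the extra parameters
  ubar <- Reduce(`+`, u) / m      # mean within-imputation covariance
  b    <- cov(qhat)               # between-imputation covariance (divisor m - 1)
  # relative increase in variance, averaged over the k parameters
  r_m  <- (1 + 1 / m) * sum(diag(b %*% solve(ubar))) / k
  # Wald statistic for H0: all extra parameters are zero
  dm   <- as.numeric(t(qbar) %*% solve(ubar) %*% qbar) / (k * (1 + r_m))
  # assumed approximation for the denominator degrees of freedom
  v    <- k * (m - 1)
  df2  <- if (v > 4) {
    4 + (v - 4) * (1 + (1 - 2 / v) / r_m)^2
  } else {
    v * (1 + 1 / k) * (1 + 1 / r_m)^2 / 2
  }
  list(Dm = dm, rm = r_m, df1 = k, df2 = df2,
       pvalue = pf(dm, k, df2, lower.tail = FALSE))
}

res <- wald_mi(qhat, u)
str(res)
```

In practice the estimates and covariance matrices would come from the fitted models in fit1 rather than from simulated numbers.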

References

Li, K.H., Meng, X.L., Raghunathan, T.E. and Rubin, D.B. (1991). Significance levels from repeated p-values with multiply-imputed data. Statistica Sinica, 1, 65-92.

Meng, X.L. and Rubin, D.B. (1992). Performing likelihood ratio tests with multiply-imputed data sets. Biometrika, 79, 103-111.

van Buuren, S. and Groothuis-Oudshoorn, K. (2011). mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), 1-67. http://www.jstatsoft.org/v45/i03/

Examples

### To compare two linear models:
imp <- mice(nhanes2)
mi1 <- with(data=imp, expr=lm(bmi~age+hyp+chl))
mi0 <- with(data=imp, expr=lm(bmi~age+hyp))
pc  <- pool.compare(mi1, mi0, method='Wald')
pc$pvalue
#            [,1]
#[1,] 0.000293631
#

### To compare two generalized linear models (logistic regression):
## Not run: 
# imp  <- mice(boys, maxit=2)
# fit0 <- with(imp, glm(gen>levels(gen)[1] ~ hgt+hc,family=binomial))
# fit1 <- with(imp, glm(gen>levels(gen)[1] ~ hgt+hc+reg,family=binomial))
# pool.compare(fit1, fit0, method='likelihood', data=imp)
## End(Not run)
