mice(data, m = 5,
    imputationMethod = vector("character", length = ncol(data)),
    predictorMatrix = (1 - diag(1, ncol(data))),
    visitSequence = (1:ncol(data))[apply(is.na(data), 2, any)],
    defaultImputationMethod = c("pmm", "logreg", "polyreg"),
    maxit = 5,
    diagnostics = TRUE,
    printFlag = TRUE,
    seed = NA)
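The default expressions for predictorMatrix and visitSequence in the usage above can be evaluated by hand to see what they produce. The following is a minimal sketch in base R; the toy data frame and its values are made up for illustration and are not part of the package.

```r
# Toy data frame with missing values in two of three columns (made-up values).
data <- data.frame(
  age = c(1, 2, 3, 1),
  bmi = c(NA, 22.7, NA, 25.1),
  chl = c(187, NA, 199, 204)
)

# Default predictorMatrix: every column predicts every other column,
# with zeros on the diagonal so that no column predicts itself.
predictorMatrix <- 1 - diag(1, ncol(data))
dimnames(predictorMatrix) <- list(names(data), names(data))

# Default visitSequence: indices of the columns that contain at least one NA.
visitSequence <- (1:ncol(data))[apply(is.na(data), 2, any)]

print(predictorMatrix)
print(visitSequence)
```

Here age is complete, so it is left out of the visit sequence but still serves as a predictor for the other two columns.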
predictorMatrix: a square matrix of size ncol(data) containing 0/1 data
specifying the set of predictors to be used for each target column. Rows
correspond to target variables (i.e. variables to be imputed), in the
sequence in which they appear in data.
diagnostics: if TRUE, diagnostic information will be appended to the
value of the function. If FALSE, only the imputed data are saved. The
default is TRUE.
The function returns an object of class mids.
The ~ mechanism can be used to ensure that a data transform always
depends on the most recently generated imputations in the untransformed
(active) column.
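The idea of keeping a transform in step with its active column can be sketched in base R. This is only an illustration of the principle, not how mice implements it: the column names, the random-draw imputation step and the number of iterations are all assumptions.

```r
# Hypothetical data: bmi is the active column, log_bmi is its transform.
dat <- data.frame(bmi = c(22.7, NA, 30.1, NA))
missing_bmi <- is.na(dat$bmi)
observed <- dat$bmi[!missing_bmi]

set.seed(1)
for (iteration in 1:3) {
  # Stand-in imputation step: draw replacements from the observed bmi values.
  draws <- sample.int(length(observed), sum(missing_bmi), replace = TRUE)
  dat$bmi[missing_bmi] <- observed[draws]

  # Recompute the transform from the freshly imputed active column, so the
  # two columns never disagree.
  dat$log_bmi <- log(dat$bmi)
}

print(dat)
```

Because log_bmi is derived rather than imputed separately, it stays consistent with bmi after every iteration.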
The data may contain categorical variables that are used in
regressions on other variables. The algorithm creates dummy variables
for the categories of these variables, and imputes these from the
corresponding categorical variable.
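As a rough illustration of dummy coding, base R's model.matrix() expands a factor into 0/1 columns. Whether mice uses this function or this particular coding internally is not stated here, so treat the sketch as an assumption; the region variable and its values are hypothetical.

```r
# Hypothetical categorical predictor with three categories.
dat <- data.frame(
  region = factor(c("north", "south", "west", "south")),
  bmi = c(22.7, 25.1, 30.1, 27.4)
)

# model.matrix() expands the factor into 0/1 dummy columns. With the default
# treatment contrasts the first level ("north") is the reference category,
# so it gets no column of its own. The intercept column is dropped.
dummies <- model.matrix(~ region, data = dat)[, -1, drop = FALSE]

print(dummies)
```

The dummy columns are a function of the factor alone, so they can be rebuilt from the categorical variable whenever it changes.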
Built-in imputation methods include "norm", "pmm", "logreg", "polyreg"
and "sample".
For example, for the j'th column, the impute.norm function that
implements the Bayesian linear regression method can be called by
specifying the string "norm" as the j'th entry in the vector of strings.
The user can write his or her own imputation function, say
impute.myfunc, and call it for all columns by specifying
imputationMethod="myfunc", or for specific columns by specifying
imputationMethod=c("norm","myfunc",...).
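A user-written method might look like the sketch below. The argument list (y, ry, x) is an assumption about what the package passes to an impute.* function and is not confirmed by this page. The body is a deliberately simple random draw from the observed values, and it is called by hand here rather than through mice.

```r
# Hypothetical user-written method. Assumed arguments (not confirmed here):
#   y  - the incomplete target column
#   ry - logical vector, TRUE where y is observed
#   x  - matrix of predictors (ignored by this simple method)
impute.myfunc <- function(y, ry, x) {
  # Return one randomly chosen observed value for every missing entry.
  observed <- y[ry]
  observed[sample.int(length(observed), sum(!ry), replace = TRUE)]
}

# Call the function directly on made-up data to check its behaviour.
set.seed(123)
y <- c(22.7, NA, 30.1, NA, 27.4)
ry <- !is.na(y)
x <- matrix(seq_along(y), ncol = 1)

imputed <- impute.myfunc(y, ry, x)
y[!ry] <- imputed
print(y)
```

The function returns only the replacement values, one per missing entry, which keeps it easy to test in isolation.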
side effects:
Some elementary imputation methods require access to the nnet or MASS
libraries of Venables & Ripley. Where needed, these libraries will be
attached.
see also:
complete, mids, lm.mids, set.seed
data(nhanes)
imp <- mice(nhanes) # do default multiple imputation on a numeric matrix
imp
imp$imputations$bmi # and list the actual imputations
complete(imp) # show the first completed data matrix
lm.mids(chl~age+bmi+hyp, imp) # repeated linear regression on imputed data
data(nhanes2)
# imputation on mixed data with a different method per column
mice(nhanes2, imputationMethod = c("sample", "pmm", "logreg", "norm"))