miceadds (version 2.2-0)

mice.impute.pls: Imputation using Partial Least Squares for Dimension Reduction

Description

This function imputes a variable with missing values using PLS regression (Mevik & Wehrens, 2007) for a dimension reduction of the predictor space.
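As a rough illustration of the dimension-reduction idea, the following base R sketch extracts PLS scores from the observed cases with a NIPALS-style PLS1 algorithm and then uses these scores as predictors. It is not the code used by mice.impute.pls: the helper name pls_scores and the simulated data are hypothetical, and the last step is a plain regression prediction rather than a proper norm or pmm draw.

```r
# Hypothetical helper (not part of miceadds): PLS1 scores whose weights are
# estimated on the observed cases (ry == TRUE) and applied to all rows of x.
pls_scores <- function(x, y, ry, ncomp) {
  ctr <- colMeans(x[ry, , drop = FALSE])
  xc  <- sweep(x, 2, ctr)                      # center with observed-case means
  xa  <- xc[ry, , drop = FALSE]                # predictors, deflated in the loop
  ya  <- matrix(y[ry] - mean(y[ry]), ncol = 1) # centered outcome, also deflated
  W <- P <- matrix(0, ncol(x), ncomp)
  for (a in seq_len(ncomp)) {
    w  <- crossprod(xa, ya)                    # weight vector: direction of X'y
    w  <- w / sqrt(sum(w^2))
    sc <- xa %*% w                             # scores of component a
    ss <- sum(sc^2)
    p  <- crossprod(xa, sc) / ss               # x loadings
    q  <- sum(ya * sc) / ss                    # y loading
    xa <- xa - sc %*% t(p)                     # deflate x
    ya <- ya - q * sc                          # deflate y
    W[, a] <- w
    P[, a] <- p
  }
  # Rotation W (P'W)^{-1} maps centered x directly to scores
  xc %*% W %*% solve(crossprod(P, W))
}

# Simulated data (assumption, for illustration only)
set.seed(1)
n  <- 60
x  <- matrix(rnorm(n * 5), n, 5)
y  <- as.vector(x %*% c(1, -1, 0.5, 0, 0) + rnorm(n))
ry <- rep(TRUE, n)
ry[1:12] <- FALSE
y[!ry] <- NA

facs <- pls_scores(x, y, ry, ncomp = 2)

# Regress the observed y on the PLS scores and predict the missing entries.
# A real imputation method would add a random draw at this point.
b    <- coef(lm(y[ry] ~ facs[ry, , drop = FALSE]))
pred <- as.vector(cbind(1, facs[!ry, , drop = FALSE]) %*% b)
```

The number of components (ncomp here) plays the role that pls.facs plays in the function documented on this page.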

Usage

mice.impute.pls(y, ry, x, type, pls.facs = NULL, pls.impMethod = "pmm",
    pls.impMethodArgs = NULL, pls.print.progress = TRUE,
    imputationWeights = rep(1, length(y)), pcamaxcols = 1E+09,
    min.int.cor = 0, min.all.cor = 0, N.largest = 0, pls.title = NULL,
    print.dims = TRUE, pls.maxcols = 5000, envir_pos = NULL,
    extract_data = TRUE, ...)

Arguments

y
Incomplete data vector of length n
ry
Vector of missing data pattern (FALSE -- missing, TRUE -- observed)
x
Matrix (n x p) of complete covariates.
type
type=1 -- the variable is used as a predictor.

type=4 -- create interactions of the specified variable with all other predictors.

type=5 -- create a quadratic term of the specified variable.

type=6 -- if some interactions are specified, ignore the variables with entry 6 when creating interactions.

type=-2 -- specification of a cluster variable. The cluster mean of the outcome y (computed without the subject under study) is included as a further predictor in the imputation.

pls.facs
Number of factors used in PLS regression. This argument can also be specified as a list defining different numbers of factors for all variables to be imputed.
pls.impMethod
Imputation method applied after the PLS dimension reduction. Any imputation method can be used unless imputationWeights is provided, in which case only norm and pmm are available. The method xplsfacs only creates the PLS factors of the regression model.
pls.impMethodArgs
Arguments for imputation method pls.impMethod.
pls.print.progress
Print progress during PLS regression.
imputationWeights
Vector of sample weights to be used in imputation models.
pcamaxcols
Maximum number of principal components.
min.int.cor
Minimum absolute correlation for an interaction of two predictors to be included in the PLS regression model.
min.all.cor
Minimum absolute correlation for inclusion in the PLS regression model.
N.largest
Number of variables to be included, chosen as those with the largest absolute correlations.
pls.title
Title for progress print in console output.
print.dims
An optional logical indicating whether dimensions of inputs should be printed.
pls.maxcols
Maximum number of interactions to be created.
envir_pos
Position of the environment from which the data should be extracted.
extract_data
Logical indicating whether input data should be extracted from the parent environment within the mice::mice routine.
...
Further arguments to be passed.
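
The predictor added for type=-2, a cluster mean of y that leaves out the subject under study, can be sketched in base R as follows. The function name loo_cluster_mean is hypothetical, and the treatment of missing y values (they are skipped when forming the mean) is an assumption of this sketch, not a statement about the package internals.

```r
# Hypothetical helper: leave-one-out cluster mean of y.
# Assumption: missing y values contribute nothing to sums or counts.
loo_cluster_mean <- function(y, cluster) {
  obs <- !is.na(y)
  y0  <- ifelse(obs, y, 0)
  s   <- ave(y0, cluster, FUN = sum)               # cluster sum of observed y
  n   <- ave(as.numeric(obs), cluster, FUN = sum)  # cluster count of observed y
  num <- s - y0                                    # remove own contribution
  den <- n - obs                                   # remove own count if observed
  ifelse(den > 0, num / den, NA_real_)
}

y       <- c(1, 2, 3, 10, 20, NA)
cluster <- c(1, 1, 1, 2, 2, 2)
cm <- loo_cluster_mean(y, cluster)
# cm is c(2.5, 2, 1.5, 20, 10, 15)
```

Leaving out the subject's own value keeps the cluster mean from containing the very outcome it is meant to help predict.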

Value

A vector of length nmis=sum(!ry) with imputations if pls.impMethod != "xplsfacs". If pls.impMethod == "xplsfacs", a matrix with PLS factors is returned instead.
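
The length of the returned vector follows the usual (y, ry, x, ...) calling convention for univariate imputation functions. A minimal sketch with a hypothetical toy method, mice.impute.toy, which simply resamples observed values and is unrelated to PLS:

```r
# Hypothetical toy method: one imputed value per missing entry of y,
# drawn at random from the observed values.
mice.impute.toy <- function(y, ry, x, ...) {
  nmis <- sum(!ry)                      # number of values to impute
  yobs <- y[ry]
  yobs[sample.int(length(yobs), size = nmis, replace = TRUE)]
}

set.seed(42)
y  <- c(3.1, NA, 4.7, NA, 5.0, 2.2)
ry <- !is.na(y)                         # TRUE = observed, FALSE = missing
x  <- matrix(rnorm(12), 6, 2)           # complete covariates (unused by the toy)
imp <- mice.impute.toy(y, ry, x)
```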

References

Mevik, B. H., & Wehrens, R. (2007). The pls package: Principal component and partial least squares regression in R. Journal of Statistical Software, 18, 1-24.

Examples

## Not run: 
# #############################################################################
# # EXAMPLE 1: PLS imputation method for internet data
# #############################################################################	
# 
# data(data.internet)
# dat <- data.internet
# 
# # specify predictor matrix
# predictorMatrix <- matrix( 1 , ncol(dat) , ncol(dat) )
# rownames(predictorMatrix) <- colnames(predictorMatrix) <- colnames(dat)
# diag( predictorMatrix) <- 0
# 
# # use PLS imputation method for all variables
# impMethod <- rep( "pls" , ncol(dat) )
# names(impMethod) <- colnames(dat)
# 
# # define predictors for interactions (entries with type 4 in predictorMatrix)
# predictorMatrix[c("IN1","IN15","IN16"),c("IN1","IN3","IN10","IN13")] <- 4
# # define predictors which should appear as linear and quadratic terms (type 5)
# predictorMatrix[c("IN1","IN8","IN9","IN10","IN11"),c("IN1","IN2","IN7","IN5")] <- 5
# 
# # use 9 PLS factors for all variables
# pls.facs <- as.list( rep( 9 , length(impMethod) ) )
# names(pls.facs) <- names(impMethod)
# pls.facs$IN1 <- 15   # use 15 PLS factors for variable IN1
# 
# # choose norm or pmm imputation method
# pls.impMethod <- as.list( rep("norm" , length(impMethod) ) )
# names(pls.impMethod) <- names(impMethod)
# pls.impMethod[ c("IN1","IN6")] <- "pmm5"   
# 
# # some arguments for imputation method  
# pls.impMethodArgs <- list( "IN1" = list( "donors" = 10 ) , 
#                            "IN2" = list( "ridge2" = 1E-4 ) )
# 
# # Model 1: Three parallel chains
# imp1 <- mice::mice(data = dat , imputationMethod = impMethod ,  
#      m=3 , maxit=5 , predictorMatrix = predictorMatrix ,
#      pls.facs = pls.facs , # number of PLS factors
#      pls.impMethod = pls.impMethod ,  # Imputation Method in PLS imputation
#      pls.impMethodArgs = pls.impMethodArgs , # arguments for imputation method
#      pls.print.progress = TRUE )
# summary(imp1)
# 
# # Model 2: One long chain
# imp2 <- mice.1chain(data = dat , imputationMethod = impMethod ,  
#      burnin=10 , iter=21 , Nimp=3 , predictorMatrix = predictorMatrix ,
#      pls.facs = pls.facs , pls.impMethod = pls.impMethod ,
#      pls.impMethodArgs = pls.impMethodArgs )
# summary(imp2)
# ## End(Not run)