yaImpute (version 1.0-32)

impute.yai: Impute variables from references to targets

Description

Imputes values of variables from reference observations to target observations. It also imputes values for a reference observation from other references, which is useful for validation (see yai). Variables not available in the original data may be imputed using the ancillaryData argument.

Usage

# S3 method for yai
impute(object,ancillaryData=NULL,method="closest",
       method.factor=method,k=NULL,vars=NULL,
       observed=TRUE,...)

Value

An object of class c("impute.yai","data.frame"), with rownames identifying observations and column names identifying variables. When observed=TRUE, additional columns are created with a suffix of .o. NAs fill columns of observed values when no corresponding value is known, as is the case for Y-variables from target observations.

Scale factors for each variable are returned as an attribute (see attributes).
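
As a rough illustration of the returned structure, the sketch below reuses the iris setup from the Examples section and separates the observed columns from the imputed ones. It assumes the yaImpute package is installed; the names imp and obs_cols are illustrative only.

```r
library(yaImpute)

data(iris)
refs <- sample(rownames(iris), 50)   # 50 reference observations
x <- iris[, 1:3]                     # X-variables for all observations
y <- iris[refs, 4:5]                 # Y-variables for references only
mal <- yai(x = x, y = y, method = "mahalanobis")

imp <- impute(mal, observed = TRUE)

# Columns holding observed values carry the ".o" suffix.
obs_cols <- grep("\\.o$", names(imp), value = TRUE)
print(obs_cols)

# Observed Y-values are unknown for target observations,
# so those rows contain NA in the ".o" columns.
print(colSums(is.na(imp[, obs_cols, drop = FALSE])))
```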

Arguments

object

an object of class yai.

ancillaryData

a data frame of variables that may not have been used in the original call to yai. There must be one row for each reference observation, no missing data, and row names must match those used in the reference observations.
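
The requirements on ancillaryData can be checked before calling impute. The following is a minimal base-R sketch, not part of yaImpute: check_ancillary is a hypothetical helper, and ref_names stands for the row names of the reference observations, however they were obtained.

```r
# Hypothetical helper: verify that every reference observation has a row
# in ancillaryData and that the data frame contains no missing values.
check_ancillary <- function(ancillaryData, ref_names) {
  stopifnot(is.data.frame(ancillaryData))
  missing_rows <- setdiff(ref_names, rownames(ancillaryData))
  list(
    all_refs_present = length(missing_rows) == 0,
    no_missing_data  = !anyNA(ancillaryData)
  )
}

# Illustrative data: three reference row names taken from iris.
ref_names <- c("1", "5", "9")
res_ok  <- check_ancillary(iris[ref_names, ], ref_names)
res_bad <- check_ancillary(iris[ref_names[-1], ], ref_names)
print(res_ok)
print(res_bad)
```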

method

the method used to compute the imputed values for continuous variables, as follows:
closest: use the single neighbor that is closest (this is the default and is always used when k=1);
mean: the mean of the k neighbors is taken;
median: the median of the k neighbors is taken;
dstWeighted: a weighted mean is taken over the k neighbors where the weights are 1/(1+d).
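
The four options can be compared on a small hand-made case. This is a base-R sketch of the arithmetic described above, not the package's internal code; vals and d are made-up values for one variable on k = 3 neighbors, ordered nearest first.

```r
# Made-up inputs: values of one continuous variable on the k = 3 nearest
# reference observations, and the distances d to those neighbors.
vals <- c(2.0, 3.0, 7.0)
d    <- c(0.0, 1.0, 3.0)

imputed <- c(
  closest     = vals[1],                              # nearest neighbor only
  mean        = mean(vals),                           # plain mean of the k values
  median      = median(vals),                         # median of the k values
  dstWeighted = weighted.mean(vals, w = 1 / (1 + d))  # weights 1/(1+d)
)
print(imputed)
```

Closer neighbors receive larger weights under dstWeighted, so its result is pulled toward the nearest values relative to the plain mean.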

method.factor

the method used to compute the imputed values for factors, as follows:
closest: use the single neighbor that is closest (this is the default and is always used when k=1);
mean or median: actually the mode, that is, the factor level that occurs most often among the k neighbors;
dstWeighted: a mode where the count is the sum of the weights (1/(1+d)) rather than each having a weight of 1.
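
The difference between the plain mode and the distance-weighted mode can be seen in a small base-R sketch. The inputs are made up, and breaking ties by taking the first maximum (which.max) is an assumption of this sketch, not a statement about how yaImpute resolves ties.

```r
# Made-up inputs: factor levels on the k = 3 nearest references and the
# distances to them (nearest first).
lev <- factor(c("setosa", "versicolor", "versicolor"))
d   <- c(0, 2, 2)
w   <- 1 / (1 + d)

# Plain mode: each neighbor counts once.
counts     <- table(lev)
plain_mode <- names(counts)[which.max(counts)]

# Weighted mode: each neighbor contributes its weight 1/(1+d).
wsum          <- tapply(w, lev, sum)
weighted_mode <- names(wsum)[which.max(wsum)]

print(plain_mode)
print(weighted_mode)
```

Here the single very close neighbor outweighs the two more distant ones, so the two rules select different levels.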

k

the number of neighbors to use in averages; when NULL, all present are used.

vars

a character vector of variables to impute. When NULL, the behaviour depends on the value of ancillaryData: when ancillaryData is NULL, the Y-variables are imputed; otherwise, all variables present in ancillaryData are imputed.

observed

when TRUE, columns are created for observed values (those from the target observations) as well as imputed values (those from the reference observations).

...

passed to other methods, currently not used.

Author

Nicholas L. Crookston ncrookston.fs@gmail.com
Andrew O. Finley finleya@msu.edu
Emilie Henderson emilie.henderson@oregonstate.edu

See Also

yai

Examples

library(yaImpute)

data(iris)

# form some test data: 50 randomly chosen reference observations
refs <- sample(rownames(iris), 50)
x <- iris[, 1:3]      # Sepal.Length Sepal.Width Petal.Length
y <- iris[refs, 4:5]  # Petal.Width Species

# build a yai object using mahalanobis
mal <- yai(x = x, y = y, method = "mahalanobis")

# output a data frame of observed and imputed values
# of all variables and observations.
impute(mal)

# impute every variable in iris via ancillaryData, then plot the result
malImp <- impute(mal, ancillaryData = iris)
plot(malImp)
