knn.impute: Perform imputation of a data frame using k-NN.
Description
Perform imputation of missing data in a data frame using the k-Nearest Neighbour algorithm.
Missing values of discrete variables are replaced by the mode of the neighbours' values; missing values of continuous variables are replaced by their median.
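As an illustration of the rule above, the following base R sketch combines the values observed among a row's neighbours. The helpers `neighbour_mode` and `aggregate_neighbours` are hypothetical names introduced here, and the tie-breaking rule (the first value encountered wins) is an assumption rather than documented behaviour.

```r
# Hypothetical helper: most frequent value among the neighbours.
# Assumption: ties are broken in favour of the value seen first.
neighbour_mode <- function(x) {
  x <- x[!is.na(x)]
  values <- unique(x)
  values[which.max(tabulate(match(x, values)))]
}

# Hypothetical helper: combine the neighbours' values for one variable,
# using the mode for categorical variables and the median otherwise.
aggregate_neighbours <- function(x, categorical) {
  if (categorical) neighbour_mode(x) else median(x, na.rm = TRUE)
}

aggregate_neighbours(c(2, 3, 3, 1, 3), categorical = TRUE)             # mode: 3
aggregate_neighbours(c(0.5, 1.2, 9.0, 1.0, 1.4), categorical = FALSE)  # median: 1.2
```

The median is less sensitive than the mean to a single distant neighbour, such as the value 9.0 in the second call.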
Usage
knn.impute(data, k = 10, cat.var = 1:ncol(data), to.impute = 1:nrow(data), using = 1:nrow(data))
Arguments
data
a data frame
k
number of neighbours to be used. For categorical variables
the mode of the neighbours is used; for continuous variables
the median is used instead. Default: 10.
cat.var
vector containing the column indices of the variables to be
treated as categorical. Default: all variables.
to.impute
vector indicating which rows of the dataset are to be imputed.
Default: impute all rows.
using
vector indicating which rows of the dataset are to be used
to search for neighbours. Default: use all rows.
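To make the roles of the arguments concrete, here is a minimal base R sketch of how they might interact. It is not the documented implementation: `knn_impute_sketch`, `row_distance` and `stat_mode` are hypothetical names, the data are assumed to be numeric (categorical variables stored as numeric codes), and the distance (squared difference for continuous variables, 0/1 mismatch for categorical ones, averaged over the variables observed in both rows) is an assumption made only for this illustration.

```r
# Assumed distance between two rows, computed on variables observed in both.
row_distance <- function(a, b, is_cat) {
  ok <- !is.na(a) & !is.na(b)
  if (!any(ok)) return(Inf)
  d <- ifelse(is_cat[ok], as.numeric(a[ok] != b[ok]), (a[ok] - b[ok])^2)
  sqrt(mean(d))
}

# Most frequent value; ties go to the value seen first (assumption).
stat_mode <- function(x) {
  values <- unique(x)
  values[which.max(tabulate(match(x, values)))]
}

# Hypothetical sketch mirroring the argument list shown under Usage.
knn_impute_sketch <- function(data, k = 10, cat.var = 1:ncol(data),
                              to.impute = 1:nrow(data), using = 1:nrow(data)) {
  m <- as.matrix(data)                       # assumes numeric columns only
  is_cat <- seq_len(ncol(m)) %in% cat.var
  out <- m
  for (i in to.impute) {
    for (j in which(is.na(m[i, ]))) {
      # Candidate donors: rows in `using`, other than i, with variable j observed.
      donors <- setdiff(using[!is.na(m[using, j])], i)
      if (length(donors) == 0) next          # nothing to borrow from; leave NA
      d <- vapply(donors, function(r) row_distance(m[i, ], m[r, ], is_cat),
                  numeric(1))
      nearest <- donors[order(d)][seq_len(min(k, length(donors)))]
      vals <- m[nearest, j]
      out[i, j] <- if (is_cat[j]) stat_mode(vals) else median(vals)
    }
  }
  as.data.frame(out)
}

df <- data.frame(
  group = c(1, 1, 1, 2, 2, 2),               # categorical (column 1)
  value = c(1.0, 1.2, NA, 5.0, 5.4, NA)      # continuous (column 2)
)

# Impute every row, searching for neighbours among all rows.
imputed_all <- knn_impute_sketch(df, k = 2, cat.var = 1)

# Impute only row 3, searching for neighbours among rows 4 to 6.
imputed_subset <- knn_impute_sketch(df, k = 2, cat.var = 1,
                                    to.impute = 3, using = 4:6)
```

In the first call, row 3 borrows from rows 1 and 2, which share its `group`. In the second call only rows 4 and 5 are eligible donors, and row 6 keeps its missing value because it is not listed in `to.impute`.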