Impute missing values in a data frame using the k-Nearest Neighbour algorithm. For discrete variables the imputed value is the mode of the neighbours; for continuous variables it is their median.
knn.impute(data, k = 10, cat.var = 1:ncol(data), to.impute = 1:nrow(data), using = 1:nrow(data))
data: a data frame.
k: number of neighbours to be used; for categorical variables the mode of the neighbours is used, for continuous variables the median value is used instead. Default: 10.
cat.var: vector containing the indices of the variables to be considered as categorical. Default: all variables.
to.impute: vector indicating which rows of the dataset are to be imputed. Default: impute all rows.
using: vector indicating which rows of the dataset are to be used to search for neighbours. Default: use all rows.
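The row-selection arguments take plain index vectors, so they can be built with base R before the call. The following is a minimal sketch, not taken from the package itself: the toy data and the names `rows_with_na` and `complete_rows` are illustrative assumptions, and the call to knn.impute is guarded so that the snippet also runs in a session where the function is not loaded.

```r
# Toy data frame (hypothetical): column 1 is categorical (integer codes),
# column 2 is continuous.
data <- data.frame(
  group = c(1, 2, NA, 1, 2),
  score = c(0.5, NA, 1.5, 2.0, 2.5)
)

# Rows with at least one missing value: the only rows that need imputing.
rows_with_na <- which(!complete.cases(data))

# Fully observed rows: one conservative choice for the neighbour pool.
complete_rows <- which(complete.cases(data))

# Call knn.impute only if it is available in this session.
if (exists("knn.impute")) {
  imputed <- knn.impute(data, k = 2, cat.var = 1,
                        to.impute = rows_with_na, using = complete_rows)
}
```

Restricting the neighbour pool to complete rows is only one option; the default searches every row.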
imputed data frame.
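To make the mode/median rule concrete, here is a simplified, self-contained sketch of k-nearest-neighbour imputation in base R. It is not the package's implementation: `knn_impute_sketch`, `stat_mode` and `mixed_dist` are hypothetical names, and the distance (0/1 mismatch for categorical columns, range-scaled absolute difference for continuous ones, maximal distance when a value is missing) is an assumption, since the distance is not specified here. Choosing neighbours among rows that observe the missing column is likewise an assumption.

```r
# Most frequent non-missing value; ties go to the value seen first.
stat_mode <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA)
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# ASSUMED mixed-type distance between two rows (not specified by the source):
# 0/1 mismatch for categorical columns, range-scaled absolute difference for
# continuous ones; any comparison involving NA counts as distance 1.
mixed_dist <- function(a, b, is_cat, ranges) {
  d <- ifelse(is_cat, as.numeric(a != b), abs(a - b) / ranges)
  d[is.na(d)] <- 1
  sqrt(sum(d^2))
}

knn_impute_sketch <- function(data, k = 10, cat.var = 1:ncol(data),
                              to.impute = 1:nrow(data), using = 1:nrow(data)) {
  m <- as.matrix(data)  # assumes every column holds numeric codes or values
  is_cat <- seq_len(ncol(m)) %in% cat.var
  ranges <- apply(m, 2, function(col) diff(range(col, na.rm = TRUE)))
  ranges[!is.finite(ranges) | ranges == 0] <- 1  # avoid division by zero
  out <- m
  for (i in to.impute) {
    missing_cols <- which(is.na(m[i, ]))
    if (length(missing_cols) == 0) next
    candidates <- setdiff(using, i)
    dists <- vapply(candidates,
                    function(j) mixed_dist(m[i, ], m[j, ], is_cat, ranges),
                    numeric(1))
    ordered <- candidates[order(dists)]
    for (j in missing_cols) {
      # ASSUMPTION: use the k nearest candidates that observe column j.
      donors <- head(ordered[!is.na(m[ordered, j])], k)
      if (length(donors) == 0) next
      vals <- m[donors, j]
      # Mode for categorical columns, median for continuous ones.
      out[i, j] <- if (is_cat[j]) stat_mode(vals) else median(vals)
    }
  }
  as.data.frame(out)
}

# Toy data: 'group' is categorical (column 1), 'score' is continuous.
toy <- data.frame(
  group = c(1, 1, 2, 2, NA, 1),
  score = c(1.0, 1.2, 5.0, 5.4, 5.2, NA)
)
imputed <- knn_impute_sketch(toy, k = 3, cat.var = 1)
# Row 5 receives the mode of its neighbours' groups (2);
# row 6 receives the median of its neighbours' scores (1.2).
```

The sketch always reads donor values from the original data, so an imputed value is never reused to impute another row; whether the real function behaves the same way is not stated here.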