emil (version 1.1-6)

pre.impute.knn: kNN imputation

Description

Nearest neighbor methods need a distance matrix of the dataset they work on. When fitting models repeatedly on subsets of the entire dataset, recalculating it every time is unnecessary. This function therefore requires the user to calculate it manually prior to resampling and supply it in a wrapper function.

Usage

pre.impute.knn(x, y, fold, k = 0.05, distmat)

Arguments

x
Dataset.
y
Response vector.
fold
A logical vector with FALSE for fitting observations, TRUE for test observations and NA for observations not to be included.
k
Number of nearest neighbors to calculate the mean from. Set to a value < 1 to specify a fraction.
distmat
Distance matrix. A matrix, dist object or "auto". Notice that "auto" will recalculate the distance matrix in each fold, which is only meaningful in case the features of x<

Examples

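library(emil)

# Introduce missing values into one feature
x <- iris[-5]
x[sample(nrow(x), 30), 3] <- NA
# Calculate the distance matrix once, before resampling
my.dist <- dist(x)
evaluate.modeling(modeling.procedure("lda"), x=x, y=iris$Species,
    pre.process=function(...) pre.impute.knn(..., k=4, distmat=my.dist))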
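
To make the idea in the Description concrete, the following base R sketch shows one way a precalculated distance matrix could be reused to impute a feature from the k nearest fitting observations. It is an illustration only and is not taken from emil's source: the helper name impute_column and its arguments are hypothetical, and the real pre.impute.knn may differ in how it selects neighbors, handles ties, or interprets a fractional k.

```r
# Illustrative sketch in base R; NOT emil's internal implementation.
set.seed(1)
x <- iris[-5]
x[sample(nrow(x), 30), 3] <- NA

# Calculated once, up front. dist() leaves out missing coordinates and
# rescales the sum, so every pairwise distance is defined here.
d <- as.matrix(dist(x))

# Hypothetical helper: fill NAs in one column with the mean of the k nearest
# fitting observations (fold == FALSE) that have an observed value there.
impute_column <- function(x, d, fold, col, k = 4) {
  fit <- which(!is.na(fold) & !fold)
  donors <- fit[!is.na(x[fit, col])]
  # Rows with fold == NA are excluded and therefore left untouched
  for (i in which(!is.na(fold) & is.na(x[, col]))) {
    nearest <- donors[order(d[i, donors])[seq_len(min(k, length(donors)))]]
    x[i, col] <- mean(x[nearest, col])
  }
  x
}

# One example fold: odd rows are fitting observations, even rows are test
fold <- rep(c(FALSE, TRUE), length.out = nrow(x))
x.imputed <- impute_column(x, d, fold, col = 3, k = 4)
```

Because d is indexed rather than recomputed, the same matrix can serve every fold of a resampling scheme; only the fold vector changes between calls.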