knn.predict

Description

K-nearest-neighbor prediction for a set of test cases, using a distance matrix such as that produced by knn.dist.

Usage

knn.predict(train, test, y, dist.matrix, k = 1,
            agg.meth = if (is.factor(y)) "majority" else "mean",
            ties.meth = "min")

Arguments

train        row indexes of the data passed to knn.dist to use as training set
test         row indexes of the data passed to knn.dist to use as test set
y            responses of the training cases
dist.matrix  a distance matrix, typically the output of knn.dist
k            the number of nearest neighbors to consider
agg.meth     method used to aggregate the responses of the k nearest neighbors
ties.meth    method of handling ties for the kth neighbor; the default is "min", which uses all ties; alternatives include "max", which uses none if there are ties for the k-th nearest neighbor, "random", which selects among the ties randomly, and "first", which uses the ties in their order in the data

Details

k may be specified to be any positive integer less than the number of training cases, but is generally between 1 and 10.

The indexes for the training and test cases are in reference to the order of the entire data set as it was passed to knn.dist.
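For instance, the following minimal sketch (made-up data; the object names are illustrative) shows that train and test index rows of the full matrix given to knn.dist:

#illustrative only: indexes refer to rows of the full data given to knn.dist
x <- matrix(rnorm(20), ncol = 2)        # 10 cases, 2 features
y <- factor(rep(c("a", "b"), each = 5)) # responses for all 10 cases
d <- knn.dist(x)                        # distances among all 10 rows
train <- 1:7                            # rows of x treated as training cases
test <- 8:10                            # rows of x to predict
knn.predict(train, test, y, d, k = 3)   # predictions for rows 8, 9 and 10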
The aggregation may be any named function. By default, classification (factored responses) will use the "majority" class function and non-factored responses will use "mean". Other options to consider include "min", "max", and "median".
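As a sketch of a non-default aggregation (made-up continuous data; the object names are illustrative), a numeric response can be combined with "median" instead of the default "mean":

#continuous responses aggregated with median rather than the default mean
x <- matrix(rnorm(60), ncol = 2)    # 30 cases, 2 features
y <- x[, 1] + rnorm(30, sd = 0.1)   # numeric response
d <- knn.dist(x)
preds <- knn.predict(1:20, 21:30, y, d, k = 5, agg.meth = "median")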
The ties are handled using the rank function. Further information may be found by examining the ties.method argument there.
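A small base R illustration (not part of this package) of how the ties.method values of rank behave when two distances are tied at the k-th position:

#two neighbors tied at distance 0.7; suppose k = 2
dists <- c(0.4, 0.7, 0.7, 0.9)
rank(dists, ties.method = "min")    # 1 2 2 4 -> both ties fall within k = 2
rank(dists, ties.method = "max")    # 1 3 3 4 -> neither tie falls within k = 2
rank(dists, ties.method = "first")  # 1 2 3 4 -> ties taken in data order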
See Also

knn.dist, dist

Examples
#a quick classification example
x1 <- c(rnorm(20, mean=1), rnorm(20, mean=5))
x2 <- c(rnorm(20, mean=5), rnorm(20, mean=1))
y <- rep(1:2, each=20)
x <- cbind(x1,x2)
train <- sample(1:40, 30)
#plot the training cases
plot(x1[train], x2[train], col=y[train]+1)
#predict the other cases
test <- (1:40)[-train]
kdist <- knn.dist(x)
preds <- knn.predict(train, test, y, kdist, k=3, agg.meth="majority")
#add the predictions to the plot
points(x1[test], x2[test], col=as.integer(preds)+1, pch="+")
#display the confusion matrix
table(y[test], preds)
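As a quick follow-up using plain base R (not part of the package), the overall test accuracy can be read off the same objects:

#proportion of test cases classified correctly
mean(as.integer(preds) == y[test])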