kNN performs k-nearest neighbour classification of a test set using a training set. For each row of the test set, the k nearest training set vectors (by Euclidean distance) are found, and the class is decided by majority vote, with ties broken at random. This function provides a formula interface to the knn function of the R package class. In addition, it allows normalization of the given data using the transform function.
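The procedure described above can be sketched in a few lines of base R. This is only an illustration of the technique (Euclidean distance, then a majority vote with random tie-breaking), not the code used by kNN or by knn in package class; the helper name knn_sketch and the toy data are hypothetical.

```r
# Minimal base-R sketch of k-nearest-neighbour classification.
# `knn_sketch` is a hypothetical helper, not part of any package.
knn_sketch <- function(train_x, train_y, test_x, k = 1) {
  train_x <- as.matrix(train_x)
  test_x  <- as.matrix(test_x)
  train_y <- as.factor(train_y)

  pred <- apply(test_x, 1, function(row) {
    # Euclidean distance from this test row to every training row
    d <- sqrt(rowSums(sweep(train_x, 2, row)^2))
    # class labels of the k closest training rows
    nearest <- train_y[order(d)[seq_len(k)]]
    votes   <- table(nearest)
    winners <- names(votes)[votes == max(votes)]
    # break vote ties at random
    if (length(winners) > 1) sample(winners, 1) else winners
  })

  factor(pred, levels = levels(train_y))
}

# Toy data (made up for illustration): two well-separated groups
train_x <- matrix(c(1, 1,  1, 2,  2, 1,
                    8, 8,  8, 9,  9, 8), ncol = 2, byrow = TRUE)
train_y <- c("low", "low", "low", "high", "high", "high")
test_x  <- matrix(c(1.5, 1.5,
                    8.5, 8.5), ncol = 2, byrow = TRUE)

pred <- knn_sketch(train_x, train_y, test_x, k = 3)
pred
```

Each test point sits in the middle of one group, so all three of its nearest neighbours carry the same label and no tie-breaking is needed.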
kNN( formula, train, test, k = 1, transform = FALSE, l = 0, prob = FALSE,
use.all = TRUE, na.rm = FALSE )
formula: a formula, with a response but no interaction terms. For the case of a data frame, it is taken as the model frame (see model.frame).
train: data frame or matrix of train set cases.
test: data frame or matrix of test set cases.
k: number of neighbours considered.
transform: a character with options FALSE (default), "minmax", and "zscore". Option FALSE means no transformation. The other two options allow the user to run the kNN algorithm on a normalized version of the train and test sets.
l: minimum vote for definite decision, otherwise doubt. (More precisely, less than k-l dissenting votes are allowed, even if k is increased by ties.)
prob: if this is true, the proportion of the votes for the winning class is returned as attribute prob.
use.all: controls handling of ties. If true, all distances equal to the kth largest are included. If false, a random selection of distances equal to the kth is chosen to use exactly k neighbours.
na.rm: a logical value indicating whether NA values in the data should be stripped before the computation proceeds.
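To make the "minmax" and "zscore" options concrete, here is a rough sketch of the two normalizations written in base R. The helper names minmax_scale and zscore_scale are hypothetical, and scaling the test values with statistics taken from the train set is an assumption made for illustration; this page does not state how kNN computes the transformation internally.

```r
# Hypothetical helpers illustrating the two normalizations.
# `ref` supplies the values used to compute the scaling statistics
# (assumed here to be the train set).
minmax_scale <- function(x, ref = x) (x - min(ref)) / (max(ref) - min(ref))
zscore_scale <- function(x, ref = x) (x - mean(ref)) / sd(ref)

# Made-up values for a single predictor
income_train <- c(20, 40, 60, 100)
income_test  <- c(30, 80)

# Min-max: train values are mapped onto [0, 1]
mm_train <- minmax_scale(income_train)                      # 0, 0.25, 0.5, 1
mm_test  <- minmax_scale(income_test, ref = income_train)   # 0.125, 0.75

# Z-score: train values get mean 0 and standard deviation 1
z_train <- zscore_scale(income_train)
z_test  <- zscore_scale(income_test, ref = income_train)
```

Normalization matters for kNN because Euclidean distance is dominated by whichever predictor has the largest numeric range.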
A factor of classifications for the test set, in which doubt is returned as NA; basically, the return value is the same as in the knn function of the R package class.
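Since the return value follows knn from package class, the effect of l and prob on the result can be illustrated by calling that function directly. The sketch below assumes package class is installed (it normally ships with R) and uses made-up one-dimensional data.

```r
library(class)  # provides knn(); assumed to be installed

# Made-up training data: three cases of class "a", three of class "b"
train <- matrix(c(1, 2, 3, 10, 11, 12), ncol = 1)
cl    <- factor(c("a", "a", "a", "b", "b", "b"))

# Two test cases: one deep inside class "a", one between the groups
test  <- matrix(c(2, 6.4), ncol = 1)

# k = 3 neighbours, and l = 3 votes required for a definite decision
pred <- knn(train, test, cl, k = 3, l = 3, prob = TRUE)

pred                # first case is "a"; second is NA (doubt: only 2 of 3 votes agree)
attr(pred, "prob")  # vote proportions, returned because prob = TRUE
```

With l = 0 (the default), the second case would be assigned to the class holding the most votes instead of being reported as doubt.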
Ripley, B. D. (1996) Pattern Recognition and Neural Networks. Cambridge.
Venables, W. N. and Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth edition. Springer.
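# Assumes the package that provides kNN() and the risk data set (liver) is installed.
library( liver )

data( risk )

# Use the first 100 rows for training and row 101 as a single test case.
train <- risk[ 1:100, ]
test  <- risk[ 101, ]

# Classify the test case with the default k = 1 and no normalization.
kNN( risk ~ income + age, train = train, test = test )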