gknn is an implementation of the k-nearest neighbours algorithm making use of general distance measures. A formula interface is provided.
# S3 method for formula
gknn(formula, data = NULL, ..., subset, na.action = na.pass, scale = TRUE)
# S3 method for default
gknn(x, y, k = 1, method = NULL,
scale = TRUE, use_all = TRUE,
FUN = mean, ...)
# S3 method for gknn
predict(object, newdata,
type = c("class", "votes", "prob"),
...,
na.action = na.pass)
For gknn(), an object of class "gknn" containing the data and the specified parameters. For predict.gknn(), a vector of predictions, or a matrix with votes for all classes. In case of an overall class tie, the predicted class is chosen at random.
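The relation between the vote matrix, the class distribution, and the predicted label can be sketched in base R. This is only an illustration of the description above, not the package's internal code; the helper name summarise_neighbours is hypothetical.

```r
# Hypothetical helper (not part of e1071): summarise the labels of the
# selected neighbours as raw votes, class proportions, and a predicted class.
summarise_neighbours <- function(labels) {
  votes <- table(labels)                      # raw counts per class level
  prob  <- votes / sum(votes)                 # proportions among the neighbours
  top   <- names(votes)[votes == max(votes)]  # every class with the most votes
  # break an overall tie at random; a single winner is returned as is
  winner <- if (length(top) > 1) sample(top, 1) else top
  list(votes = votes, prob = prob, class = winner)
}

# three neighbours: one setosa, two virginica
nb <- factor(c("setosa", "virginica", "virginica"),
             levels = c("setosa", "versicolor", "virginica"))
res <- summarise_neighbours(nb)
res$votes
res$prob
res$class
```

Because the labels are a factor, classes without any vote still appear with a count of zero, which is why a vote matrix has one column per class.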
formula: a symbolic description of the model to be fit.
data: an optional data frame containing the variables in the model. By default the variables are taken from the environment from which gknn is called.
x: a data matrix.
y: a response vector with one label for each row/component of x. Can be either a factor (for classification tasks) or a numeric vector (for regression).
k: number of neighbours considered.
scale: a logical vector indicating the variables to be scaled. If scale is of length 1, the value is recycled as many times as needed. By default, numeric matrices are scaled to zero mean and unit variance. The center and scale values are returned and used for later predictions. Note that the default metric for data frames is the Gower metric, which standardizes the values to the unit interval.
method: argument passed to dist() from the proxy package to select the distance metric used: a function, or a mnemonic string referencing the distance measure. Defaults to "Euclidean" for numeric matrices, to "Jaccard" for logical matrices and to "Gower" for data frames.
use_all: controls handling of ties. If true, all neighbours whose distance equals the kth smallest distance are included, so more than k neighbours may be used. If false, a random selection among the distances tied with the kth is made so that exactly k neighbours are used.
FUN: function used to aggregate the k nearest target values in case of regression.
object: object of class gknn.
newdata: matrix or data frame with new instances.
type: character specifying the return type in case of class predictions: for "class", the class labels; for "prob", the class distribution for all k neighbours considered; for "votes", the raw counts.
...: additional parameters passed to dist().
subset: An index vector specifying the cases to be used in the training sample. (NOTE: If given, this argument must be named.)
na.action: A function to specify the action to be taken if NAs are found. The default action is na.pass. (NOTE: If given, this argument must be named.)
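The tie handling controlled by use_all can be illustrated with a small base R sketch. It assumes a plain vector of distances from one query point to the training rows; the helper name select_neighbours is hypothetical and the code is not taken from the package.

```r
# Hypothetical helper (not part of e1071): choose neighbour indices from a
# vector of distances `d`, mimicking the described use_all behaviour.
select_neighbours <- function(d, k, use_all = TRUE) {
  cutoff <- sort(d)[k]          # distance of the k-th nearest neighbour
  closer <- which(d < cutoff)   # strictly nearer than the cutoff
  tied   <- which(d == cutoff)  # exactly at the cutoff (exact comparison)
  if (use_all) return(c(closer, tied))   # keep every tied neighbour
  need <- k - length(closer)             # slots left to fill from the ties
  # index into `tied` so that a single tied index is not expanded by sample()
  c(closer, tied[sample.int(length(tied), need)])
}

d <- c(0.5, 1.0, 1.0, 1.0, 2.0)
all_ties  <- select_neighbours(d, k = 2)                   # may exceed k
exactly_k <- select_neighbours(d, k = 2, use_all = FALSE)  # always k indices
```

With these distances, three rows are tied for second place, so use_all = TRUE keeps four neighbours while use_all = FALSE keeps the nearest row plus one randomly chosen tied row.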
David Meyer (David.Meyer@R-project.org)
dist (in package proxy)
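library(e1071)
data(iris)
# 1-nearest-neighbour classifier (k defaults to 1)
model <- gknn(Species ~ ., data = iris)
predict(model, iris[c(1, 51, 101), ])
# hold out six rows per species as a test set
test <- c(45:50, 95:100, 145:150)
# 3 nearest neighbours with Manhattan distance: raw vote counts per class
model <- gknn(Species ~ ., data = iris[-test, ], k = 3, method = "Manhattan")
predict(model, iris[test, ], type = "votes")
# same fit: class distribution among the neighbours
model <- gknn(Species ~ ., data = iris[-test, ], k = 3, method = "Manhattan")
predict(model, iris[test, ], type = "prob")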
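As a possible follow-up to the examples above, the held-out rows can also be used to check the predicted labels against the true species. This sketch assumes the e1071 package is installed and uses only the calls shown on this page plus base R; the variable names pred and confusion are chosen here for illustration.

```r
library(e1071)
data(iris)

# same hold-out split as in the examples: six rows per species
test <- c(45:50, 95:100, 145:150)

# fit on the remaining rows, predict class labels (the default type)
model <- gknn(Species ~ ., data = iris[-test, ], k = 3, method = "Manhattan")
pred  <- predict(model, iris[test, ])

# cross-tabulate predictions against the true labels
confusion <- table(predicted = pred, actual = iris$Species[test])
confusion

# share of held-out rows classified correctly
mean(pred == iris$Species[test])
```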