pmml.neighbr: Generate PMML for a neighbr object from the neighbr package.

Description

Generate PMML for a neighbr object from the neighbr package.

Usage

# S3 method for neighbr
pmml(model, model.name = "kNN_model",
  app.name = "Rattle/PMML", description = "K Nearest Neighbors Model",
  copyright = NULL, transforms = NULL, unknownValue = NULL, ...)

Arguments

model

a neighbr object.

model.name

a name to be given to the model in the PMML code.

app.name

the name of the application that generated the PMML code.

description

a descriptive text for the Header element of the PMML code.

copyright

the copyright notice for the model.

transforms

data transformations represented in PMML via pmmlTransformations.

unknownValue

value to be used as the 'missingValueReplacement' attribute for all MiningFields.

...

further arguments passed to or from other methods.

Value

PMML representation of the neighbr object.

Details

The model is represented in the PMML NearestNeighborModel format.

The current version of this converter does not support transformations, so transforms must be left as NULL. It always sets categoricalScoringMethod to "majorityVote", continuousScoringMethod to "average", and isTransformed to "false".
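To make the two fixed scoring methods concrete, the following is a minimal base-R sketch of what "majorityVote" and "average" mean for a single query row. It is an illustration only, not the code used by neighbr or by this converter; the variable names (query, nn, predicted_class, predicted_width) and the choice of squared Euclidean distance on three iris features are assumptions made for the example.

```r
# Sketch only: illustrates the scoring methods named above, not neighbr's implementation.
data(iris)

features <- c("Sepal.Length", "Sepal.Width", "Petal.Length")
train <- iris[1:140, ]
query <- unlist(iris[141, features])  # one query row as a named numeric vector
k <- 3

# squared Euclidean distance from the query to every training row
diffs <- sweep(as.matrix(train[, features]), 2, query)
d <- rowSums(diffs^2)

# row positions of the k nearest training instances
nn <- order(d)[seq_len(k)]

# "majorityVote": the most frequent class among the k neighbors
votes <- table(as.character(train$Species[nn]))
predicted_class <- names(votes)[which.max(votes)]

# "average": the mean of the neighbors' continuous target values
predicted_width <- mean(train$Petal.Width[nn])
```

In this sketch, ties in the vote are resolved by which.max, which returns the first maximum; how a PMML consumer breaks ties may differ.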

Examples

# NOT RUN {
# continuous features with continuous target, categorical target,
# and neighbor ranking

library(pmml)     # provides pmml() and the houseVotes84 data
library(neighbr)  # provides knn()
data(iris)

# add an ID column to the data for neighbor ranking
iris$ID <- c(1:150)

# train set contains all predicted variables, features, and ID column
train_set <- iris[1:140,]

# omit the predicted variables (Petal.Width, Species) and the ID column from the test set
test_set <- iris[141:150,-c(4,5,6)]

fit <- knn(train_set=train_set,test_set=test_set,
           k=3,
           categorical_target="Species",
           continuous_target= "Petal.Width",
           comparison_measure="squared_euclidean",
           return_ranked_neighbors=3,
           id="ID")

pmml(fit)


# logical features with categorical target and neighbor ranking

library(neighbr)
data("houseVotes84")

# remove any rows with NA (missing) values
dat <- houseVotes84[complete.cases(houseVotes84),]

# recode the factor levels of every feature: "n" -> 0, "y" -> 1
feature_names <- names(dat)[!names(dat) %in% c("Class","ID")]
for (n in feature_names) {
  levels(dat[,n])[levels(dat[,n])=="n"] <- 0
  levels(dat[,n])[levels(dat[,n])=="y"] <- 1
}

# change factors to numeric
for (n in feature_names) {dat[,n] <- as.numeric(levels(dat[,n]))[dat[,n]]}

# add an ID column for neighbor ranking
dat$ID <- c(1:nrow(dat))

# train set contains features, predicted variable, and ID
train_set <- dat[1:225,]

# test set contains features only
test_set <- dat[226:232,!names(dat) %in% c("Class","ID")]

fit <- knn(train_set=train_set,test_set=test_set,
           k=5,
           categorical_target = "Class",
           comparison_measure="jaccard",
           return_ranked_neighbors=3,
           id="ID")

pmml(fit)
# }