pmml.svm: Generate the PMML representation of an svm object from the e1071 package.

Description

Generate the PMML representation of an svm object from the e1071 package.

Usage

# S3 method for svm
pmml(model, model.name = "LIBSVM_Model",
  app.name = "R-PMML", description = "Support Vector Machine Model",
  copyright = NULL, transforms = NULL, unknownValue = NULL,
  dataset = NULL, ...)

Arguments

model

an svm object from package e1071.

model.name

a name to be given to the model in the PMML code.

app.name

the name of the application that generated the PMML code.

description

a descriptive text for the Header element of the PMML code.

copyright

the copyright notice for the model.

transforms

data transformations represented in PMML via pmmlTransformations.

unknownValue

value to be used as the 'missingValueReplacement' attribute for all MiningFields.

dataset

required for one-classification only; the data used to train the one-class SVM model.

...

further arguments passed to or from other methods.

Value

PMML representation of the svm object.

Details

The model is represented in the PMML SupportVectorMachineModel format.

Note that the sign of the coefficient of each support vector flips between the R object and the exported PMML file for classification and regression models. This is due to the minor difference in the training/scoring formula between the LIBSVM algorithm and the DMG specification. Hence the output value of each support vector machine has a sign flip between the DMG definition and the svm prediction function.

In a classification model, the sign flip in the output of the support vector machine does not affect the final predicted category. This is because the DMG definition takes the winning category from the left side of threshold 0, while LIBSVM takes it from the right side of threshold 0.
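The sign convention can be inspected on the R side. The following is a minimal sketch, assuming a two-class problem, a linear kernel and scale = FALSE so that the decision value can be recomputed by hand from the coefs, SV and rho components of the svm object. The name pmml_style_coefs is hypothetical and only illustrates the flip described above; it is not output of the exporter.

```r
library(e1071)
data(iris)

# Two-class subset, unscaled linear kernel (assumptions made for this sketch)
two <- droplevels(iris[iris$Species != "setosa", ])
x <- as.matrix(two[, 1:4])
fit <- svm(x, two$Species, kernel = "linear", scale = FALSE)

# Decision values as reported by the R predict function
pred <- predict(fit, x, decision.values = TRUE)
dv <- attr(pred, "decision.values")

# The same values recomputed from the support vectors and their coefficients
manual <- x %*% t(fit$SV) %*% fit$coefs - fit$rho

# Illustration only: coefficients with the opposite sign, as the note above
# says they would appear in the exported PMML
pmml_style_coefs <- -fit$coefs
```

Comparing manual with dv shows how the R decision value is built from the coefficients, which is the quantity whose sign differs under the DMG convention.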

For a regression model, the exported PMML code has two OutputField elements. The OutputField predictedValue shows the support vector machine output per the DMG definition. The OutputField svm_predict_function gives the value corresponding to the R predict function for the svm model. Use svm_predict_function, not predictedValue, when making model predictions.
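When validating an exported regression model, the R predictions are the reference for the svm_predict_function output. A minimal sketch, assuming the iris regression from the Examples section; the variable name r_pred is illustrative:

```r
library(e1071)
data(iris)

fit <- svm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)

# Reference values from R; compare these with the svm_predict_function
# OutputField of the scored PMML, not with predictedValue
r_pred <- predict(fit, iris)
head(r_pred)
```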

For a one-classification svm (OCSVM) model, the PMML has three OutputField elements. The OutputField anomaly is a boolean value that conforms to the DMG definition of an anomaly detection model; this value is TRUE when an anomaly is detected. This value is the opposite of the prediction by the e1071 object, which predicts FALSE when an anomaly is detected; that is, the R svm model predicts whether an input is an inlier. The OutputField anomalyScore is the signed distance to the separating boundary; anomalyScore corresponds to the decision.values attribute of the output of the svm predict function in R.

For example, suppose that for a given observation, the R OCSVM model predicts a positive decision value of 0.4 and a label of TRUE. According to the R object, this means that the observation is an inlier. The PMML export of this model will give the following for the same input: anomalyScore = 0.4, anomaly = "false". According to the PMML, the observation is not an anomaly. Note that there is no sign flip between R and PMML for OCSVM models.
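The relationship between the R prediction and the two PMML outputs described above can be reproduced on the R side. This is a minimal sketch, assuming the iris one-classification model from the Examples section; the names r_inlier, score and anomaly are illustrative and only mirror the PMML fields, they are not produced by the exporter.

```r
library(e1071)
data(iris)

x <- iris[, 1:4]
fit <- svm(x, y = NULL, type = "one-classification")

# R predicts TRUE for inliers; decision values are attached as an attribute
r_inlier <- predict(fit, x, decision.values = TRUE)
score <- attr(r_inlier, "decision.values")

# Per the description above, the PMML 'anomaly' output is the logical
# opposite of the R prediction, while the score keeps its sign
anomaly <- !as.vector(r_inlier)

head(data.frame(score = as.vector(score), r_inlier = as.vector(r_inlier), anomaly))
```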

To export an OCSVM model, the function requires an additional argument, dataset. This argument expects a data frame containing the data that was used to train the model. This is necessary because, for a one-class svm, the R svm object does not contain information about the data types of the features used to train the model. The exporter does not yet support the formula interface for one-classification models, so the default S3 method must be used to train the SVM. The data used to train the one-class SVM must be numeric and not of integer class.
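The requirements above can be met with a short preparation step before training. A minimal sketch, assuming the training data is a data frame of numeric features; the coercion with as.numeric is one possible way to avoid integer columns, and the requireNamespace guard is only there so the sketch also runs where the pmml package is not installed.

```r
library(e1071)
data(iris)

train <- iris[, 1:4]

# Coerce every column to double so that no feature is of integer class
train[] <- lapply(train, as.numeric)
stopifnot(all(vapply(train, is.double, logical(1))))

# Train with the default (non-formula) interface, as required for export
fit <- svm(train, y = NULL, type = "one-classification")

# Pass the same training data to the exporter through 'dataset'
if (requireNamespace("pmml", quietly = TRUE)) {
  ocsvm_pmml <- pmml::pmml(fit, dataset = train)
}
```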

Anomaly detection SVM models are not yet supported by DMG PMML schema version 4.3. The PMML produced by this exporter uses an extended schema (4.3Ext), and can be consumed by Zementis products.

References

* R project CRAN package: e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien https://CRAN.R-project.org/package=e1071

* Chang, Chih-Chung and Lin, Chih-Jen, LIBSVM: a library for Support Vector Machines http://www.csie.ntu.edu.tw/~cjlin/libsvm

Examples

# NOT RUN {
library(pmml)
library(e1071)
data(iris)

# Classification with a polynomial kernel
fit <- svm(Species ~ ., data=iris, kernel="polynomial")
pmml(fit)

# Regression
fit <- svm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width,data=iris)
pmml(fit)

# Anomaly detection with one-classification
fit <- svm(iris[,1:4],y=NULL,type='one-classification')
pmml(fit,dataset=iris[,1:4])

# }