Generate the PMML representation of an svm object from the e1071 package.
# S3 method for svm
pmml(model, model.name = "LIBSVM_Model",
app.name = "R-PMML", description = "Support Vector Machine Model",
copyright = NULL, transforms = NULL, unknownValue = NULL,
dataset = NULL, ...)
model: an svm object from package e1071.
model.name: a name to be given to the model in the PMML code.
app.name: the name of the application that generated the PMML code.
description: a descriptive text for the Header element of the PMML code.
copyright: the copyright notice for the model.
transforms: data transformations represented in PMML via pmmlTransformations.
unknownValue: value to be used as the 'missingValueReplacement' attribute for all MiningFields.
dataset: required for one-classification only; data used to train the one-class SVM model.
...: further arguments passed to or from other methods.
PMML representation of the svm object.
The model is represented in the PMML SupportVectorMachineModel format.
Note that the sign of the coefficient of each support vector flips between the R object and the exported PMML file for classification and regression models. This is due to a minor difference in the training/scoring formula between the LIBSVM algorithm and the DMG specification. Hence the output value of each support vector machine under the DMG definition has the opposite sign to the value computed by the svm prediction function.
In a classification model, the sign flip in the output of the support vector machine does not affect the final predicted category. This is because the DMG definition places the winning category on the left side of threshold 0, while LIBSVM places it on the right side of threshold 0.
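The cancellation described above can be checked with a small base R sketch. The decision values and class names are made up for illustration and do not come from a fitted model; the helper functions `libsvm_winner` and `dmg_winner` are hypothetical and only encode the two threshold conventions as stated in the text.

```r
# Hypothetical LIBSVM-style decision values for a binary problem
# (illustrative numbers, not output from a real svm object).
libsvm_values <- c(1.3, -0.2, 0.7, -2.1)

# Per the note above, the exported PMML produces the same values
# with the sign flipped.
pmml_values <- -libsvm_values

# Assumed LIBSVM convention: the designated class wins on the right
# side of threshold 0.
libsvm_winner <- function(v, classes) ifelse(v > 0, classes[1], classes[2])

# Assumed DMG convention: the designated class wins on the left
# side of threshold 0.
dmg_winner <- function(v, classes) ifelse(v < 0, classes[1], classes[2])

classes <- c("setosa", "versicolor")
r_labels <- libsvm_winner(libsvm_values, classes)
pmml_labels <- dmg_winner(pmml_values, classes)

# Both conventions pick the same category for every observation.
identical(r_labels, pmml_labels)
```

Flipping the sign and flipping the side of the threshold are applied together, so the two effects cancel and the predicted labels agree.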
For a regression model, the exported PMML code has two OutputField elements. The OutputField predictedValue shows the support vector machine output per DMG definition. The OutputField svm_predict_function gives the value corresponding to the R predict function for the svm model. This output should be used when making model predictions.
For a one-classification svm (OCSVM) model, the PMML has three OutputField elements. The OutputField anomaly is a boolean value that conforms to the DMG definition of an anomaly detection model; this value is TRUE when an anomaly is detected. This value is the opposite of the prediction by the e1071 object, which predicts FALSE when an anomaly is detected; that is, the R svm model predicts whether an input is an inlier. The OutputField anomalyScore is the signed distance to the separating boundary; anomalyScore corresponds to the decision.values attribute of the output of the svm predict function in R.
For example, say that for an input observation, the R OCSVM model predicts a positive decision value of 0.4 and a label of TRUE. According to the R object, this means that the observation is an inlier. The PMML export of this model will give the following for the same input: anomalyScore = 0.4, anomaly = "false". According to the PMML, the observation is not an anomaly. Note that there is no sign flip between R and PMML for OCSVM models.
To export an OCSVM model, an additional argument, dataset, is required by the function. This argument expects a data frame containing the data that was used to train the model. This is necessary because, for a one-class svm, the R svm object does not contain information about the data types of the features used to train the model. The exporter does not yet support the formula interface for one-classification models, so the default S3 method must be used to train the SVM. The data used to train the one-class SVM must be numeric and not of integer class.
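A minimal sketch of preparing training data that meets these requirements is shown here. The data frame `train` and its column names are made up for illustration; the calls to svm and pmml mirror the one-classification example on this page and are guarded so that the sketch still runs when the packages are not installed.

```r
# Toy training data with one integer column (made up for illustration).
train <- data.frame(
  width = c(3.5, 3.0, 3.2, 3.1, 3.6, 3.9),
  count = c(2L, 4L, 3L, 5L, 4L, 6L)
)

# Coerce every column to double so that no column is of integer class.
train[] <- lapply(train, as.numeric)
stopifnot(all(vapply(train, is.double, logical(1))))

# Train with the default (non-formula) interface, then pass the same
# data frame as `dataset` when exporting.
if (requireNamespace("e1071", quietly = TRUE) &&
    requireNamespace("pmml", quietly = TRUE)) {
  fit <- e1071::svm(train, y = NULL, type = "one-classification")
  ocsvm_pmml <- pmml::pmml(fit, dataset = train)
}
```

Converting the columns before training, rather than only before export, keeps the data seen by svm and the data passed as dataset identical.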
Anomaly detection SVM models are not yet supported by DMG PMML schema version 4.3. The PMML produced by this exporter uses an extended schema (4.3Ext), and can be consumed by Zementis products.
* R project CRAN package: e1071: Misc Functions of the Department of Statistics, Probability Theory Group (Formerly: E1071), TU Wien https://CRAN.R-project.org/package=e1071
* Chang, Chih-Chung and Lin, Chih-Jen, LIBSVM: a library for Support Vector Machines http://www.csie.ntu.edu.tw/~cjlin/libsvm
# NOT RUN {
library(e1071)
library(pmml)
data(iris)
# Classification with a polynomial kernel
fit <- svm(Species ~ ., data = iris, kernel = "polynomial")
pmml(fit)
# Regression
fit <- svm(Sepal.Length ~ Sepal.Width + Petal.Length + Petal.Width, data = iris)
pmml(fit)
# Anomaly detection with one-classification; the default (non-formula)
# interface is used and the training data is passed as dataset
fit <- svm(iris[, 1:4], y = NULL, type = "one-classification")
pmml(fit, dataset = iris[, 1:4])
# }