Compute model performance statistics for a fitted Weka classifier.
evaluate_Weka_classifier(object, newdata = NULL, cost = NULL,
                         numFolds = 0, complexity = FALSE,
                         class = FALSE, seed = NULL, ...)
object: a Weka_classifier object.
newdata: an optional data frame in which to look for variables with which to evaluate. If omitted or NULL, the training instances are used.
cost: a square matrix of (mis)classification costs.
numFolds: the number of folds to use in cross-validation.
complexity: option to include entropy-based statistics.
class: option to include class statistics.
seed: optional seed for cross-validation.
...: further arguments passed to other methods (see details).
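As a hedged illustration of the newdata argument, the sketch below fits a J48 tree on part of the built-in iris data and evaluates it once on the training instances and once on held-out rows. The split, the seed, and the variable names (idx, train, test, e_train, e_test) are illustrative assumptions, not part of the package.

```r
library(RWeka)

## Hypothetical train/test split of the built-in iris data.
set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

## Fit a decision tree on the training rows only.
m <- J48(Species ~ ., data = train)

## newdata omitted: statistics are computed on the training instances.
e_train <- evaluate_Weka_classifier(m)

## newdata supplied: statistics are computed on the held-out rows,
## which must contain the class variable.
e_test <- evaluate_Weka_classifier(m, newdata = test)

e_train$details
e_test$details
```

Comparing the two sets of base statistics gives a rough idea of how optimistic the training-data estimate is.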
An object of class Weka_classifier_evaluation, a list of the following components:
string: character, concatenation of the string representations of the performance statistics.
details: vector, base statistics, e.g., the percentage of instances correctly classified, etc.
detailsComplexity: vector, entropy-based statistics (if selected).
detailsClass: matrix, class statistics, e.g., the true positive rate, etc., for each level of the response variable (if selected).
confusionMatrix: table, cross-classification of true and predicted classes.
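To make the relationship between the cross-classification table and the reported statistics concrete, here is a minimal base R sketch. The counts are made up, and the layout (rows index true classes, columns index predicted classes) is an assumption of this example; check the dimnames of the table returned by the function before relying on it.

```r
## Hypothetical 3-class cross-classification (made-up counts).
## Assumed layout: rows = true classes, columns = predicted classes.
lev <- c("a", "b", "c")
cm <- as.table(matrix(c(50,  0,  0,
                         0, 47,  3,
                         0,  2, 48),
                      nrow = 3, byrow = TRUE,
                      dimnames = list(true = lev, predicted = lev)))

## Percentage of instances correctly classified: diagonal over total.
pct_correct <- 100 * sum(diag(cm)) / sum(cm)

## True positive rate per class: correct predictions over row totals.
tp_rate <- diag(cm) / rowSums(cm)

pct_correct
tp_rate
```

With the opposite layout (rows as predictions), the per-class rate would use colSums instead of rowSums.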
The function computes and extracts a non-redundant set of performance statistics that is suitable for model interpretation. By default the statistics are computed on the training data.
Currently, argument ... only supports the logical variable normalize, which tells Weka to normalize the cost matrix so that the cost of a correct classification is zero.
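The effect of normalizing a cost matrix can be sketched in base R. This example assumes that normalization subtracts each column's diagonal entry from that column; that is one plausible reading of "the cost of a correct classification is zero", and Weka's exact procedure may differ. The matrix values are made up.

```r
## Hypothetical 2 x 2 cost matrix with nonzero costs on the diagonal.
cost <- matrix(c(1, 3,
                 5, 2),
               ncol = 2, byrow = TRUE)

## Assumed normalization: subtract each column's diagonal entry from
## every entry in that column, so that the diagonal becomes zero.
normalized <- sweep(cost, 2, diag(cost))

normalized
diag(normalized)
```

In an actual evaluation the flag would be passed through the ... argument, e.g. evaluate_Weka_classifier(m, cost = cost, normalize = TRUE), where m is a fitted classifier.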
Note that if the class variable is numeric, only a subset of the statistics is available. Arguments complexity and class are then not applicable and are therefore ignored.
I. H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.
library(RWeka)

## Use some example data.
w <- read.arff(system.file("arff", "weather.nominal.arff",
                           package = "RWeka"))

## Identify a decision tree.
m <- J48(play ~ ., data = w)
m

## Use 10-fold cross-validation.
e <- evaluate_Weka_classifier(m,
                              cost = matrix(c(0, 2, 1, 0), ncol = 2),
                              numFolds = 10, complexity = TRUE,
                              seed = 123, class = TRUE)
e
summary(e)
e$details