RWeka (version 0.4-40)

evaluate_Weka_classifier: Model Statistics for R/Weka Classifiers

Description

Compute model performance statistics for a fitted Weka classifier.

Usage

evaluate_Weka_classifier(object, newdata = NULL, cost = NULL, 
                         numFolds = 0, complexity = FALSE,
                         class = FALSE, seed = NULL, ...)

Arguments

object

a Weka_classifier object.

newdata

an optional data frame in which to look for variables with which to evaluate. If omitted or NULL, the training instances are used.

cost

a square matrix of (mis)classification costs.

numFolds

the number of folds to use in cross-validation.

complexity

option to include entropy-based statistics.

class

option to include class statistics.

seed

optional seed for cross-validation.

...

further arguments passed to other methods (see Details).

Value

An object of class Weka_classifier_evaluation, a list of the following components:

string

character, concatenation of the string representations of the performance statistics.

details

vector, base statistics, e.g., the percentage of instances correctly classified, etc.

detailsComplexity

vector, entropy-based statistics (if selected).

detailsClass

matrix, class statistics, e.g., the true positive rate, etc., for each level of the response variable (if selected).

confusionMatrix

table, cross-classification of true and predicted classes.
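
To make the relationship between a cross-classification table and the percentage of correctly classified instances concrete, here is a minimal base R sketch. It does not call RWeka: the vectors truth and predicted are made-up illustration data, and the layout (true classes in rows, predicted classes in columns) is an assumption made for this example.

```r
# Hypothetical true and predicted labels (illustration only, not RWeka output).
truth     <- factor(c("yes", "yes", "no", "no", "yes", "no"))
predicted <- factor(c("yes", "no",  "no", "no", "yes", "yes"),
                    levels = levels(truth))

# Cross-classify true (rows) against predicted (columns) classes.
cm <- table(truth, predicted)

# Percentage of correctly classified instances:
# the diagonal holds the cases where truth and prediction agree.
pct_correct <- 100 * sum(diag(cm)) / sum(cm)
```

For an evaluation object e, the corresponding table is available as e$confusionMatrix, and names(e$details) lists the base statistics that were computed.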

Details

The function computes and extracts a non-redundant set of performance statistics that is suitable for model interpretation. By default the statistics are computed on the training data.
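
Because training-data statistics tend to be optimistic, it can help to compare them with statistics on held-out data supplied via newdata. The following is a sketch only: it uses the built-in iris data, and the 100/50 split and the seed are arbitrary choices made for illustration.

```r
library(RWeka)

# Split the built-in iris data into training and hold-out rows.
# The split size and seed are arbitrary illustration choices.
set.seed(1)
idx   <- sample(nrow(iris), 100)
train <- iris[idx, ]
test  <- iris[-idx, ]

# Fit on the training rows only.
m <- J48(Species ~ ., data = train)

# Statistics on the training data (the default, newdata = NULL).
e_train <- evaluate_Weka_classifier(m)

# Statistics on the hold-out rows.
e_test <- evaluate_Weka_classifier(m, newdata = test)
e_test$confusionMatrix
```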

Currently, argument ... only supports the logical variable normalize, which tells Weka to normalize the cost matrix so that the cost of a correct classification is zero.
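
As an illustration of what a zero cost for correct classifications means, here is a base R sketch that zeroes the diagonal of a cost matrix. The cost values are made up, and the column-wise subtraction is an assumption about one way to perform such a normalization, not a statement of Weka's exact implementation.

```r
# Cost matrix with nonzero costs on the diagonal (made-up values).
cost <- matrix(c(1, 3,
                 2, 1), ncol = 2, byrow = TRUE)

# Subtract each column's diagonal entry from that column,
# so every correct classification ends up with cost zero.
normalized <- sweep(cost, 2, diag(cost))
```

With a fitted classifier m, the option itself would be passed through the additional arguments, for example evaluate_Weka_classifier(m, cost = cost, normalize = TRUE).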

Note that if the class variable is numeric, only a subset of the statistics is available. Arguments complexity and class are then not applicable and are therefore ignored.

References

I. H. Witten and E. Frank (2005). Data Mining: Practical Machine Learning Tools and Techniques. 2nd Edition, Morgan Kaufmann, San Francisco.

Examples

# NOT RUN {
## Load the package and read some example data.
library(RWeka)
w <- read.arff(system.file("arff", "weather.nominal.arff",
                           package = "RWeka"))

## Fit a J48 decision tree.
m <- J48(play ~ ., data = w)
m

## Use 10-fold cross-validation with a cost matrix and a fixed seed.
e <- evaluate_Weka_classifier(m,
                              cost = matrix(c(0,2,1,0), ncol = 2),
                              numFolds = 10, complexity = TRUE,
                              seed = 123, class = TRUE)
e
summary(e)
e$details
# }
