class.eval(trues, preds, stats=if (is.null(benMtrx)) c('err') else c('err','totU'), benMtrx=NULL, allCls=levels(factor(trues)))
benMtrx
A matrix with the costs and benefits for all combinations of true values
and possible predictions, i.e. with dimension NC x NC, where NC is
the number of classes of the classification task being handled.
Both "acc" and "err" are based on the proportion of correct predictions. They are calculated as:
"acc": sum(I(t_i == p_i))/N, where the t_i are the true values, the p_i are the predictions, N is the number of predictions, and I() is an indicator function giving 1 if its argument is true and 0 otherwise. Note that "acc" is a value in the interval [0,1], with 1 corresponding to all predictions being correct.
"err": 1 - acc
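The two definitions above can be checked by hand in base R. The following is only a minimal sketch, not the internal code of class.eval; the vectors `trues` and `preds` are made-up illustration data, and the class labels "a", "b", "c" are assumptions.

```r
# Hypothetical true values and predictions (illustration data only)
cls   <- c("a", "b", "c")
trues <- factor(c("a", "b", "a", "c", "b"), levels = cls)
preds <- factor(c("a", "b", "c", "c", "a"), levels = cls)

# acc: proportion of cases where the indicator I(t_i == p_i) is 1
acc <- mean(as.character(trues) == as.character(preds))

# err: complement of the accuracy
err <- 1 - acc
```

Comparing the character labels rather than the factors themselves avoids an error when the two factors do not share the same set of levels.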
The "totU" metric takes into account not only whether the predictions are correct, but also the costs or benefits of these predictions. As mentioned above, it assumes that the user provides a fully specified matrix of costs and benefits, with benefits corresponding to correct predictions (t_i == p_i) and costs corresponding to erroneous predictions. These matrices are NC x NC square matrices, where NC is the number of possible values of the nominal target variable (i.e. the number of classes). The diagonal of these matrices corresponds to the correct predictions and should have positive values (benefits). The positions outside the diagonal correspond to prediction errors and should have negative values (costs). "totU" measures the total utility (sum of the costs and benefits) of the predictions of a classification model. It is calculated as:
"totU": sum(CB[t_i,p_i]), where CB is a cost/benefit matrix and CB[t_i,p_i] is the entry of this matrix corresponding to predicting class p_i for a true value of t_i.
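The sum above can be reproduced with base R matrix indexing. This is a sketch under stated assumptions rather than the function's actual implementation: the class labels and the vectors `trues` and `preds` are hypothetical, and rows of `CB` are taken to index true values and columns to index predictions, following the CB[t_i,p_i] notation.

```r
# Hypothetical class labels and cost/benefit matrix (rows = true, cols = predicted)
cls <- c("a", "b", "c")
CB  <- matrix(c( 10, -20, -20,
                -20,  20, -10,
                -20, -10,  20),
              nrow = 3, ncol = 3,
              dimnames = list(true = cls, pred = cls))

# Hypothetical true values and predictions (illustration data only)
trues <- factor(c("a", "b", "a", "c", "b"), levels = cls)
preds <- factor(c("a", "b", "c", "c", "a"), levels = cls)

# Index CB with a two-column matrix of (row, column) pairs, one pair per case,
# then sum the selected benefits and costs
totU <- sum(CB[cbind(as.integer(trues), as.integer(preds))])
```

Both factors are built with the same `levels` so that their integer codes line up with the rows and columns of `CB`.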
regr.eval
## Calculating several statistics of a classification tree on the Iris data
library(DMwR)  # provides class.eval and rpartXse
data(iris)
idx <- sample(1:nrow(iris),100)
train <- iris[idx,]
test <- iris[-idx,]
tree <- rpartXse(Species ~ .,train)
preds <- predict(tree,test,type='class')
## Calculate the accuracy and error rate
class.eval(test$Species,preds)
## Now calculate the total utility of the predictions
cbM <- matrix(c(10,-20,-20,-20,20,-10,-20,-10,20),3,3)
class.eval(test$Species,preds,"totU",cbM)