modelEval: Statistical evaluation of predictions

Description

Computes statistics that evaluate the quality of a model, using the predictions produced by predict.CoreModel and the correct labels.

Usage

modelEval(model=NULL, correctClass, predictedClass, 
          predictedProb=NULL, costMatrix=NULL, 
          priorClProb = NULL, avgTrainPrediction = NULL, beta = 1)

Arguments

model

The model structure as returned by CoreModel, or NULL if some other predictions are evaluated.

correctClass

A vector of correct class labels for classification problems, or of true function values for regression problems.

predictedClass

A vector of predicted class labels for classification problems, or of predicted function values for regression problems.

predictedProb

An optional matrix of predicted class probabilities for classification.

costMatrix

An optional cost matrix providing nonuniform costs for classification problems.

priorClProb

If model=NULL, a vector of prior class probabilities must be provided for classification problems.

avgTrainPrediction

If model=NULL, the mean of the prediction values on the training set must be provided for regression problems.

beta

For two-class problems, beta controls the relative importance of precision and recall in the F-measure.

Value

For classification problems the function returns a list with the components

accuracy

classification accuracy, for two class problems this would equal $$\rm{accuracy}=\frac{TP+TN}{TP+FN+FP+TN}$$

averageCost

average classification cost

informationScore

information score statistic measuring the information content of the predicted probabilities

AUC

Area under the ROC curve

predictionMatrix

matrix of misclassifications, also called the confusion matrix

sensitivity

sensitivity for two class problems (also called accuracy of the positive class, i.e., acc+, or true positive rate), $$\rm{sensitivity} = \frac{TP}{TP+FN}$$

specificity

specificity for two class problems (also called accuracy of the negative class, i.e., acc-, or true negative rate), $$\rm{specificity} = \frac{TN}{TN+FP}$$

brierScore

Brier score of predicted probabilities (the original Brier's definition which scores all the classes not only the correct one)

kappa

Cohen's kappa statistic measuring randomness of the predictions; for perfect predictions kappa=1, for completely random predictions kappa=0

precision

precision for two class problems $$\rm{precision} = \frac{TP}{TP+FP}$$

recall

recall for two class problems (the same as sensitivity)

F-measure

F-measure giving a weighted score of precision and recall for two class problems $$F= \frac{(1+\beta^2)\cdot \rm{recall} \cdot \rm{precision}}{\beta^2 \cdot \rm{recall} + \rm{precision}}$$

G-mean

geometric mean of positive and negative accuracy, $$G=\sqrt{\rm{sensitivity} \cdot \rm{specificity}} $$

KS

Kolmogorov-Smirnov statistic, defined for binary classification problems. It reports the distance between the distributions of the predicted positive-class probability for positive and for negative instances, see (Hand, 2005). The value 0 means no separation and the value 1 means perfect separation, $$KS = \max_t |TPR(t)-FPR(t)|$$ See the definitions of TPR and FPR below.

TPR

true positive rate $TPR = \frac{TP}{TP+FN}$ at the maximal value of the KS statistic

FPR

false positive rate $FPR = \frac{FP}{FP+TN}$ at the maximal value of the KS statistic
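The KS formula above can be illustrated with a minimal base R sketch. It is not the package's implementation: the function name ksStatistic, the toy scores, and the convention that an instance is predicted positive when its score is at least the threshold t are all assumptions made for illustration.

```r
# Hypothetical data: predicted probability of the positive class and true membership
probPos <- c(0.95, 0.80, 0.70, 0.55, 0.45, 0.30, 0.20, 0.10)
isPos   <- c(TRUE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE)

# Illustrative helper (not part of CORElearn): scan all observed scores as thresholds
ksStatistic <- function(probPos, isPos) {
  thresholds <- sort(unique(probPos))
  # assumed convention: predict positive when score >= t
  tpr <- sapply(thresholds, function(t) mean(probPos[isPos] >= t))
  fpr <- sapply(thresholds, function(t) mean(probPos[!isPos] >= t))
  dist <- abs(tpr - fpr)
  best <- which.max(dist)   # first threshold attaining the maximum
  list(KS = dist[best], TPR = tpr[best], FPR = fpr[best],
       threshold = thresholds[best])
}

ks <- ksStatistic(probPos, isPos)
print(ks)
```

The TPR and FPR reported alongside KS are the rates at the threshold where the distance is largest, mirroring the TPR and FPR components described above.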

For regression problems the returned list has the components

MSE

square root of Mean Squared Error

RMSE

Relative Mean Squared Error

MAE

Mean Absolute Error

RMAE

Relative Mean Absolute Error
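This page does not spell out how the relative errors are normalized. A common convention, assumed here and not confirmed by this documentation, divides the error of the evaluated predictions by the error of a constant predictor that always returns avgTrainPrediction. The sketch below uses that assumption; regressionErrors and its component names are hypothetical and deliberately differ from the component names above.

```r
# Illustrative helper (not part of CORElearn); the normalization is an assumption
regressionErrors <- function(correct, predicted, avgTrainPrediction) {
  err     <- correct - predicted            # errors of the evaluated predictions
  baseErr <- correct - avgTrainPrediction   # errors of the constant baseline
  list(meanSquared      = mean(err^2),
       relativeSquared  = sum(err^2) / sum(baseErr^2),
       meanAbsolute     = mean(abs(err)),
       relativeAbsolute = sum(abs(err)) / sum(abs(baseErr)))
}

# Hypothetical values
correct   <- c(3, 5, 7, 9)
predicted <- c(2, 5, 8, 9)
res <- regressionErrors(correct, predicted, avgTrainPrediction = 6)
print(res)
```

Under this convention a relative error below 1 means the predictions beat the constant baseline, and a value above 1 means they are worse than it.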

Details

The function uses the model structure returned by CoreModel, together with the predictedClass and predictedProb returned by predict.CoreModel. Predicted values are compared with the true values, and statistics measuring the quality of the predictions are computed. In classification, at most one of predictedClass and predictedProb may be NULL; the missing one is computed from the other under the assumption that the class label is assigned to the most probable class. Some of the returned statistics are defined only for two-class problems, for which the confusion matrix specifying the number of instances of each true/predicted class is defined as follows:

true/predicted class	positive	negative
positive	true positive (TP)	false negative (FN)
negative	false positive (FP)	true negative (TN)
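Writing TP, FN, FP and TN for the counts of true positives, false negatives, false positives and true negatives, the two-class formulas from the Value section can be sketched in base R as follows. The helper twoClassStats and the counts are hypothetical, and the kappa line uses the standard two-class form of Cohen's kappa rather than anything taken from the package source.

```r
# Illustrative helper (not part of CORElearn)
twoClassStats <- function(TP, FN, FP, TN, beta = 1) {
  n <- TP + FN + FP + TN
  accuracy    <- (TP + TN) / n
  sensitivity <- TP / (TP + FN)
  specificity <- TN / (TN + FP)
  precision   <- TP / (TP + FP)
  recall      <- sensitivity
  Fmeasure <- (1 + beta^2) * recall * precision / (beta^2 * recall + precision)
  Gmean    <- sqrt(sensitivity * specificity)
  # Cohen's kappa: observed agreement corrected for agreement expected by chance
  pExpected <- ((TP + FN) * (TP + FP) + (FP + TN) * (FN + TN)) / n^2
  kappa <- (accuracy - pExpected) / (1 - pExpected)
  list(accuracy = accuracy, sensitivity = sensitivity, specificity = specificity,
       precision = precision, recall = recall, Fmeasure = Fmeasure,
       Gmean = Gmean, kappa = kappa)
}

# Hypothetical counts
stats <- twoClassStats(TP = 40, FN = 10, FP = 20, TN = 30)
print(stats)
```

With beta = 1 the F-measure reduces to the harmonic mean of precision and recall; larger beta gives more weight to recall.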

An optional cost matrix can provide nonuniform costs for classification problems; for regression problems this parameter is ignored. The costs can differ from those used when building the model with CoreModel and when predicting with predict.CoreModel. The format of the matrix is costMatrix(true_class, predicted_class). If no costs are supplied, uniform costs are assumed, i.e., costMatrix(i, i) = 0 and costMatrix(i, j) = 1 for i not equal to j. See the example below.
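To make the costMatrix(true_class, predicted_class) layout concrete, here is a minimal base R sketch. It assumes that the average cost is the mean of costMatrix(true, predicted) over all evaluated instances; this page does not state that definition explicitly. The confusion counts and the helper averageCost are hypothetical.

```r
# Hypothetical 3-class confusion matrix: rows = true class, columns = predicted class
confusion <- matrix(c(50,  0,  0,
                       0, 45,  5,
                       0,  4, 46), nrow = 3, byrow = TRUE)

# Illustrative helper (not part of CORElearn): mean cost per instance
averageCost <- function(confusion, costMatrix) {
  sum(confusion * costMatrix) / sum(confusion)
}

# Uniform costs: 0 on the diagonal, 1 elsewhere
uniformCost <- 1 - diag(3)
avgUniform  <- averageCost(confusion, uniformCost)

# Nonuniform costs: misclassifying true class 3 costs 5 instead of 1
nonuniformCost <- uniformCost
nonuniformCost[3, 1] <- nonuniformCost[3, 2] <- 5
avgNonuniform <- averageCost(confusion, nonuniformCost)

print(c(uniform = avgUniform, nonuniform = avgNonuniform))
```

Under this assumption, uniform costs make the average cost equal to the error rate (1 - accuracy), while the nonuniform matrix penalizes the four class 3 instances predicted as class 2 five times as heavily.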

If a non-CORElearn model is evaluated, one should set model=NULL. In case of classification, a vector of prior class probabilities priorClProb must then be provided; in case of regression, avgTrainPrediction must be the mean of the prediction values (estimated, e.g., on a training set).
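As a hedged sketch of the model=NULL case, the code below prepares both extra inputs in base R and evaluates a linear regression fitted with lm. The train/test split of the built-in cars data is arbitrary, the ordering of priorClProb by factor levels is an assumption, and the modelEval call follows the Usage section but runs only if CORElearn is installed.

```r
# Classification: prior class probabilities estimated from training labels
# (assumption: entries are ordered by the factor levels)
trainLabels <- iris$Species
priorClProb <- as.numeric(table(trainLabels) / length(trainLabels))

# Regression: fit a non-CORElearn model on an arbitrary split of the cars data
train <- cars[1:40, ]
test  <- cars[41:50, ]
fit   <- lm(dist ~ speed, data = train)
pred  <- as.numeric(predict(fit, newdata = test))

# Mean of the prediction values on the training set
avgTrain <- mean(predict(fit, newdata = train))

# Evaluate the external predictions; skipped when CORElearn is not available
if (requireNamespace("CORElearn", quietly = TRUE)) {
  regEval <- CORElearn::modelEval(model = NULL, correctClass = test$dist,
                                  predictedClass = pred,
                                  avgTrainPrediction = avgTrain)
  print(regEval)
}
```

For classification, priorClProb would be passed the same way, together with the predicted labels and, if available, the matrix of predicted probabilities.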

References

Igor Kononenko, Matjaz Kukar: Machine Learning and Data Mining: Introduction to Principles and Algorithms. Horwood, 2007

David J. Hand: Good practice in retail credit scorecard assessment. Journal of the Operational Research Society, 56:1109-1117, 2005

Examples


# NOT RUN {
# use iris data

# build random forests model with certain parameters
model <- CoreModel(Species ~ ., iris, model="rf", 
              selectionEstimator="MDL",minNodeWeightRF=5,
              rfNoTrees=100, maxThreads=1)

# prediction with node distribution
pred <- predict(model, iris, rfPredictClass=FALSE)

# Model evaluation
mEval <- modelEval(model, iris[["Species"]], pred$class, pred$prob)
print(mEval)

# use nonuniform cost matrix
noClasses <- length(levels(iris[["Species"]]))
costMatrix <- 1 - diag(noClasses)
costMatrix[3,1] <- costMatrix[3,2] <- 5 # assume class 3 is more valuable  
mEvalCost <- modelEval(model, iris[["Species"]], pred$class, pred$prob, 
                       costMatrix=costMatrix)
print(mEvalCost)

destroyModels(model) # clean up

# }
