comperf: Compute model performance

Description

This function computes model performance given a vector of response values and a vector of predictions.

Usage

comperf(y, yhat, w = rep(1, length(y)), pfmc = NULL, cdfx = "fpr", cdfy = "tpr", cutoff = 0.5)

Arguments

y

a vector of numeric response values.

yhat

a vector of model predictions.

w

an optional vector of observation weights. By default, every observation has weight 1.

pfmc

a character string naming the performance metric to compute. For binary classification, pfmc accepts:

"acc": accuracy.
"dev": deviance.
"ks": Kolmogorov-Smirnov (KS) statistic.
"auc": area under the ROC curve. By default, the ROC curve plots the true positive rate (y-axis) against the false positive rate (x-axis). A different curve can be obtained by setting the cdfx and cdfy arguments described below.
"roc": ROC curve, by default the true positive rate vs. the false positive rate. A different curve can be obtained by setting the cdfx and cdfy arguments described below. If the cutoff argument is missing (default), the return value is a list with two components, x and y, representing the ROC curve. Otherwise, the return value is the ROC curve evaluated at the cutoff, either a single value or a vector of values.

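As a rough illustration of what three of these metrics measure, the sketch below computes accuracy, AUC, and the KS statistic by hand in base R. It is unweighted and is not taken from the package source, so comperf may differ in details such as tie handling; the variable names are chosen here for illustration only.

```r
# Illustrative base-R calculations; not the package's implementation.
y    <- c(0, 1, 0, 1, 1, 1)
yhat <- c(0.5, 0.9, 0.2, 0.7, 0.6, 0.4)

# Accuracy at a 0.5 cutoff: predictions > 0.5 are labelled positive.
acc <- mean((yhat > 0.5) == (y == 1))

# AUC as the share of (positive, negative) pairs ranked correctly,
# counting ties as one half (Mann-Whitney formulation).
pos <- yhat[y == 1]
neg <- yhat[y == 0]
auc <- mean(outer(pos, neg, ">") + 0.5 * outer(pos, neg, "=="))

# KS statistic: largest gap between the empirical CDFs of the
# predictions in the two classes, checked at every observed score.
grid <- sort(unique(yhat))
ks <- max(abs(ecdf(neg)(grid) - ecdf(pos)(grid)))

c(acc = acc, auc = auc, ks = ks)
```

With these six observations the sketch gives an accuracy of 5/6, an AUC of 0.875, and a KS statistic of 0.75.
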
For regression, pfmc accepts:

"mse": mean squared error.
"mae": mean absolute error.
"rsq": r-squared (coefficient of determination).

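The regression metrics can be sketched in base R as follows. How comperf applies the weights w, and which definition of r-squared it uses, are assumptions here: the sketch uses weighted means and the 1 - SSE/SST form.

```r
# Illustrative weighted regression metrics; not the package's implementation.
y    <- 1:10
yhat <- c(1:5 - 0.1, 6:10 + 0.1)
w    <- rep(1, length(y))  # equal weights, matching the default for w

res <- y - yhat

mse <- sum(w * res^2) / sum(w)      # mean squared error
mae <- sum(w * abs(res)) / sum(w)   # mean absolute error

# r-squared as 1 - SSE / SST, with a weighted mean of y as the baseline.
ybar <- sum(w * y) / sum(w)
rsq  <- 1 - sum(w * res^2) / sum(w * (y - ybar)^2)

c(mse = mse, mae = mae, rsq = rsq)
```

Every prediction is off by exactly 0.1, so the mean squared error is 0.01 and the mean absolute error is 0.1.
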
cdfx

a character string naming the cumulative distribution for the x-axis. Supported values are:

"fpr": false positive rate.
"fnr": false negative rate.
"rpp": rate of positive prediction.

cdfy

a character string naming the cumulative distribution for the y-axis. Supported values are:

"tpr": true positive rate.
"tnr": true negative rate.

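For reference, the five rates accepted by cdfx and cdfy can be computed at a single score threshold as in the base-R sketch below. The threshold of 0.5 and the variable names are illustrative assumptions; comperf evaluates these rates along the whole curve.

```r
# Illustrative confusion-matrix rates at one threshold; not the package's code.
y    <- c(0, 1, 0, 1, 1, 1)
yhat <- c(0.5, 0.9, 0.2, 0.7, 0.6, 0.4)
pred <- yhat > 0.5  # TRUE = predicted positive (hypothetical threshold)

tpr <- sum(pred & y == 1) / sum(y == 1)  # true positive rate
fpr <- sum(pred & y == 0) / sum(y == 0)  # false positive rate
tnr <- 1 - fpr                           # true negative rate
fnr <- 1 - tpr                           # false negative rate
rpp <- mean(pred)                        # rate of positive prediction

c(tpr = tpr, fpr = fpr, tnr = tnr, fnr = fnr, rpp = rpp)
```
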
cutoff

a value in [0, 1] used for binary classification. If pfmc="acc", an observation is predicted negative when its predicted probability is <= cutoff and positive when it is > cutoff. If pfmc="roc", cutoff is the point on the x-axis at which the curve defined by the cdfx and cdfy arguments (described above) is evaluated. For example, to obtain the true positive rate at a 5% false positive rate, specify pfmc="roc", cdfx="fpr", cdfy="tpr", and cutoff=0.05.
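
The "true positive rate at a 5% false positive rate" reading can be sketched in base R by sweeping a score threshold and taking the best true positive rate whose false positive rate stays within the limit. This treats the ROC curve as a step function, which is an assumption; comperf may interpolate differently, so its result need not match exactly.

```r
# Illustrative TPR at a fixed FPR; not the package's implementation.
y    <- c(0, 1, 0, 1, 1, 1)
yhat <- c(0.5, 0.9, 0.2, 0.7, 0.6, 0.4)

# Sweep thresholds from strict to lenient; -Inf labels everything positive.
thr <- c(sort(unique(yhat), decreasing = TRUE), -Inf)
fpr <- sapply(thr, function(t) mean(yhat[y == 0] > t))
tpr <- sapply(thr, function(t) mean(yhat[y == 1] > t))

# Highest TPR reachable while keeping FPR at or below 5%.
tpr_at_5pct <- max(tpr[fpr <= 0.05])
tpr_at_5pct
```
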

Value

A single numeric value or a vector of numeric values of model performance, or a list with two components, x and y, representing the ROC curve.

Examples

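# Binary classification: area under the ROC curve
y = c(0, 1, 0, 1, 1, 1)
yhat = c(0.5, 0.9, 0.2, 0.7, 0.6, 0.4)
comperf(y, yhat, pfmc = "auc")
# 0.875

# Regression: mean squared error (every prediction is off by 0.1)
y = 1:10
yhat = c(1:5 - 0.1, 6:10 + 0.1)
comperf(y, yhat, pfmc = "mse")
# 0.01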