
gbts (version 1.0.1)

comperf: Compute model performance

Description

This function computes model performance given a vector of response values and a vector of predictions.

Usage

comperf(y, yhat, w = rep(1, length(y)), pfmc = NULL, cdfx = "fpr", cdfy = "tpr", cutoff = 0.5)

Arguments

y
a vector of numeric response values.
yhat
a vector of model predictions.
w
an optional vector of observation weights. Defaults to a weight of 1 for every observation.
pfmc
a character string naming the performance metric to compute. For binary classification, pfmc accepts:
  • "acc": accuracy.
  • "dev": deviance.
  • "ks": Kolmogorov-Smirnov (KS) statistic.
  • "auc": area under the ROC curve. By default, the ROC curve plots the true positive rate (y-axis) against the false positive rate (x-axis). Set the cdfx and cdfy arguments described below to obtain a different curve.
  • "roc": ROC curve, by default the true positive rate against the false positive rate. Set the cdfx and cdfy arguments described below to obtain a different curve. If cutoff is not supplied (default), the return value is a list with two components, x and y, representing the ROC curve. Otherwise, the return value is the ROC curve evaluated at cutoff: a single value, or a vector of values if cutoff is a vector.
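
To make the binary classification metrics concrete, here is a minimal base R sketch of three of them. The helper names acc_at, auc_pairs, and ks_stat are hypothetical and are not part of gbts; unit weights are assumed, and the package's own computation (in particular for "ks" and for ties) may differ in detail.

```r
# Hypothetical helpers (not part of gbts); unit weights assumed throughout.

# Accuracy: predict positive when yhat > cutoff, negative otherwise.
acc_at <- function(y, yhat, cutoff = 0.5) {
  mean((yhat > cutoff) == (y == 1))
}

# AUC as the fraction of (positive, negative) pairs ranked correctly,
# counting ties as one half.
auc_pairs <- function(y, yhat) {
  pos <- yhat[y == 1]
  neg <- yhat[y == 0]
  d <- outer(pos, neg, "-")
  mean((d > 0) + 0.5 * (d == 0))
}

# KS statistic: largest gap between the empirical CDFs of the scores
# of the positive class and of the negative class.
ks_stat <- function(y, yhat) {
  f_pos <- ecdf(yhat[y == 1])
  f_neg <- ecdf(yhat[y == 0])
  s <- sort(unique(yhat))
  max(abs(f_pos(s) - f_neg(s)))
}

y <- c(0, 1, 0, 1, 1, 1)
yhat <- c(0.5, 0.9, 0.2, 0.7, 0.6, 0.4)
auc_pairs(y, yhat)  # 0.875, the same value as the "auc" example in Examples
acc_at(y, yhat)
ks_stat(y, yhat)
```

The pair-counting form of AUC is quadratic in the number of observations, so it is suited to illustration rather than to large data sets.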

For regression, pfmc accepts:

  • "mse": mean squared error.
  • "mae": mean absolute error.
  • "rsq": r-squared (coefficient of determination).
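
The regression metrics can be sketched in base R as follows. The helper reg_metrics is hypothetical and not part of gbts, and the way the weights w enter each formula (as weighted means) is an assumption rather than the package's documented behavior.

```r
# Hypothetical helper (not part of gbts); w defaults to unit weights.
# Assumption: weights enter as weighted means of the per-observation errors.
reg_metrics <- function(y, yhat, w = rep(1, length(y))) {
  res <- y - yhat
  ybar <- weighted.mean(y, w)
  c(
    mse = weighted.mean(res^2, w),                     # mean squared error
    mae = weighted.mean(abs(res), w),                  # mean absolute error
    rsq = 1 - sum(w * res^2) / sum(w * (y - ybar)^2)   # r-squared
  )
}

y <- 1:10
yhat <- c(1:5 - 0.1, 6:10 + 0.1)
reg_metrics(y, yhat)  # mse is 0.01, the same value as the "mse" example in Examples
```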

cdfx
a character string naming the cumulative distribution for the x-axis. Supported values are:
  • "fpr": false positive rate.
  • "fnr": false negative rate.
  • "rpp": rate of positive prediction.
cdfy
a character string naming the cumulative distribution for the y-axis. Supported values are:
  • "tpr": true positive rate.
  • "tnr": true negative rate.
cutoff
a value in [0, 1] used for binary classification. If pfmc="acc", a prediction is negative when the predicted probability is <= cutoff and positive when it is > cutoff. If pfmc="roc", cutoff is used together with the cdfx and cdfy arguments (described above), which specify the cumulative distributions for the x-axis and y-axis of the ROC curve. For example, to obtain the true positive rate at a 5% false positive rate, specify pfmc="roc", cdfx="fpr", cdfy="tpr", and cutoff=0.05.
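
As an illustration of what "the true positive rate at the 5% false positive rate" means, the following base R sketch computes the rates named by cdfx and cdfy at every score threshold and then reads off the best true positive rate whose false positive rate does not exceed a target. The helpers rates_by_threshold and tpr_at_fpr are hypothetical and not part of gbts. The sketch assumes unit weights and treats the curve as a step function without interpolation, which may not match how comperf evaluates the curve.

```r
# Hypothetical helpers (not part of gbts); unit weights assumed.
# For each distinct score threshold t (prediction is positive when yhat >= t),
# compute the rates named by the cdfx and cdfy arguments.
rates_by_threshold <- function(y, yhat) {
  thr <- sort(unique(yhat), decreasing = TRUE)
  pos <- y == 1
  t(sapply(thr, function(t) {
    pred <- yhat >= t
    tpr <- sum(pred & pos) / sum(pos)
    fpr <- sum(pred & !pos) / sum(!pos)
    c(threshold = t, tpr = tpr, fpr = fpr,
      tnr = 1 - fpr, fnr = 1 - tpr, rpp = mean(pred))
  }))
}

# Best TPR reachable while keeping FPR at or below a target value
# (step-function reading of the curve, no interpolation).
tpr_at_fpr <- function(y, yhat, target = 0.05) {
  r <- rates_by_threshold(y, yhat)
  ok <- r[, "fpr"] <= target
  if (any(ok)) max(r[ok, "tpr"]) else 0
}

y <- c(0, 1, 0, 1, 1, 1)
yhat <- c(0.5, 0.9, 0.2, 0.7, 0.6, 0.4)
rates_by_threshold(y, yhat)
tpr_at_fpr(y, yhat, target = 0.05)  # 0.75: three of the four positives score above every negative
```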

Value

A single numeric value or a numeric vector of model performance, or a list with two components, x and y, representing the ROC curve.

See Also

gbts, predict.gbts

Examples

library(gbts)

# Binary classification: area under the ROC curve
y <- c(0, 1, 0, 1, 1, 1)
yhat <- c(0.5, 0.9, 0.2, 0.7, 0.6, 0.4)
comperf(y, yhat, pfmc = "auc")
# 0.875

# Regression: mean squared error
y <- 1:10
yhat <- c(1:5 - 0.1, 6:10 + 0.1)
comperf(y, yhat, pfmc = "mse")
# 0.01
