For classification models, this function creates a 'calibration plot' that describes how consistent model probabilities are with observed event rates.

calibration(x, ...)

# S3 method for default
calibration(x, ...)

# S3 method for formula
calibration(
x,
data = NULL,
class = NULL,
cuts = 11,
subset = TRUE,
lattice.options = NULL,
...
)

# S3 method for calibration
print(x, ...)

# S3 method for calibration
xyplot(x, data = NULL, ...)

# S3 method for calibration
ggplot(data, ..., bwidth = 2, dwidth = 3)

x

a `lattice` formula (see `xyplot` for syntax) where the left-hand side of the formula is a factor class variable of the observed outcome and the right-hand side specifies one or more columns corresponding to a numeric ranking variable for a model (e.g. class probabilities). The classification variable should have two levels.

…

options to pass through to `xyplot` or the panel function (not used in `calibration.formula`).

data

For `calibration.formula`, a data frame (or more precisely, anything that is a valid `envir` argument in `eval`, e.g., a list or an environment) containing values for any variables in the formula, as well as `groups` and `subset` if applicable. If not found in `data`, or if `data` is unspecified, the variables are looked for in the environment of the formula. This argument is not used for `xyplot.calibration`. For `ggplot.calibration`, `data` should be an object of class "`calibration`".

class

a character string for the class of interest

cuts

If a single number, this indicates the number of splits of the data that are used to create the plot; the usage above shows a default of `cuts = 11`. If a vector, these are the actual cuts that will be used.

subset

An expression that evaluates to a logical or integer indexing vector. It is evaluated in `data`. Only the resulting rows of `data` are used for the plot.

lattice.options

A list that could be supplied to `lattice.options`

bwidth, dwidth

numeric values for the confidence interval bar width and the dodge width, respectively. A dodge is only used when multiple models are specified in the formula.

`calibration.formula` returns a list with elements:

the data used for plotting

the number of cuts

the event class

the names of the model probabilities

`xyplot.calibration` returns a lattice object.

`calibration.formula` is used to process the data and `xyplot.calibration` is used to create the plot.

To construct the calibration plot, the following steps are used for each model:

1. The data are split into `cuts - 1` roughly equal groups by their class probabilities.
2. The number of samples with true results equal to `class` is determined.
3. The event rate is determined for each bin.
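The steps above can be sketched in base R. This is only an illustration, not caret's internal code: the simulated data, the variable names and the use of equally spaced breakpoints on [0, 1] are all assumptions made for the example.

```r
# Illustrative sketch only: base R, not caret's implementation.
set.seed(1)
n     <- 500
prob  <- runif(n)                                  # hypothetical predicted probabilities
obs   <- factor(ifelse(runif(n) < prob, "event", "nonevent"),
                levels = c("event", "nonevent"))   # simulated observed classes
class <- "event"                                   # class of interest
cuts  <- 11

# Step 1: `cuts` equally spaced breakpoints give `cuts - 1` bins (assumed binning)
breaks <- seq(0, 1, length.out = cuts)
bin    <- cut(prob, breaks = breaks, include.lowest = TRUE)

# Step 2: samples per bin, and samples whose observed class equals `class`
total  <- as.vector(table(bin))
events <- as.vector(table(bin[obs == class]))

# Step 3: observed event rate per bin, expressed as a percentage
calib <- data.frame(
  bin      = levels(bin),
  midpoint = 100 * (head(breaks, -1) + tail(breaks, -1)) / 2,
  count    = total,
  events   = events,
  percent  = 100 * events / total
)
calib
```

Because the simulated outcomes are drawn from the predicted probabilities themselves, `percent` should track `midpoint` closely, which is what a well-calibrated model looks like.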

`xyplot.calibration` produces a plot of the observed event rate by the mid-point of the bins.

This implementation uses the lattice function `xyplot`, so plot elements can be changed via panel functions, `trellis.par.set` or other means. `calibration` uses the panel function `panel.calibration` by default, but it can be changed by passing that argument into `xyplot.calibration`.

The following elements are set by default in the plot but can be changed by passing new values into `xyplot.calibration`: `xlab = "Bin Midpoint"`, `ylab = "Observed Event Percentage"`, `type = "o"`, `ylim = extendrange(c(0, 100))`, `xlim = extendrange(c(0, 100))` and `panel = panel.calibration`.

For the `ggplot` method, confidence intervals on the estimated proportions (from `binom.test`) are also shown.
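As a rough illustration of the kind of interval `binom.test` produces for a single bin, the sketch below uses made-up counts and the function's default 95% confidence level. Whether the `ggplot` method uses that same level is an assumption here, not something stated above.

```r
# Hypothetical counts for one bin: 18 events among 40 samples
events <- 18
total  <- 40

# Exact binomial interval; conf.level defaults to 0.95
ci <- binom.test(x = events, n = total)$conf.int

# Express the estimate and its interval as percentages, matching the plot's scale
bin_ci <- data.frame(
  percent = 100 * events / total,
  lower   = 100 * ci[1],
  upper   = 100 * ci[2]
)
bin_ci
```

Bins with few samples give wide intervals, so the error bars help show which points on the calibration curve are poorly estimated.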

# NOT RUN {
library(caret)

# Load the example data and drop near-zero-variance and highly correlated descriptors
data(mdrr)
mdrrDescr <- mdrrDescr[, -nearZeroVar(mdrrDescr)]
mdrrDescr <- mdrrDescr[, -findCorrelation(cor(mdrrDescr), .5)]

# Split into training and test sets
inTrain <- createDataPartition(mdrrClass)
trainX <- mdrrDescr[inTrain[[1]], ]
trainY <- mdrrClass[inTrain[[1]]]

testX <- mdrrDescr[-inTrain[[1]], ]
testY <- mdrrClass[-inTrain[[1]]]

# Fit two models and collect their test-set class probabilities
library(MASS)

ldaFit <- lda(trainX, trainY)
qdaFit <- qda(trainX, trainY)

testProbs <- data.frame(obs = testY,
                        lda = predict(ldaFit, testX)$posterior[,1],
                        qda = predict(qdaFit, testX)$posterior[,1])

# Compute the calibration data and plot it
calibration(obs ~ lda + qda, data = testProbs)

calPlotData <- calibration(obs ~ lda + qda, data = testProbs)
calPlotData

xyplot(calPlotData, auto.key = list(columns = 2))
# }