# calibration

##### Probability Calibration Plot

For classification models, this function creates a 'calibration plot' that describes how consistent model probabilities are with observed event rates.

- Keywords
- hplot

##### Usage

```
calibration(x, ...)

# S3 method for default
calibration(x, ...)

# S3 method for formula
calibration(x, data = NULL, class = NULL,
  cuts = 11, subset = TRUE, lattice.options = NULL, ...)

# S3 method for calibration
print(x, ...)

# S3 method for calibration
xyplot(x, data = NULL, ...)

# S3 method for calibration
ggplot(data, ..., bwidth = 2, dwidth = 3)
```

##### Arguments

- x
a `lattice` formula (see `xyplot` for syntax) where the left-hand side of the formula is a factor class variable of the observed outcome and the right-hand side specifies one or more model columns corresponding to a numeric ranking variable for a model (e.g. class probabilities). The classification variable should have two levels.

- …
options to pass through to `xyplot` or the panel function (not used in `calibration.formula`).

- data
For `calibration.formula`, a data frame (or more precisely, anything that is a valid `envir` argument in `eval`, e.g., a list or an environment) containing values for any variables in the formula, as well as `groups` and `subset` if applicable. If not found in `data`, or if `data` is unspecified, the variables are looked for in the environment of the formula. This argument is not used for `xyplot.calibration`. For `ggplot.calibration`, `data` should be an object of class "`calibration`".

- class
a character string for the class of interest

- cuts
If a single number, this indicates the number of splits of the data that are used to create the plot; the default in the usage above is `cuts = 11`. If a vector, these are the actual cuts that will be used.

- subset
An expression that evaluates to a logical or integer indexing vector. It is evaluated in `data`. Only the resulting rows of `data` are used for the plot.

- lattice.options
A list that could be supplied to `lattice.options`

- bwidth, dwidth
a numeric value for the confidence interval bar width and dodge width, respectively. In the latter case, a dodge is only used when multiple models are specified in the formula.

##### Details

`calibration.formula` is used to process the data and `xyplot.calibration` is used to create the plot.

To construct the calibration plot, the following steps are used for each model:

- The data are split into `cuts - 1` roughly equal groups by their class probabilities.
- The number of samples with true results equal to `class` is determined.
- The event rate is determined for each bin.
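The steps above can be sketched in base R. This is a minimal illustration under stated assumptions, not caret's implementation: `calibration_sketch` is a hypothetical helper, the data are made up, and the bins are assumed to be equal-width intervals on [0, 1] defined by `cuts` break points, so that there are `cuts - 1` bins. The package may place its breaks differently.

```r
# Hypothetical helper (not part of caret): bin the class probabilities and
# compute the observed event percentage in each bin.
calibration_sketch <- function(prob, obs, class = levels(obs)[1], cuts = 11) {
  # Assumption: `cuts` equally spaced break points on [0, 1], which gives
  # `cuts - 1` equal-width bins.
  breaks <- seq(0, 1, length.out = cuts)
  bin <- cut(prob, breaks = breaks, include.lowest = TRUE)

  is_event <- obs == class
  count  <- as.vector(table(bin))            # samples per bin
  events <- as.vector(table(bin[is_event]))  # samples per bin with class == `class`

  data.frame(
    bin      = levels(bin),
    midpoint = 100 * (head(breaks, -1) + tail(breaks, -1)) / 2,
    count    = count,
    events   = events,
    # Event rate as a percentage; NA for bins that contain no samples
    percent  = ifelse(count > 0, 100 * events / count, NA_real_)
  )
}

# Made-up data: `prob` is the predicted probability of the first level, "yes"
prob <- c(0.05, 0.15, 0.15, 0.95)
obs  <- factor(c("no", "yes", "no", "yes"), levels = c("yes", "no"))
res  <- calibration_sketch(prob, obs)
print(res)
```

Plotting `percent` against `midpoint` for such a table gives the kind of curve described here; points near the diagonal indicate probabilities that agree with the observed event rates.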

`xyplot.calibration` produces a plot of the observed event rate by the mid-point of the bins.

This implementation uses the lattice function `xyplot`, so plot elements can be changed via panel functions, `trellis.par.set` or other means. `calibration` uses the panel function `panel.calibration` by default, but it can be changed by passing that argument into `xyplot.calibration`.

The following elements are set by default in the plot but can be changed by passing new values into `xyplot.calibration`: `xlab = "Bin Midpoint"`, `ylab = "Observed Event Percentage"`, `type = "o"`, `ylim = extendrange(c(0, 100))`, `xlim = extendrange(c(0, 100))` and `panel = panel.calibration`.
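As a sketch of overriding those defaults, the example below simulates probabilities from a hypothetical, well-calibrated model and passes new `xlab`, `ylab` and `type` values to the `xyplot` method. The data, the column names (`obs`, `model`) and the label text are made up for illustration; it assumes the caret and lattice packages are installed.

```r
library(caret)    # calibration() and its xyplot method
library(lattice)  # the xyplot generic

set.seed(42)
n <- 500

# Simulated data: the event occurs with probability equal to the predicted value
prob <- runif(n)
obs <- factor(ifelse(runif(n) < prob, "event", "nonevent"),
              levels = c("event", "nonevent"))
sim <- data.frame(obs = obs, model = prob)

# Compute the calibration data for the "event" class
cal <- calibration(obs ~ model, data = sim, class = "event", cuts = 6)

# Replace the default axis labels and point/line type
p <- xyplot(cal,
            xlab = "Predicted probability bin (midpoint, %)",
            ylab = "Observed events (%)",
            type = "b")
print(p)
```

Other lattice arguments, such as `auto.key` or a custom `panel` function, can be supplied in the same way.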

For the `ggplot` method, confidence intervals on the estimated proportions (from `binom.test`) are also shown.
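To illustrate what such an interval looks like for a single bin, here is a small base R sketch. The counts are made up, and it uses the default 95% level of `binom.test`; the level used by the `ggplot` method is not stated here, so treat that as an assumption.

```r
# Made-up counts for one bin: 7 events among 20 samples
events <- 7
n <- 20

# Exact binomial test; the conf.int element holds the interval on the proportion
ci <- binom.test(events, n)$conf.int

# Point estimate and interval as percentages, matching the plot's y-axis scale
c(estimate = 100 * events / n, lower = 100 * ci[1], upper = 100 * ci[2])
```

Bins with few samples produce wide intervals, which is worth keeping in mind when reading the tails of a calibration plot.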

##### Value

`calibration.formula` returns a list with elements:

- `data`: the data used for plotting
- `cuts`: the number of cuts
- `class`: the event class
- `probNames`: the names of the model probabilities

`xyplot.calibration` returns a lattice object.

##### Examples

```
library(caret)  # calibration(), createDataPartition(), nearZeroVar(), findCorrelation()
library(MASS)   # lda(), qda()

set.seed(1)  # make the data partition reproducible

data(mdrr)

# Drop near-zero-variance and highly correlated descriptors
mdrrDescr <- mdrrDescr[, -nearZeroVar(mdrrDescr)]
mdrrDescr <- mdrrDescr[, -findCorrelation(cor(mdrrDescr), .5)]

# Split the data into training and test sets
inTrain <- createDataPartition(mdrrClass)
trainX <- mdrrDescr[inTrain[[1]], ]
trainY <- mdrrClass[inTrain[[1]]]
testX <- mdrrDescr[-inTrain[[1]], ]
testY <- mdrrClass[-inTrain[[1]]]

# Fit two models to compare
ldaFit <- lda(trainX, trainY)
qdaFit <- qda(trainX, trainY)

# Test-set probabilities for the first class level, one column per model
testProbs <- data.frame(obs = testY,
                        lda = predict(ldaFit, testX)$posterior[, 1],
                        qda = predict(qdaFit, testX)$posterior[, 1])

# Compute the calibration data, print it, then plot it
calPlotData <- calibration(obs ~ lda + qda, data = testProbs)
calPlotData
xyplot(calPlotData, auto.key = list(columns = 2))
```

*Documentation reproduced from package caret, version 6.0-84, License: GPL (>= 2)*