xyplot.resamples
Lattice Functions for Visualizing Resampling Results
Lattice and ggplot functions for visualizing resampling results across models
- Keywords
- hplot
Usage
# S3 method for resamples
xyplot(
x,
data = NULL,
what = "scatter",
models = NULL,
metric = x$metric[1],
units = "min",
...
)
# S3 method for resamples
parallelplot(x, data = NULL, models = x$models, metric = x$metric[1], ...)
# S3 method for resamples
splom(
x,
data = NULL,
variables = "models",
models = x$models,
metric = NULL,
panelRange = NULL,
...
)
# S3 method for resamples
densityplot(x, data = NULL, models = x$models, metric = x$metric, ...)
# S3 method for resamples
bwplot(x, data = NULL, models = x$models, metric = x$metric, ...)
# S3 method for resamples
dotplot(
x,
data = NULL,
models = x$models,
metric = x$metric,
conf.level = 0.95,
...
)
# S3 method for resamples
ggplot(
data = NULL,
mapping = NULL,
environment = NULL,
models = data$models,
metric = data$metric[1],
conf.level = 0.95,
...
)
Arguments
- x
an object generated by resamples
- data
Only used for the ggplot method; an object generated by resamples
- what
for xyplot, the type of plot. Valid options are: "scatter" (a plot of the resampled results between two models), "BlandAltman" (a Bland-Altman, aka MA, plot between two models), "tTime" (the total time to run train versus the metric), "mTime" (the time to build the final model) or "pTime" (the time to predict samples; see the timingSamps options in trainControl, rfeControl, or sbfControl)
- models
a character string for which models to plot. Note: xyplot requires one or two models whereas the other methods can plot more than two.
- metric
a character string for which metrics to use as conditioning variables in the plot. splom requires exactly one metric when variables = "models" and at least two when variables = "metrics".
- units
either "sec", "min" or "hour"; when what is either "tTime", "mTime" or "pTime", how should the timings be scaled?
- …
further arguments to pass to either histogram, densityplot, xyplot, dotplot or splom
- variables
either "models" or "metrics"; which variable should be treated as the scatter plot variables?
- panelRange
a common range for the panels. If NULL, the panel ranges are derived from the values across all the models
- conf.level
the confidence level for intervals about the mean (obtained using t.test)
- mapping, environment
Not used.
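The quantities behind the "scatter" and "BlandAltman" options, and the effect of units, can be sketched in base R. This is only an illustration under stated assumptions: the RMSE values and the names rmse_a, rmse_b and elapsed_sec are hypothetical, the Bland-Altman layout follows the standard definition (difference against average) rather than being copied from the package source, and the timing is assumed to be recorded in seconds.

```r
# Hypothetical RMSE values for two models evaluated on the same five resamples
rmse_a <- c(2.1, 2.4, 1.9, 2.6, 2.2)
rmse_b <- c(2.3, 2.2, 2.0, 2.9, 2.5)

# what = "scatter": one point per resample, model A against model B
scatter_xy <- data.frame(model_a = rmse_a, model_b = rmse_b)

# what = "BlandAltman": per-resample difference against per-resample average
ba <- data.frame(
  Average    = (rmse_a + rmse_b) / 2,
  Difference = rmse_a - rmse_b
)
print(ba)

# units: rescale an elapsed time (assumed to be in seconds)
elapsed_sec <- 90
scaled <- elapsed_sec / c(sec = 1, min = 60, hour = 3600)
print(scaled)
```

A Bland-Altman view is often easier to read than a plain scatter when two models perform similarly, because systematic differences show up as an offset from zero.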
Details
The ideas and methods here are based on Hothorn et al. (2005) and Eugster et al. (2008).
dotplot and ggplot plot the average performance value (with two-sided confidence limits) for each model and metric. densityplot and bwplot display univariate visualizations of the resampling distributions while splom shows the pair-wise relationships.
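As a rough illustration of the interval that dotplot and ggplot display, the sketch below computes a mean and two-sided confidence limits with t.test for a single model and metric. The rmse vector is hypothetical data, and the code shows only the kind of calculation described above, not the package's internal implementation.

```r
# Hypothetical RMSE values for one model across five resamples
rmse <- c(2.1, 2.4, 1.9, 2.6, 2.2)

# Two-sided t interval about the mean; conf.level plays the same role
# as the conf.level argument documented above
tt <- t.test(rmse, conf.level = 0.95)

interval <- c(
  estimate = unname(tt$estimate),
  lower    = tt$conf.int[1],
  upper    = tt$conf.int[2]
)
print(interval)
```

Raising conf.level widens the interval and lowering it narrows the interval, which changes the length of the bars drawn for each model.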
Value
a lattice object (a ggplot object for the ggplot method)
References
Hothorn et al. The design and analysis of benchmark experiments. Journal of Computational and Graphical Statistics (2005) vol. 14 (3) pp. 675-699
Eugster et al. Exploratory and inferential analysis of benchmark experiments. Ludwig-Maximilians-Universität München, Department of Statistics, Tech. Rep. (2008) vol. 30
See Also
Examples
# NOT RUN {
#load(url("http://topepo.github.io/caret/exampleModels.RData"))
resamps <- resamples(list(CART = rpartFit,
                          CondInfTree = ctreeFit,
                          MARS = earthFit))

dotplot(resamps,
        scales = list(x = list(relation = "free")),
        between = list(x = 2))

bwplot(resamps,
       metric = "RMSE")

densityplot(resamps,
            auto.key = list(columns = 3),
            pch = "|")

xyplot(resamps,
       models = c("CART", "MARS"),
       metric = "RMSE")

splom(resamps, metric = "RMSE")
splom(resamps, variables = "metrics")

parallelplot(resamps, metric = "RMSE")
# }
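The example above relies on fitted models loaded from an external file. As an alternative, the following sketch fits two small models itself so that the plotting calls can be tried directly. It assumes the caret and rpart packages are installed; the use of mtcars, the formula, the five-fold setup and the object names lmFit, cartFit and ctrl are illustrative choices, not part of the original example.

```r
library(caret)

set.seed(1)
# Shared training indices so both models are resampled on identical splits
folds <- createFolds(mtcars$mpg, k = 5, returnTrain = TRUE)
ctrl <- trainControl(method = "cv", index = folds)

# Two illustrative regression models fitted with the same resampling scheme
lmFit <- train(mpg ~ wt + hp, data = mtcars, method = "lm", trControl = ctrl)
cartFit <- train(mpg ~ wt + hp, data = mtcars, method = "rpart", trControl = ctrl)

# Collect the resampling results for plotting
resamps <- resamples(list(LM = lmFit, CART = cartFit))

bwplot(resamps, metric = "RMSE")
xyplot(resamps, models = c("LM", "CART"), metric = "RMSE", what = "BlandAltman")
ggplot(resamps, metric = "RMSE")
```

Passing the same index to both train calls keeps the resamples paired, which is what makes the between-model comparisons in xyplot and splom meaningful.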