Lattice and ggplot functions for visualizing resampling results across models
# S3 method for resamples
xyplot(x, data = NULL, what = "scatter",
models = NULL, metric = x$metric[1], units = "min", ...)
# S3 method for resamples
parallelplot(x, data = NULL, models = x$models,
metric = x$metric[1], ...)
# S3 method for resamples
splom(x, data = NULL, variables = "models",
models = x$models, metric = NULL, panelRange = NULL, ...)
# S3 method for resamples
densityplot(x, data = NULL, models = x$models,
metric = x$metric, ...)
# S3 method for resamples
bwplot(x, data = NULL, models = x$models,
metric = x$metric, ...)
# S3 method for resamples
dotplot(x, data = NULL, models = x$models,
metric = x$metric, conf.level = 0.95, ...)
# S3 method for resamples
ggplot(data = NULL, mapping = NULL,
environment = NULL, models = data$models, metric = data$metric[1],
conf.level = 0.95, ...)
x: an object generated by resamples
data: only used for the ggplot method; an object generated by resamples
what: for xyplot, the type of plot. Valid options are: "scatter" (for a plot of the resampled results between two models), "BlandAltman" (a Bland-Altman, also known as MA, plot between two models), "tTime" (for the total time to run train versus the metric), "mTime" (for the time to build the final model) or "pTime" (the time to predict samples; see the timingSamps options in trainControl, rfeControl, or sbfControl)
models: a character string for which models to plot. Note: xyplot requires one or two models whereas the other methods can plot more than two.
metric: a character string for which metrics to use as conditioning variables in the plot. splom requires exactly one metric when variables = "models" and at least two when variables = "metrics".
units: either "sec", "min" or "hour"; when what is either "tTime", "mTime" or "pTime", how should the timings be scaled?
...: further arguments to pass to either histogram, densityplot, xyplot, dotplot or splom
variables: either "models" or "metrics"; which variable should be treated as the scatter plot variables?
panelRange: a common range for the panels. If NULL, the panel ranges are derived from the values across all the models
conf.level: the confidence level for intervals about the mean (obtained using t.test)
mapping, environment: Not used.
Value: a lattice object
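For what = "BlandAltman", xyplot compares two models on the same resamples. The quantities behind such a plot can be sketched in base R as below. This is an illustration only, not caret's internal code: the RMSE values are made-up numbers, the names cart_rmse, mars_rmse and ba are hypothetical, and the limits of agreement follow the usual Bland-Altman convention, which caret's own panel may or may not draw.

```r
# Hypothetical resampled RMSE values for two models over the same 10 resamples
# (illustrative numbers only, not output from caret)
cart_rmse <- c(4.1, 3.8, 4.5, 4.0, 3.9, 4.3, 4.2, 3.7, 4.4, 4.0)
mars_rmse <- c(3.6, 3.5, 4.1, 3.9, 3.4, 3.8, 4.0, 3.3, 3.9, 3.7)

# Bland-Altman coordinates: per-resample average and difference
ba <- data.frame(
  Average    = (cart_rmse + mars_rmse) / 2,
  Difference = cart_rmse - mars_rmse
)

# Mean difference and conventional limits of agreement (mean +/- 1.96 SD)
bias   <- mean(ba$Difference)
limits <- bias + c(-1.96, 1.96) * sd(ba$Difference)

# Difference on the vertical axis against the average on the horizontal axis
plot(ba$Average, ba$Difference,
     xlab = "Average RMSE", ylab = "Difference in RMSE (CART - MARS)")
abline(h = bias, lty = 1)
abline(h = limits, lty = 2)
```

Points scattered around a horizontal line away from zero suggest a consistent performance gap between the two models across resamples.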
The ideas and methods here are based on Hothorn et al. (2005) and Eugster et al. (2008).
dotplot and ggplot plot the average performance value (with two-sided confidence limits) for each model and metric. densityplot and bwplot display univariate visualizations of the resampling distributions, while splom shows the pair-wise relationships.
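The confidence limits shown by dotplot and ggplot are described as coming from t.test. A minimal base-R sketch of that calculation for a single model and metric follows; the RMSE values are made-up numbers and the variable names are hypothetical, so this shows the kind of interval involved rather than caret's exact code path.

```r
# Hypothetical resampled RMSE values for one model (illustrative numbers only)
rmse <- c(4.1, 3.8, 4.5, 4.0, 3.9, 4.3, 4.2, 3.7, 4.4, 4.0)
conf_level <- 0.95

# One-sample t-test; only the estimate and the two-sided interval are used
tt  <- t.test(rmse, conf.level = conf_level)
est <- unname(tt$estimate)
ci  <- as.numeric(tt$conf.int)

# The same interval by hand: mean +/- t quantile * standard error
n          <- length(rmse)
half_width <- qt(1 - (1 - conf_level) / 2, df = n - 1) * sd(rmse) / sqrt(n)
manual     <- mean(rmse) + c(-1, 1) * half_width

print(c(mean = est, lower = ci[1], upper = ci[2]))
```

Raising conf.level widens the interval. With only a handful of resamples the t quantile is noticeably larger than the normal one, so the limits are wider than a normal approximation would give.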
Hothorn et al. The design and analysis of benchmark experiments. Journal of Computational and Graphical Statistics (2005) vol. 14 (3) pp. 675-699
Eugster et al. Exploratory and inferential analysis of benchmark experiments. Ludwig-Maximilians-Universität München, Department of Statistics, Tech. Rep. (2008) vol. 30
# NOT RUN {
#load(url("http://topepo.github.io/caret/exampleModels.RData"))
resamps <- resamples(list(CART = rpartFit,
                          CondInfTree = ctreeFit,
                          MARS = earthFit))

dotplot(resamps,
        scales = list(x = list(relation = "free")),
        between = list(x = 2))

bwplot(resamps,
       metric = "RMSE")

densityplot(resamps,
            auto.key = list(columns = 3),
            pch = "|")

xyplot(resamps,
       models = c("CART", "MARS"),
       metric = "RMSE")

splom(resamps, metric = "RMSE")
splom(resamps, variables = "metrics")

parallelplot(resamps, metric = "RMSE")
# }
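The example above relies on rpartFit, ctreeFit and earthFit from the commented-out exampleModels.RData download. As a self-contained alternative, the sketch below builds a small resamples object from two linear models fit to the built-in mtcars data. It assumes the caret package is installed; the model names Small and Large, the object names, and the seed value are arbitrary choices for illustration.

```r
library(caret)

# Shared resampling scheme: 5-fold cross-validation
ctrl <- trainControl(method = "cv", number = 5)

# Resetting the seed before each call gives both models the same folds,
# so their resampled results can be compared resample by resample
set.seed(101)
smallFit <- train(mpg ~ wt, data = mtcars,
                  method = "lm", trControl = ctrl)
set.seed(101)
largeFit <- train(mpg ~ wt + hp + qsec, data = mtcars,
                  method = "lm", trControl = ctrl)

# Collect the resampling results and summarize them
resamps <- resamples(list(Small = smallFit, Large = largeFit))
print(summary(resamps))

# print() is needed for lattice objects when the code is run as a script
print(bwplot(resamps, metric = "RMSE"))
print(dotplot(resamps, metric = "RMSE", conf.level = 0.90))
```

Because xyplot accepts one or two models, this two-model object can also be passed to xyplot(resamps, metric = "RMSE") for a direct scatter plot of the paired results.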