Inferential Assessments About Model Performance
Methods for making inferences about differences between models
## S3 method for class 'resamples':
diff(x, models = x$models, metric = x$metrics, test = t.test,
     confLevel = 0.95, adjustment = "bonferroni", ...)

## S3 method for class 'diff.resamples':
summary(object, digits = max(3, getOption("digits") - 3), ...)
The ideas and methods here are based on Hothorn et al. (2005) and Eugster et al. (2008).
For each metric, all pair-wise differences are computed and tested to assess if the difference is equal to zero.
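The pair-wise testing idea can be sketched in base R alone. This is a toy illustration with made-up accuracy values, not caret's internal code; the model names and numbers are hypothetical:

```r
# Hypothetical resampled accuracy values for two models, one row per resample
acc <- data.frame(
  CART = c(0.81, 0.79, 0.84, 0.80, 0.82),
  MARS = c(0.85, 0.83, 0.86, 0.84, 0.85)
)

# The pair-wise difference for this model pair, one value per resample
d <- acc$CART - acc$MARS

# A paired t-test asks whether the mean difference is zero
t.test(d)  # equivalent to t.test(acc$CART, acc$MARS, paired = TRUE)
```

With more than two models, the same computation is repeated for every pair of models and every metric.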
When a Bonferroni correction is used, the confidence level is changed from confLevel to 1 - ((1 - confLevel)/p), where p is the number of pair-wise comparisons being made. For other correction methods, no such change is made.
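As a quick numeric sketch of the Bonferroni adjustment (assuming the standard 1 - (1 - confLevel)/p form, and that the adjustment string is of the kind accepted by stats::p.adjust):

```r
confLevel <- 0.95
p <- choose(3, 2)  # 3 pair-wise comparisons among 3 models
adjLevel <- 1 - ((1 - confLevel) / p)
adjLevel  # 1 - 0.05/3, roughly 0.983

# The analogous correction on p-values multiplies each by p (capped at 1)
p.adjust(c(0.01, 0.02, 0.04), method = "bonferroni")
```

The adjusted level widens each interval so that the family of intervals jointly retains roughly the nominal coverage.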
An object of class 'diff.resamples' with elements:
- call: the call
- difs: a list for each metric being compared; each element contains a matrix with differences in columns and resamples in rows
- statistics: a list of test results generated by the function given in the test argument
- adjustment: the p-value adjustment method that was used
- models: a character string of the models that were compared
- metrics: a character string of the performance metrics that were used
The summary method returns an object of class 'summary.diff.resamples' with elements:
- call: the call
- table: a list of tables that show the differences and p-values
Hothorn et al. (2005). The design and analysis of benchmark experiments. Journal of Computational and Graphical Statistics, 14(3), 675-699.
Eugster et al. (2008). Exploratory and inferential analysis of benchmark experiments. Ludwig-Maximilians-Universität München, Department of Statistics, Technical Report 30.
#load(url("http://caret.r-forge.r-project.org/Classification_and_Regression_Training_files/exampleModels.RData"))
resamps <- resamples(list(CART = rpartFit, CondInfTree = ctreeFit, MARS = earthFit))
difs <- diff(resamps)
difs
summary(difs)