Inferential Assessments About Model Performance

Methods for making inferences about differences between models

## S3 method for class 'resamples':
diff(x, models = x$models, metric = x$metrics,
     test = t.test,
     confLevel = 0.95, adjustment = "bonferroni",
     ...)

## S3 method for class 'diff.resamples':
summary(object, digits = max(3, getOption("digits") - 3), ...)


The ideas and methods here are based on Hothorn et al. (2005) and Eugster et al. (2008).

For each metric, all pair-wise differences are computed and tested to assess if the difference is equal to zero.
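The pair-wise comparison described above can be sketched in base R. This is an illustration under stated assumptions, not the package's internal code: the `perf` matrix holds simulated (hypothetical) accuracies, and the name `pair_names` and the `.diff.` separator are chosen only for this example.

```r
set.seed(1)

# Hypothetical resampled accuracies: 10 resamples (rows) x 3 models (columns)
perf <- cbind(CART        = rnorm(10, mean = 0.80, sd = 0.02),
              CondInfTree = rnorm(10, mean = 0.81, sd = 0.02),
              MARS        = rnorm(10, mean = 0.85, sd = 0.02))

# Every unordered pair of models, one pair per column
pairs <- combn(colnames(perf), 2)
pair_names <- apply(pairs, 2, paste, collapse = ".diff.")

# Matrix with differences in columns and resamples in rows
difs <- apply(pairs, 2, function(pr) perf[, pr[1]] - perf[, pr[2]])
colnames(difs) <- pair_names

# One-sample t-test of each difference against zero, then a
# Bonferroni adjustment of the resulting p-values
p_raw <- apply(difs, 2, function(d) t.test(d)$p.value)
p_adj <- p.adjust(p_raw, method = "bonferroni")
```

Because the resamples are matched across models, testing each column of differences against zero is a paired comparison.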

When a Bonferroni correction is used, the confidence level is changed from confLevel to 1-((1-confLevel)/p), where p is the number of pair-wise comparisons being made. For other correction methods, no such change is made.
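As a worked example of the Bonferroni formula, assume three models are being compared, which gives three pair-wise comparisons. The variable names below are illustrative only.

```r
conf_level <- 0.95
n_models   <- 3

# Number of pair-wise comparisons among the models
p <- choose(n_models, 2)

# Bonferroni-adjusted confidence level: 1 - ((1 - confLevel) / p)
adjusted_level <- 1 - (1 - conf_level) / p
```

With these values the individual intervals are built at a level of roughly 0.983 rather than 0.95.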


An object of class "diff.resamples" with elements:

  • call: the call
  • difs: a list for each metric being compared. Each list contains a matrix with differences in columns and resamples in rows
  • statistics: a list of results generated by test
  • adjustment: the p-value adjustment used
  • models: a character string for which models were compared
  • metrics: a character string of performance metrics that were used

or an object of class "summary.diff.resamples" with elements:

  • call: the call
  • table: a list of tables that show the differences and p-values


Hothorn et al. The design and analysis of benchmark experiments. Journal of Computational and Graphical Statistics (2005) vol. 14 (3) pp. 675-699.

Eugster et al. Exploratory and inferential analysis of benchmark experiments. Ludwig-Maximilians-Universität München, Department of Statistics, Tech. Rep. (2008) vol. 30.

See Also

resamples, dotplot.diff.resamples, densityplot.diff.resamples, bwplot.diff.resamples, levelplot.diff.resamples


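library(caret)

# rpartFit, ctreeFit and earthFit are assumed to be existing train() fits
# that were resampled on the same data splits
resamps <- resamples(list(CART = rpartFit,
                          CondInfTree = ctreeFit,
                          MARS = earthFit))

# All pair-wise differences between the models, for each metric
difs <- diff(resamps)
summary(difs)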


