
openair (version 0.9-2)

modStats: Calculate common model evaluation statistics

Description

Function to calculate common numerical model evaluation statistics with flexible conditioning

Usage

modStats(mydata, mod = "mod", obs = "obs", statistic = c("n", "FAC2",
  "MB", "MGE", "NMB", "NMGE", "RMSE", "r", "COE"), type = "default",
  rank.name = NULL, ...)

Arguments

mydata
A data frame.
mod
Name of a variable in mydata that represents modelled values.
obs
Name of a variable in mydata that represents measured values.
statistic
The statistic to be calculated. See details below for a description of each.
type
type determines how the data are split, i.e. conditioned, before the statistics are calculated. The default will produce statistics using the entire data set. type can be one of the built-in types as detailed in cutData e.g.
rank.name
Simple model ranking can be carried out if rank.name is supplied. rank.name will generally refer to a column representing a model name, which is to be ranked. The ranking is based on the COE performance, as that indicator is ar
...
Other arguments to be passed to cutData, e.g. hemisphere = "southern".

Value

  • Returns a data frame with model evaluation statistics.

Details

This function is under development and currently provides some common model evaluation statistics. These include (to be mathematically defined later):

  • $n$, the number of complete pairs of data.
  • $FAC2$, fraction of predictions within a factor of two.
  • $MB$, the mean bias.
  • $MGE$, the mean gross error.
  • $NMB$, the normalised mean bias.
  • $NMGE$, the normalised mean gross error.
  • $RMSE$, the root mean squared error.
  • $r$, the Pearson correlation coefficient. Note that an argument method can also be supplied, e.g. method = "spearman".
  • $COE$, the Coefficient of Efficiency based on Legates and McCabe (1999, 2012). There have been many suggestions for measuring model performance over the years, but the COE is a simple formulation which is easy to interpret.
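
The statistics listed above are usually written as follows. This is a sketch of the standard definitions, with $M_i$ the modelled value, $O_i$ the observed value, $\bar{M}$ and $\bar{O}$ their means, $\sigma_M$ and $\sigma_O$ their sample standard deviations, and $n$ the number of complete pairs. The symbols are chosen here for illustration; the openair manual remains the authoritative source for the exact formulas used by the function.

$$FAC2 = \text{fraction of pairs satisfying } 0.5 \le \frac{M_i}{O_i} \le 2$$

$$MB = \frac{1}{n}\sum_{i=1}^{n} (M_i - O_i)$$

$$MGE = \frac{1}{n}\sum_{i=1}^{n} |M_i - O_i|$$

$$NMB = \frac{\sum_{i=1}^{n} (M_i - O_i)}{\sum_{i=1}^{n} O_i}$$

$$NMGE = \frac{\sum_{i=1}^{n} |M_i - O_i|}{\sum_{i=1}^{n} O_i}$$

$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (M_i - O_i)^2}$$

$$r = \frac{1}{n-1}\sum_{i=1}^{n} \left(\frac{M_i - \bar{M}}{\sigma_M}\right)\left(\frac{O_i - \bar{O}}{\sigma_O}\right)$$

$$COE = 1 - \frac{\sum_{i=1}^{n} |M_i - O_i|}{\sum_{i=1}^{n} |O_i - \bar{O}|}$$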

A perfect model has a COE = 1. As noted by Legates and McCabe, although the COE has no lower bound, a value of COE = 0.0 has a fundamental meaning. It implies that the model is no better at predicting the observed values than the observed mean is. Therefore, since the model can explain no more of the variation in the observed values than can the observed mean, such a model can have no predictive advantage.

For negative values of COE, the model is less effective than the observed mean in predicting the variation in the observations.
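
The three cases described above (COE = 1, COE = 0 and COE < 0) can be reproduced with a small base R sketch. The helper `coe()` and the toy vectors are hypothetical and written only for illustration; they assume the usual Legates and McCabe formulation and are not the code used inside modStats.

```r
## Hypothetical helper, assuming
## COE = 1 - sum(|mod - obs|) / sum(|obs - mean(obs)|)
coe <- function(mod, obs) {
  1 - sum(abs(mod - obs)) / sum(abs(obs - mean(obs)))
}

## toy observations, invented for this example
obs <- c(10, 20, 30, 40, 50)

## a model that reproduces the observations exactly
coe_perfect <- coe(obs, obs)

## a model that always predicts the observed mean
coe_mean <- coe(rep(mean(obs), length(obs)), obs)

## a model that does worse than the observed mean
coe_poor <- coe(rev(obs), obs)

print(c(perfect = coe_perfect, mean = coe_mean, poor = coe_poor))
```

With these toy values the perfect model scores 1, the mean-only model scores 0, and the reversed series scores below 0.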

All statistics are based on complete pairs of mod and obs.
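
To make "complete pairs" concrete, the sketch below drops any pair where either value is missing and then computes the statistics by hand in base R. The vectors are invented, and the formulas assume the standard definitions rather than being taken from the modStats source.

```r
## toy data with missing values on either side (invented for illustration)
mod <- c(12, NA, 33, 35, 60, 8)
obs <- c(10, 20, 30, 40, NA, 20)

## keep only pairs where both mod and obs are present
ok <- complete.cases(mod, obs)
m <- mod[ok]
o <- obs[ok]

## statistics computed by hand, assuming the standard definitions
stats <- c(
  n    = length(o),
  FAC2 = mean(m / o >= 0.5 & m / o <= 2),
  MB   = mean(m - o),
  MGE  = mean(abs(m - o)),
  NMB  = sum(m - o) / sum(o),
  NMGE = sum(abs(m - o)) / sum(o),
  RMSE = sqrt(mean((m - o)^2)),
  r    = cor(m, o),
  COE  = 1 - sum(abs(m - o)) / sum(abs(o - mean(o)))
)

print(round(stats, 3))
```

Two of the six pairs contain an NA, so only four pairs contribute to every statistic.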

Conditioning is possible through setting type, which can be a vector e.g. type = c("weekday", "season").
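
Conceptually, conditioning means splitting the data into groups and calculating the statistics separately for each group. The base R sketch below illustrates the idea with a hypothetical data frame that already contains a `season` column; in openair the grouping variables are derived by cutData, so this is an analogy rather than the package's implementation.

```r
## hypothetical data with a ready-made grouping column
dat <- data.frame(
  season = c("winter", "winter", "winter", "summer", "summer", "summer"),
  obs    = c(10, 20, 30, 10, 20, 30),
  mod    = c(12, 18, 33, 20, 40, 60)
)

## split by group, compute a few statistics per group, then recombine
by_season <- do.call(rbind, lapply(split(dat, dat$season), function(d) {
  data.frame(
    season = d$season[1],
    n      = nrow(d),
    MB     = mean(d$mod - d$obs),
    RMSE   = sqrt(mean((d$mod - d$obs)^2))
  )
}))

print(by_season)
```

The result has one row per group, which mirrors the shape of the data frame returned when type is set.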

Details of the formulas are given in the openair manual.
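
The ranking carried out when rank.name is supplied is based on COE. A minimal base R sketch of that idea follows; the model names, predictions and the `coe()` helper are all hypothetical, and the actual ordering logic inside modStats may differ in detail.

```r
## Hypothetical helper, assuming the standard COE definition
coe <- function(mod, obs) {
  1 - sum(abs(mod - obs)) / sum(abs(obs - mean(obs)))
}

## invented observations and three invented sets of model predictions
obs <- c(10, 20, 30, 40, 50)
preds <- list(
  model_a = c(12, 19, 33, 38, 52),
  model_b = c(20, 20, 20, 20, 20),
  model_c = c(11, 21, 29, 41, 49)
)

## COE for each model, then order from best (highest COE) to worst
scores  <- sapply(preds, coe, obs = obs)
ranking <- names(sort(scores, decreasing = TRUE))

print(round(scores, 3))
print(ranking)
```

Here the model with the smallest total absolute error ranks first, and the constant prediction ranks last with a negative COE.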

References

Legates DR, McCabe GJ. (1999). Evaluating the use of goodness-of-fit measures in hydrologic and hydroclimatic model validation. Water Resources Research 35(1): 233-241.

Legates DR, McCabe GJ. (2012). A refined index of model performance: a rejoinder. International Journal of Climatology.

Examples

## load openair; mydata is the example data set shipped with the package
library(openair)

## the example below is somewhat artificial --- assuming the observed
## values are given by NOx and the predicted values by NO2.

modStats(mydata, mod = "no2", obs = "nox")

## evaluation stats by season

modStats(mydata, mod = "no2", obs = "nox", type = "season")
