openair (version 0.4-7)

conditionalQuantile: Conditional quantile estimates for model evaluation

Description

Function to calculate conditional quantiles with flexible conditioning. The function is for use in model evaluation and more generally to help better understand forecast predictions and how well they agree with observations.

Usage

conditionalQuantile(mydata, obs = "obs", mod = "mod",
                    type = "default",
                    bins = 31,
                    min.bin = c(10, 20),
                    xlab = "predicted value",
                    ylab = "observed value",
                    col = brewer.pal(5, "YlOrRd"),
                    key.columns = 2,
                    key.position = "bottom",
                    auto.text = TRUE, ...)

Arguments

mydata
A data frame containing the fields obs and mod, representing observed and modelled values.
obs
The name of the observations in mydata.
mod
The name of the predictions (modelled values) in mydata.
type
type determines how the data are split, i.e. conditioned, and then plotted. The default will produce a single plot using the entire data set. type can be one of the built-in types detailed in cutData, e.g. "season".
bins
Number of bins to be used in calculating the different quantile levels.
min.bin
The minimum number of points required for the estimates of the 25/75th and 10/90th percentiles.
xlab
label for the x-axis.
ylab
label for the y-axis.
col
Colours to be used for plotting the uncertainty bands and median line. Must be of length 5 or more.
key.columns
Number of columns to be used in the key.
key.position
Location of the key e.g. "top", "bottom", "right", "left". See lattice xyplot for more details.
auto.text
Either TRUE (default) or FALSE. If TRUE, titles, axis labels etc. will automatically try to format pollutant names and units properly, e.g. by subscripting the '2' in NO2.
...
Other graphical parameters passed on to lattice::xyplot and cutData. For example, cutData accepts the option hemisphere = "southern".

Details

The conditionalQuantile function is a useful approach for comparing continuous observations and predictions, i.e. forecasts. The function requires a data frame consisting of a column of observations and a column of predictions. The observations are split up into bins according to the values of the predictions. The median of the observations in each bin is plotted together with the 25/75th and 10/90th quantile values and a line showing a "perfect" model. Also shown is a histogram of predicted values.

Far more insight into model performance can be gained through conditioning using type. For example, type = "season" will plot conditional quantiles for each season. type can also be a factor or character field, e.g. representing different models used. See Wilks (2005) for more details and the examples below.
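The binning step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not openair's internal code: the data are synthetic, the bins are assumed to be equal-width intervals of the predictions, and min.bin is assumed to map its first value to the 25/75th percentiles and its second to the 10/90th percentiles. All object names (cond_q, result, etc.) are made up for the example.

```r
# Illustrative sketch only: NOT openair's implementation.
# Synthetic data; 'obs' and 'mod' are made-up vectors for the demonstration.
set.seed(42)
n   <- 5000
mod <- runif(n, min = 0, max = 100)           # hypothetical predictions
obs <- mod + rnorm(n, sd = 5 + 0.1 * mod)     # hypothetical observations, noisier at high values

# Assumption: equal-width bins over the range of the predictions
n_bins <- 31
breaks <- seq(min(mod), max(mod), length.out = n_bins + 1)
bin    <- cut(mod, breaks = breaks, include.lowest = TRUE)

# Quantiles of the observations within each prediction bin
probs  <- c(0.10, 0.25, 0.50, 0.75, 0.90)
cond_q <- t(sapply(split(obs, bin), quantile, probs = probs, names = FALSE))
colnames(cond_q) <- paste0("q", probs * 100)

# Assumption: blank out the bands where a bin holds too few points,
# in the spirit of the min.bin argument
counts  <- as.vector(table(bin))
min_bin <- c(10, 20)
cond_q[counts < min_bin[1], c("q25", "q75")] <- NA
cond_q[counts < min_bin[2], c("q10", "q90")] <- NA

# Bin mid-points give the x positions of the quantile lines
mid    <- head(breaks, -1) + diff(breaks) / 2
result <- data.frame(pred = mid, n = counts, cond_q)
head(result)
```

For a "perfect" model the q50 column would track pred closely; in this synthetic case the spread between q10 and q90 grows with pred because the noise was constructed to do so.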

References

Wilks, D. S., 2005. Statistical Methods in the Atmospheric Sciences, Second Edition. International Geophysics Series, Volume 91. Academic Press.

See Also

See modStats for model evaluation statistics and the package verification for comprehensive functions for forecast verification.

Examples

# load the package and its example data
library(openair)
data(mydata)

## make some dummy prediction data based on 'nox'
mydata$mod <- mydata$nox * 1.1 + mydata$nox * runif(nrow(mydata))

# basic conditional quantile plot
## A "perfect" model is shown by the blue line
## Predictions tend to be increasingly positively biased at high NOx,
## shown by the departure of the median line from the blue one.
## The widening uncertainty bands with increasing NOx show that
## hourly predictions are worse for higher NOx concentrations.
## Also, the red (median) line extends beyond the data (blue line),
## which shows that in this case some predictions are much higher than
## the corresponding measurements. Note that the uncertainty bands
## do not extend as far as the median line because there are insufficient
## data to calculate them.
conditionalQuantile(mydata, obs = "nox", mod = "mod")

## can split by season to show seasonal performance (not very
## enlightening in this case - try some real data and it will be!)

conditionalQuantile(mydata, obs = "nox", mod = "mod", type = "season")
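
As noted in Details, type can also be a factor or character field, for example one identifying different models. The sketch below builds a small synthetic data set in long format for that purpose. The names model_a, model_b and the column model are hypothetical, and the date column is included only on the assumption that the input should resemble typical openair data. The call is guarded so that the data preparation runs even if openair is not installed.

```r
# Synthetic example; 'model_a', 'model_b' and the column 'model' are hypothetical names.
set.seed(1)
n    <- 2000
date <- seq(as.POSIXct("2020-01-01", tz = "GMT"), by = "hour", length.out = n)
obs  <- rgamma(n, shape = 2, scale = 50)

# Stack the predictions of two made-up models into one long data frame
long <- rbind(
  data.frame(date = date, obs = obs,
             mod = obs * 1.1 + rnorm(n, sd = 20), model = "model_a"),
  data.frame(date = date, obs = obs,
             mod = obs * 0.8 + rnorm(n, sd = 40), model = "model_b")
)

# One panel per model, drawn only if openair is available
if (requireNamespace("openair", quietly = TRUE)) {
  openair::conditionalQuantile(long, obs = "obs", mod = "mod", type = "model")
}
```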