conditionalEval extends conditionalQuantile
by also considering how other variables vary over the
same intervals. Conditional quantiles are very useful on
their own for model evaluation, but provide no direct
information on how other variables change at the same
time. For example, a conditional quantile plot of ozone
concentrations may show that low concentrations of ozone
tend to be under-predicted, but the cause of the
under-prediction can be difficult to determine. However,
by considering how well the model predicts other
variables over the same intervals, more insight can be
gained into the underlying reasons why model performance
is poor.

conditionalEval(mydata, obs = "obs", mod = "mod",
  var.obs = "var.obs", var.mod = "var.mod",
  type = "default", bins = 31, statistic = "MB",
  xlab = "predicted value", ylab = "statistic",
  col = brewer.pal(5, "YlOrRd"), col.var = "Set1",
  var.names = NULL, auto.text = TRUE, ...)
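
mydata: A data frame containing the variables obs and mod,
  representing observed and modelled values.
obs: The name of the observations in mydata.
mod: The name of the predictions (modelled values) in mydata.
var.obs: Other variable observations for which statistics
  should be calculated. Can be more than length one, e.g.
  var.obs = c("nox.obs", "ws.obs").
var.mod: Other variable predictions for which statistics
  should be calculated. Can be more than length one, e.g.
  var.mod = c("nox.mod", "ws.mod").
type: type determines how the data are split, i.e.
  conditioned, and then plotted. The default will produce a
  single plot using the entire data. type can be one of the
  built-in types as detailed in cutData, e.g. "season".
bins: Number of bins used in conditionalQuantile.
statistic: Statistic(s) to be plotted; the model evaluation
  statistics are described in modStats. When these
  statistics are chosen, they are calculated from var.obs
  and var.mod.
xlab: Label for the x-axis, by default "predicted value".
ylab: Label for the y-axis, given here as "observed value"
  (the usage above shows the default "statistic").
col: Colours to be used for plotting the uncertainty bands
  and median line. Must be of length 5 or more.
col.var: Colours for the additional variables to be
  compared. See openColours for more details.
var.names: Variable names to be shown on the plot for
  var.obs and var.mod.
auto.text: Either TRUE (default) or FALSE. If TRUE, titles
  and axis labels etc. will automatically try to format
  pollutant names and units properly, e.g. by subscripting
  the `2' in NO2.
...: Other graphical parameters passed on to
  conditionalQuantile and cutData. For example,
  conditionalQuantile passes the option
  hemisphere = "southern" on to cutData to provide southern
  (rather than default northern) hemisphere handling of
  type = "season".

The conditionalEval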
function provides information on how other variables vary
across the same intervals as shown on the conditional
quantile plot. There are two types of variable that can be
considered by setting the value of statistic.

First, statistic can be another variable in the data frame.
In this case the plot will show the different proportions of
statistic across the range of predictions. For example,
statistic = "season" will show, for each interval of mod,
the proportion of predictions that were spring, summer,
autumn or winter. This is useful because if model
performance is worse, for example, at high concentrations
of mod, then knowing that these tend to occur during a
particular season can be very helpful when trying to
understand why a model fails. See cutData for more details
on the types of variable that can be statistic. Another
example would be statistic = "ws" (if wind speed were
available in the data frame), which would then split wind
speed into four quantiles and plot the proportions of each.
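
The proportions described above can be illustrated with a minimal base R sketch. It does not call openair and is not the package's implementation; the data are synthetic, the column names mod and ws are assumptions, and five equal-width intervals are used only to keep the output short.

```r
# Synthetic data; the column names (mod, ws) are illustrative assumptions.
set.seed(42)
n <- 1000
dat <- data.frame(
  mod = runif(n, min = 0, max = 100),  # predictions of the main variable
  ws  = rexp(n, rate = 0.25)           # wind speed
)

# Split the predictions into equal-width intervals (5 here, for readability).
dat$mod_bin <- cut(dat$mod, breaks = 5, include.lowest = TRUE)

# Split wind speed into four quantile classes.
ws_breaks <- quantile(dat$ws, probs = seq(0, 1, by = 0.25))
dat$ws_class <- cut(dat$ws, breaks = ws_breaks, include.lowest = TRUE)

# Proportion of each wind speed class within each prediction interval;
# margin = 1 makes every row (interval) sum to 1.
props <- prop.table(table(dat$mod_bin, dat$ws_class), margin = 1)
print(round(props, 2))
```

Each row of props corresponds to one interval of mod, which is the quantity a stacked bar per interval would display.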

Second, conditionalEval can simultaneously plot the model
performance of other observed/predicted variable pairs
according to different model evaluation statistics. These
statistics derive from the modStats function and include
"MB", "NMB", "r", "IOA", "MGE", "NMGE", "RMSE" and "FAC2".
More than one statistic can be supplied, e.g.
statistic = c("NMB", "IOA"). Bootstrap samples are taken
from the corresponding values of other variables to be
plotted and their statistics with 95% confidence intervals
calculated. In this case, the model performance of other
variables is shown across the same intervals of mod, rather
than just the values of single variables. In this second
case the model would need to provide observed/predicted
pairs of other variables.

For example, a model may provide predictions of NOx and
wind speed (for which there are also observations
available). The conditionalEval function will show how well
these other variables are predicted for the same intervals
of the main variable assessed in the conditional quantile
plot, e.g. ozone. In this case, values are supplied to
var.obs (observed values for other variables) and var.mod
(modelled values for other variables). For example, to
consider how well the model predicts NOx and wind speed,
var.obs = c("nox.obs", "ws.obs") and
var.mod = c("nox.mod", "ws.mod") would be supplied
(assuming nox.obs, nox.mod, ws.obs and ws.mod are present
in the data frame). The analysis could show, for example,
that when ozone concentrations are under-predicted, the
model also over-predicts concentrations of NOx at the same
time, or under-predicts wind speeds. Such information can
thus help identify the underlying causes of poor model
performance. For example, an under-prediction in wind speed
could result in higher surface NOx concentrations and lower
ozone concentrations. Similarly, if wind speed predictions
were good and NOx was over-predicted, it might suggest an
over-estimate of NOx emissions. One or more additional
variables can be plotted.
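
The idea of evaluating another variable pair within each interval of mod can be sketched in base R as follows. This is an illustration under stated assumptions, not openair's code: the data are synthetic, the helper names mean_bias and boot_ci are hypothetical, and a simple percentile bootstrap stands in for whatever interval method the package uses internally. Mean bias is taken as the mean of modelled minus observed values.

```r
# Synthetic data; nox.mod is given a deliberate positive bias.
set.seed(1)
n <- 2000
dat <- data.frame(
  mod     = runif(n, 0, 100),       # predictions of the main variable
  nox.obs = rlnorm(n, meanlog = 3)  # observed NOx
)
dat$nox.mod <- dat$nox.obs * 1.2 + rnorm(n, sd = 5)

# Hypothetical helper: mean bias (modelled minus observed).
mean_bias <- function(mod, obs) mean(mod - obs)

# Hypothetical helper: bootstrap a paired statistic and return the
# estimate with a percentile 95% interval.
boot_ci <- function(mod, obs, stat, n_boot = 200) {
  reps <- replicate(n_boot, {
    i <- sample.int(length(mod), replace = TRUE)  # resample pairs
    stat(mod[i], obs[i])
  })
  c(estimate = stat(mod, obs),
    lower = unname(quantile(reps, 0.025)),
    upper = unname(quantile(reps, 0.975)))
}

# Evaluate NOx within each interval of the main prediction.
dat$mod_bin <- cut(dat$mod, breaks = 5, include.lowest = TRUE)
res <- t(sapply(split(dat, dat$mod_bin), function(d) {
  boot_ci(d$nox.mod, d$nox.obs, mean_bias)
}))
print(round(res, 2))
```

Each row of res gives the NOx mean bias and its bootstrap interval for one interval of mod; a consistently positive estimate would indicate over-prediction of NOx across the range of the main prediction.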

A special case is statistic = "cluster". In this case a
data frame is provided that contains the clusters
calculated by trajCluster and importTraj. Alternatively,
users could supply their own pre-calculated clusters. These
calculations can be very useful in showing whether certain
back trajectory clusters are associated with poor (or good)
model performance. Note that in the case of
statistic = "cluster" there will be fewer data points used
in the analysis compared with the ordinary statistics above
because the trajectories are only available every three
hours. Also note that statistic = "cluster" cannot be used
together with the ordinary model evaluation statistics such
as MB. The output will be a bar chart showing the
proportion of each interval of mod by cluster number.

Far more insight can be gained into model performance
through conditioning using type. For example,
type = "season" will plot conditional quantiles and the
associated model performance statistics of other variables
by each season. type can also be a factor or character
field, e.g. representing different models used.
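
A call combining additional variable pairs with type = "season" might look like the sketch below. The data frame is entirely synthetic and its column names simply follow the nox.obs / ws.obs naming used in the text; the call is guarded because it assumes an installed openair version that exports conditionalEval with the arguments shown in the usage above. No particular output is implied.

```r
# Synthetic hourly data for one year; all values are made up.
set.seed(7)
n <- 24 * 365
dat <- data.frame(
  date = seq(as.POSIXct("2020-01-01", tz = "GMT"),
             by = "hour", length.out = n)
)
dat$obs     <- rgamma(n, shape = 4, scale = 10)                  # main variable, observed
dat$mod     <- pmax(0.8 * dat$obs + rnorm(n, mean = 5, sd = 8), 0)  # main variable, modelled
dat$nox.obs <- rlnorm(n, meanlog = 3, sdlog = 0.5)
dat$nox.mod <- dat$nox.obs * 1.2 + rnorm(n, sd = 5)
dat$ws.obs  <- rweibull(n, shape = 2, scale = 5)
dat$ws.mod  <- dat$ws.obs * 0.9 + rnorm(n, sd = 0.5)

# Only attempt the plot if openair is installed and exports the function.
has_fun <- requireNamespace("openair", quietly = TRUE) &&
  "conditionalEval" %in% getNamespaceExports("openair")

if (has_fun) {
  openair::conditionalEval(
    dat, obs = "obs", mod = "mod",
    var.obs = c("nox.obs", "ws.obs"),
    var.mod = c("nox.mod", "ws.mod"),
    statistic = "NMB",
    type = "season"   # one panel set per season; needs the date column
  )
}
```

A date column is included because season is a date-based type handled by cutData.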

See Wilks (2005) for more details of conditional quantile
plots.

See conditionalQuantile for information on conditional
quantiles, modStats for model evaluation statistics and the
package verification for comprehensive functions for
forecast verification.

## Examples to follow, or will be shown in the openair manual