Usage
scatterPlot(mydata,
x = "nox",
y = "no2",
method = "scatter",
group = NULL,
avg.time = "default",
data.thresh = 0,
statistic = "mean",
percentile = NA,
type = "default",
layout = NULL,
smooth = TRUE,
spline = FALSE,
linear = FALSE,
ci = TRUE,
mod.line = FALSE,
cols = "hue",
main = "",
ylab = y,
xlab = x,
pch = 1,
lwd = 1,
key = TRUE,
key.title = group,
key.columns = 1,
strip = TRUE,
log.x = FALSE,
log.y = FALSE,
y.relation = "same",
x.relation = "same",
nbin = 256,
continuous = FALSE,
trans = TRUE,
auto.text = TRUE,
...)
Arguments
mydata
A data frame containing at least two numeric variables to plot.
x
Name of the x-variable to plot. Note that x can be a date
field or a factor. For example, x can be one of the openair
built-in types such as "year" or "season". If x is a factor a b
y
Name of the numeric y-variable to plot.
method
Methods include "scatter" (conventional scatter plot), "hexbin"
(hexagonal binning using the hexbin package) and "density"
(2D kernel density estimates).
group
The grouping variable to use, if any. Setting this to a
variable in the data frame has the effect of plotting several series
in the same panel using different symbols/colours etc. If set to a
variable that is a character or factor, those categori
avg.time
This defines the time period to average to. Can be "sec",
"min", "hour", "day", "DSTday", "week", "month", "quarter" or
"year". For much increased flexibility a number can precede these
options followed by a space. For example, a timeAverage of 2
data.thresh
The data capture threshold to use (%) when aggregating the data
using avg.time. A value of zero means that all available data will
be used in a particular period regardless of the number of values
available. Conversely, a value of
statistic
The statistic to apply when aggregating the data;
default is the mean. Can be one of "mean", "max", "min", "median",
"frequency", "sd", "percentile". Note that "sd" is the standard
deviation and "frequency" is the number (frequency) of valid
r
percentile
The percentile level in % used when statistic = "percentile" and
when aggregating the data with avg.time. The default is 95. Not
used if avg.time = "default".
type
type determines how the data are split, i.e. conditioned, and then
plotted. The default will produce a single plot using the entire
data. type can be one of the built-in types as detailed in cutData
e.g. "season"
layout
Determines how the panels are laid out. By default,
plots will be shown in one column with the number of rows equal to the
number of pollutants, for example. If the user requires 2 columns and
two rows, layout should be set to layout
smooth
A smooth line is fitted to the data if TRUE; optionally with 95%
confidence intervals shown.
spline
A smooth spline is fitted to the data if TRUE. This is particularly
useful when there are fewer data points or when a connection line
between a sequence of points is required.
linear
A linear model is fitted to the data if TRUE; optionally with 95%
confidence intervals shown. The equation of the line and the R2
value are also shown.
ci
Should the confidence intervals for the smooth/linear fit be
shown?
mod.line
If TRUE three lines are added to the scatter plot to help inform
model evaluation. The 1:1 line is solid and the 1:0.5 and 1:2 lines
are dashed. Together these lines help show how close a group of
points are to a 1:1 relationship and al
cols
Colours to be used for plotting. Options include "default",
"increment", "heat", "spectral", "hue", "brewer1" and user
defined (see manual for more details). The same line colour can be
set for all pollutants e.g. cols = "black"
main
The plot title; default is no title.
ylab
Name of y-axis variable. By default will use the name of y.
xlab
Name of x-axis variable. By default will use the name of x.
pch
The symbol type used for plotting. Default is to provide different
symbol types for different pollutants. If one requires a single
symbol for all pollutants, then set pch = 1, for example.
key
Should a key be drawn? The default is TRUE.
key.title
The title of the key (if used).
key.columns
Number of columns to be used in the key. With many pollutants a
single column can make the key too wide. The user can thus choose
to use several columns by setting columns to be less than the
number of pollutants.
strip
Should a strip be drawn? The default is TRUE.
log.x
Should the x-axis appear on a log scale? The default is FALSE. If
TRUE a well-formatted log10 scale is used. This can be useful for
checking linearity once logged.
log.y
Should the y-axis appear on a log scale? The default is FALSE. If
TRUE a well-formatted log10 scale is used. This can be useful for
checking linearity once logged.
y.relation
This determines how the y-axis scale is plotted. "same" ensures all
panels use the same scale and "free" will use panel-specific scales.
The latter is a useful setting when plotting data with very
different values.
x.relation
This determines how the x-axis scale is plotted. "same" ensures all
panels use the same scale and "free" will use panel-specific scales.
The latter is a useful setting when plotting data with very
different values.
nbin
Number of bins used for kernel density output using method
"density".
continuous
When this option is TRUE a plot of x vs. y will be made,
colour-coded by levels of group, provided group is a numeric
variable. A continuous separate colour scale is shown. If
continuous = FALSE an
trans
trans is used when continuous = TRUE. Often for a good colour scale
with skewed data it is a good idea to "compress" the scale. If TRUE
a square root transform is used, if FALSE a linear scale i
auto.text
Either TRUE (default) or FALSE. If TRUE titles and axis labels will
automatically try and format pollutant names and units properly
e.g. by subscripting the '2' in NO2.
...
Other graphical parameters.
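Examples
The calls below are a minimal sketch of how some of the arguments
described above combine; they are not taken from the package's own
examples. They assume that the openair package is installed and that
its bundled example data frame mydata (with date, nox and no2 columns)
is available. The name dat2003 is introduced here for illustration
only.

```r
# Sketch only: assumes openair is installed and that its example data
# frame `mydata` (hourly data with date, nox and no2) is available.
library(openair)

# Restrict to a single year so the plots are quick to draw
# (`dat2003` is an illustrative name, not part of the package)
dat2003 <- selectByDate(mydata, year = 2003)

# Conventional scatter plot of NO2 against NOx with a linear fit
# instead of the smooth line
scatterPlot(dat2003, x = "nox", y = "no2",
            smooth = FALSE, linear = TRUE)

# Daily means, keeping only days with at least 75 % data capture,
# conditioned by season so that each season gets its own panel
scatterPlot(dat2003, x = "nox", y = "no2",
            avg.time = "day", data.thresh = 75,
            type = "season")
```

Passing method = "hexbin" in place of the default "scatter" should
give the hexagonal-binning version, provided the hexbin package is
also installed.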