
scatterPlot(mydata, x = "nox", y = "no2", z = NA,
method = "scatter", group = NA, avg.time = "default",
data.thresh = 0, statistic = "mean", percentile = NA,
type = "default", smooth = FALSE, spline = FALSE,
linear = FALSE, ci = TRUE, mod.line = FALSE,
cols = "hue", plot.type = "p", key = TRUE,
key.title = group, key.columns = 1,
key.position = "right", strip = TRUE, log.x = FALSE,
log.y = FALSE, x.inc = 10, y.inc = 10,
y.relation = "same", x.relation = "same", ref.x = NULL,
ref.y = NULL, k = 100, trans = TRUE, map = FALSE,
auto.text = TRUE, ...)
x: name of the x-variable to plot. x can be one of the openair
built-in types such as "year" or "season".
z: name of the third variable, used with method = "scatter" or
method = "level". Note that for method = "scatter" points will be
coloured according to a continuous colour scale, whereas for
method = "level" the binned or smoothed surface is coloured.
method: one of "scatter" (conventional scatter plot), "hexbin"
(hexagonal binning using the hexbin package), "level" (a binned or
smooth surface plot) or "density" (2D kernel density estimate).
data.thresh: the data capture threshold (%) used when aggregating
the data with avg.time. A value of zero means that all available
data will be used in a particular period regardless of the number
of values available. Conversely, a value of 100 means that all data
must be present for the average to be calculated.
percentile: the percentile level used when statistic = "percentile"
and when aggregating the data with avg.time. The default is 95. Not
used if avg.time = "default".
type: determines how the data are split, i.e. conditioned, and then
plotted. The default will produce a single plot using the entire
data. type can be one of the built-in types as detailed in cutData,
e.g. "season", "year" or "weekday".
smooth: if TRUE, a smooth line is fitted to the data, optionally
with 95% confidence intervals shown. For method = "level" a smooth
surface will be fitted to binned data.
spline: if TRUE, a spline is fitted through the data. This is
particularly useful when there are fewer data points or when a
connection line between a sequence of points is required.
linear: if TRUE, a linear fit is shown, optionally with 95%
confidence intervals. The equation of the line and the R2 value are
also shown.
mod.line: if TRUE, three lines are added to the scatter plot to
help inform model evaluation. The 1:1 line is solid and the 1:0.5
and 1:2 lines are dashed. Together these lines help show how close
a group of points is to a 1:1 relationship.
cols: colours used for plotting, e.g. cols = "black".
plot.type: the lattice plot type. Can be "p" (points, the default),
"l" (lines) or "b" (lines and points).
key: whether a key is drawn. The default is TRUE.
key.columns: number of columns used in the key; columns can be set
to be less than the number of pollutants.
key.position: location of the key. Can include "top", "right",
"bottom" and "left".
strip: whether strips are drawn on each panel. The default is TRUE.
log.x: whether the x-axis uses a log scale. The default is FALSE.
If TRUE a well-formatted log10 scale is used. This can be useful
for checking linearity once logged.
log.y: whether the y-axis uses a log scale. The default is FALSE.
If TRUE a well-formatted log10 scale is used. This can be useful
for checking linearity once logged.
x.inc: the x-interval used for binning data when method = "level".
y.inc: the y-interval used for binning data when method = "level".
k: smoothing parameter passed to gam for fitting a smooth surface
when method = "level".
trans: trans is used when continuous = TRUE. Often for a good
colour scale with skewed data it is a good idea to "compress" the
scale. If TRUE a square root transform is used, if FALSE a linear
scale is used.
auto.text: either TRUE (default) or FALSE. If TRUE titles and axis
labels will automatically try to format pollutant names and units
properly, e.g. by subscripting the "2" in NO2.
...: other arguments are passed on to cutData and an appropriate
lattice plot function (xyplot, levelplot or hexbinplot depending on
method).
scatterPlot
also returns an object of class "openair". The object includes
three main components: call, the command used to generate the plot;
data, the data frame of summarised information used to make the
plot; and plot, the plot itself. If retained, e.g. using
output <- scatterPlot(mydata, "nox", "no2"), this output can be used
to recover the data, reproduce or rework the original plot or
undertake further analysis.
An openair output can be manipulated using a number of generic
operations, including print, plot and summary. See openair.generics
for further details.
scatterPlot is the basic function for plotting scatter plots in
flexible ways in openair. It is flexible enough to consider lots of
conditioning variables and takes care of fitting smooth or linear
relationships to the data.
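As a minimal sketch of working with the returned object, the example below keeps the output and looks at its components directly. It assumes the openair package is installed and that the components are reachable with `$`, as for an ordinary list; `avg.time = "day"` is used only to keep the example quick.

```r
library(openair)
data(mydata)

# Keep the returned object instead of only drawing the plot.
output <- scatterPlot(mydata, x = "nox", y = "no2", avg.time = "day")

class(output)       # the object class
head(output$data)   # summarised data used to make the plot
output$call         # the command used to generate the plot
print(output$plot)  # redraw the stored plot
```

Retaining the object this way makes it possible to rework the plot or analyse the summarised data without recomputing the averages.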
There are four main ways of plotting the relationship between two
variables, which are set using the method option. The default
"scatter" will plot a conventional scatter plot. In cases where
there are lots of data and over-plotting becomes a problem, then
method = "hexbin" or method = "density" can be useful. The former
requires the hexbin package to be installed.
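Because "hexbin" depends on an extra package, one possible pattern is to fall back to "density" when hexbin is not available. This is only a sketch; the variable name `binning_method` is made up for illustration.

```r
library(openair)
data(mydata)

# "hexbin" needs the hexbin package; fall back to "density" if it is missing.
if (requireNamespace("hexbin", quietly = TRUE)) {
  library(hexbin)
  binning_method <- "hexbin"
} else {
  binning_method <- "density"
}

# Hourly data are heavily over-plotted, so use the binned/density view.
scatterPlot(mydata, x = "nox", y = "no2", method = binning_method)
```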
There is also a method = "level" which will bin the x and y data
according to the intervals set for x.inc and y.inc and colour the
bins according to levels of a third variable, z. Sometimes however,
a far better understanding of the relationship between three
variables (x, y and z) is gained by fitting a smooth surface through
the data. See examples below.
A smooth fit is shown if smooth = TRUE, which can help show the
overall form of the data, e.g. whether the relationship appears to
be linear or not. Also, a linear fit can be shown using
linear = TRUE as an option. The user has fine control over the
choice of colours and symbol type used.
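A short sketch of comparing the two kinds of fit on the same data, with the colour and symbol set explicitly. It assumes that `pch` is passed through to the underlying lattice function, as in the examples further down; the choice of 2004 daily means is only to keep the plot small.

```r
library(openair)
data(mydata)

# One year of data, averaged to daily means inside scatterPlot.
dat2004 <- selectByDate(mydata, year = 2004)

# Smooth fit: helps judge whether the relationship looks linear.
scatterPlot(dat2004, x = "nox", y = "no2", avg.time = "day",
            smooth = TRUE, cols = "black", pch = 16)

# Linear fit on the same data for comparison.
scatterPlot(dat2004, x = "nox", y = "no2", avg.time = "day",
            linear = TRUE, cols = "black", pch = 16)
```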
Another way of reducing the number of points used in the plots,
which can sometimes be useful, is to aggregate the data. For
example, hourly data can be aggregated to daily data. See timePlot
for examples here.
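The aggregation can be done either inside scatterPlot through avg.time or beforehand with timeAverage. The sketch below shows both routes; the object name `daily` is made up for illustration, and the two plots are expected to be similar rather than guaranteed identical.

```r
library(openair)
data(mydata)

# Route 1: let scatterPlot aggregate the hourly data to daily means.
scatterPlot(mydata, x = "nox", y = "no2", avg.time = "day")

# Route 2: aggregate first with timeAverage, then plot the result.
daily <- timeAverage(mydata, avg.time = "day")
scatterPlot(daily, x = "nox", y = "no2")

nrow(mydata)  # number of hourly records
nrow(daily)   # number of daily records, far fewer
```

Aggregating first is convenient when the same averaged data feed several plots.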
By default plots are shown with a colour key at the bottom and, in
the case of conditioning, strips on the top of each plot. Sometimes
this may be overkill and the user can opt to remove the key and/or
the strip by setting key and/or strip to FALSE. One reason to do
this is to maximise the plotting area and therefore the information
shown.
See linearRelation, timePlot and timeAverage for details on
selecting averaging times and other statistics in a flexible way.
# load openair data if not loaded already
library(openair)
data(mydata)
# basic use, single pollutant
scatterPlot(mydata, x = "nox", y = "no2")
# scatterPlot by year
scatterPlot(mydata, x = "nox", y = "no2", type = "year")
# scatterPlot by day of the week, removing key at bottom
scatterPlot(mydata, x = "nox", y = "no2", type = "weekday", key = FALSE)
# example of the use of continuous where colour is used to show
# different levels of a third (numeric) variable
# plot daily averages and choose a filled plot symbol (pch = 16)
# select only 2004
dat2004 <- selectByDate(mydata, year = 2004)
scatterPlot(dat2004, x = "nox", y = "no2", z = "co", avg.time = "day", pch = 16)
# show linear fit, by year
scatterPlot(mydata, x = "nox", y = "no2", type = "year", smooth = FALSE,
            linear = TRUE)
# do the same, but for daily means...
scatterPlot(mydata, x = "nox", y = "no2", type = "year", smooth = FALSE,
            linear = TRUE, avg.time = "day")
# log scales
scatterPlot(mydata, x = "nox", y = "no2", type = "year", smooth = FALSE,
            linear = TRUE, avg.time = "day", log.x = TRUE, log.y = TRUE)
# also works with the x-axis in date format (alternative to timePlot)
scatterPlot(mydata, x = "date", y = "no2", avg.time = "month",
key = FALSE)
## multiple types and grouping variable and continuous colour scale
scatterPlot(mydata, x = "nox", y = "no2", z = "o3", type = c("season", "weekend"))
# use hexagonal binning
library(hexbin)
# basic use, single pollutant
scatterPlot(mydata, x = "nox", y = "no2", method = "hexbin")
# scatterPlot by year
scatterPlot(mydata, x = "nox", y = "no2", type = "year", method = "hexbin")
## bin data and plot it - can see how for high NO2, O3 is also high
## Not run:
scatterPlot(mydata, x = "nox", y = "no2", z = "o3", method = "level",
            x.inc = 10, y.inc = 2)
## End(Not run)
## fit surface for clearer view of relationship - clear effect of
## increased O3
## Not run:
scatterPlot(mydata, x = "nox", y = "no2", z = "o3", method = "level",
            x.inc = 10, y.inc = 2, smooth = TRUE)
## End(Not run)