timeAverage(mydata, avg.time = "day", data.thresh = 0, statistic = "mean", type = "default", percentile = NA, start.date = NA, end.date = NA, interval = NA, vector.ws = FALSE, fill = FALSE, ...)
mydata: A data frame containing a date field. Can be class POSIXct
or Date.
avg.time: The time period to average to. A number can precede the
time unit, separated by a space; for example, 2-month averages are
requested with avg.time = "2 month". In addition, avg.time can equal
"season", in which case 3-month seasonal values are calculated with
spring defined as March, April, May and so on.
Note that avg.time can be less than the time interval of the original
series, in which case the series is expanded to the new time
interval. This is useful, for example, for calculating a 15-minute
time series from an hourly one where an hourly value is repeated for
each new 15-minute period. When expanding data in this way the time
interval of the original series must be an exact multiple of
avg.time, e.g. hour to 10 minutes, day to hour. Also, the input time
series must have consistent time gaps between successive intervals so
that timeAverage can work out how much padding to apply. To pad out
data in this way choose fill = TRUE.
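The expansion of an hourly series to 15-minute values described above can be illustrated in base R. This is only a sketch of the idea and not the code timeAverage itself uses; the hourly data frame, its no2 column and the values are invented for illustration.

```r
# Hypothetical hourly series (values invented for illustration)
hourly <- data.frame(
  date = as.POSIXct("2024-01-01 00:00", tz = "UTC") + 3600 * 0:2,
  no2  = c(10, 20, 30)
)

# Target 15-minute time stamps covering each original hour in full
new_dates <- seq(
  from = min(hourly$date),
  to   = max(hourly$date) + 3600 - 900,
  by   = "15 min"
)

# Without filling: only time stamps that coincide with an original
# row carry a value, the rest are NA
expanded <- data.frame(
  date = new_dates,
  no2  = hourly$no2[match(as.numeric(new_dates), as.numeric(hourly$date))]
)

# With filling: each hourly value is repeated across its four
# 15-minute slots
idx <- findInterval(as.numeric(new_dates), as.numeric(hourly$date))
filled <- data.frame(date = new_dates, no2 = hourly$no2[idx])
```

Here expanded corresponds to the padded series and filled to the behaviour described for fill = TRUE. The sketch relies on the hourly interval being an exact multiple of 15 minutes, which is the condition stated above.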
data.thresh: The data capture threshold to use (%). Periods that do
not meet the threshold are recorded as NA. See also interval,
start.date and end.date to see whether it is advisable to set these
other options.
statistic: The statistic to apply when aggregating the data; the
default is "mean". Not used if avg.time = "default".
type: type allows timeAverage to be applied to cases where there are
groups of data that need to be split and the function applied to each
group. The most common example is data with multiple sites identified
with a column representing site name, e.g. type = "site". More
generally, type should be used where the date repeats for a
particular grouping variable. However, if type is not supplied the
data will still be averaged but the grouping variables (character or
factor) will be dropped.
percentile: The percentile level (%) used when
statistic = "percentile". The default is 95.
start.date: A start date to use, so that the averaging intervals
begin at a known point (for example the beginning of the year) rather
than at the first record. start.date is therefore used to force this
type of sequence.
end.date: An end date to use. This is useful when data.thresh > 0 but
the input time series does not extend up to the final full interval.
For example, if a time series ends sometime in October but annual
means are required with a data capture of >75% then it is necessary
to extend the time series up until the end of the year. Input in the
format yyyy-mm-dd HH:MM.
interval: The timeAverage function tries to determine the interval of
the original time series (e.g. hourly) by calculating the most common
interval between time steps. The interval is needed for calculations
where data.thresh > 0. For the vast majority of regular time series
this works fine. However, for data with very poor data capture or
irregular time series the automatic detection may not work. Also, for
time series such as monthly time series, where the time between
months varies, users should specify the time interval explicitly,
e.g. interval = "month". Users can also supply a time interval to
force on the time series. See avg.time for the format.
This option can sometimes be useful with start.date and end.date to
ensure full periods are considered, e.g. a full year when
avg.time = "year".
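vector.ws: Whether vector averaging should be carried out on wind
speed. The default is FALSE and scalar averages are calculated.
Vector averaging of the wind speed is carried out on the u and v wind
components. For example, consider the average of two hours where the
wind direction and speed are 0 degrees and 2 m/s for the first hour
and 180 degrees and 2 m/s for the second hour. The scalar average of
the wind speed is simply the arithmetic average, 2 m/s, and the
vector average is 0 m/s. Vector-averaged wind speeds are never higher
than scalar-averaged values, and the two are equal only when the wind
direction is constant.
fill: When a time series is expanded to a shorter time interval, the
additional rows are padded out with NA. To pad out the additional
data with the first row in each original time interval, choose
fill = TRUE.
...: Additional arguments for other functions calling timeAverage.
Returns a data frame with date in class POSIXct.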
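The two-hour wind example given for vector averaging can be checked with a few lines of base R. This is a sketch of the arithmetic only; the variable names are illustrative, and the sign convention chosen for u and v is an assumption that does not affect the resulting speed.

```r
# Wind speed (m/s) and direction (degrees) for the two hours
ws <- c(2, 2)
wd <- c(0, 180)

# Scalar average: arithmetic mean of the speeds
scalar_ws <- mean(ws)

# Vector average: average the u and v components, then recombine
# (sign convention assumed; it does not change the magnitude)
rad <- wd * pi / 180
u <- mean(ws * sin(rad))
v <- mean(ws * cos(rad))
vector_ws <- sqrt(u^2 + v^2)

scalar_ws             # 2
round(vector_ws, 10)  # 0
```

The two opposing winds cancel in the component averages, which is why the vector mean is zero while the scalar mean stays at 2 m/s.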
When a data capture threshold is set through data.thresh it is
necessary for timeAverage to know the original time interval of the
input time series. The function will try to calculate this interval
from the most common time gap (and will print the assumed time gap to
the screen). This works fine most of the time but there are occasions
where it may not, e.g. when very few data exist in a data frame or the
data are monthly (i.e. a non-regular time interval between months). In
this case the user can explicitly specify the interval through
interval, in the same format as avg.time, e.g. interval = "month". It
may also be useful to set start.date and end.date if the time series
does not span the entire period of interest. For example, if a time
series ended in October and annual means are required, setting
end.date to the end of the year will ensure that the whole period is
covered and that data.thresh is correctly calculated. The same goes
for a time series that starts later in the year, where start.date
should be set to the beginning of the year.
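The effect of the end date on data capture can be seen with some simple arithmetic in base R. This is a sketch under invented assumptions (an hourly series for 2023 that stops at the end of October, with every fifth hour missing) and it does not call timeAverage or reproduce its internals.

```r
# Hypothetical hourly series: January to October 2023, UTC
start_year <- as.POSIXct("2023-01-01 00:00", tz = "UTC")
end_data   <- as.POSIXct("2023-10-31 23:00", tz = "UTC")
end_year   <- as.POSIXct("2023-12-31 23:00", tz = "UTC")

dates <- seq(start_year, end_data, by = "hour")
value <- rep(1, length(dates))
value[seq(5, length(dates), by = 5)] <- NA  # every fifth hour missing

n_valid <- sum(!is.na(value))

# Data capture (%) relative to the span of the data alone ...
capture_observed <- 100 * n_valid / length(dates)

# ... and relative to the full calendar year
n_year <- length(seq(start_year, end_year, by = "hour"))
capture_full <- 100 * n_valid / n_year

# Does each version meet a 75 % threshold?
meets_observed <- capture_observed >= 75
meets_full     <- capture_full >= 75
```

Measured against the data's own span the series appears to meet a 75 % threshold, but measured against the whole year it does not, which is the situation that setting end.date to the end of the year is meant to expose.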
timeAverage should be useful in many circumstances where it is
necessary to work with data recorded at different averaging times, for
example hourly air pollution data and 15-minute meteorological data.
To merge the two data sets timeAverage can be used to make the
meteorological data 1-hour means first. Alternatively, timeAverage can
be used to expand the hourly data to 15-minute data; see the example
below.
For the research community timeAverage should be useful for dealing
with outputs from instruments that use a range of time periods.
It is also very useful for plotting data using timePlot. Often the
data are too dense to see patterns, and averaging over a longer period
helps with interpretation.
See also timePlot, which plots time series data and uses timeAverage
to aggregate data where necessary.
## mydata is the example data set supplied with openair
library(openair)

## daily average values
daily <- timeAverage(mydata, avg.time = "day")

## daily average values ensuring at least 75 % data capture
## i.e. at least 18 valid hours
## Not run:
# daily <- timeAverage(mydata, avg.time = "day", data.thresh = 75)
## End(Not run)

## 2-weekly averages
## Not run:
# fortnight <- timeAverage(mydata, avg.time = "2 week")
## End(Not run)

## make a 15-minute time series from an hourly one
## Not run:
# min15 <- timeAverage(mydata, avg.time = "15 min", fill = TRUE)
## End(Not run)

## average by grouping variable
## Not run:
# dat <- importAURN(c("kc1", "my1"), year = 2011:2013)
# timeAverage(dat, avg.time = "year", type = "site")
#
# # can also retain site code
# timeAverage(dat, avg.time = "year", type = c("site", "code"))
#
# # or just average all the data, dropping site/code
# timeAverage(dat, avg.time = "year")
## End(Not run)