Takes a series of dates and temperatures and, if the series is irregular (but ordered), inserts the missing dates and fills the corresponding temperatures with NAs.
make_whole_fast(data, x = t, y = temp)
A data frame with columns for date and temperature data. Ordered daily data are expected, and although missing values (NA) can be accommodated, the function is only recommended when NAs occur infrequently, preferably in runs of no more than 3 consecutive days.
A column with the daily time vector (see details). For backwards compatibility, the column is named t by default.
A column with the response vector. RmarineHeatWaves version <= 0.15.9 assumed that this would be daily seawater temperatures, but as of version 0.16.0 it may be any arbitrary measurement taken at a daily frequency. The default remains temperature, and the default column name is therefore temp, again hopefully ensuring backwards compatibility.
The function will return a data frame with three columns. The column headed doy (day-of-year) is the Julian day running from 1 to 366, but modified so that the day-of-year series for non-leap-years runs 1...59 and then 61...366. For leap years the 60th day is February 29. See the example, below. The other two columns take the names of x and y, if supplied, or t and temp if the default values were used.
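The leap-year-aligned day-of-year described above can be sketched in base R as follows. This is only an illustration of the numbering rule, not the package's own implementation; the helper names is_leap and doy_aligned are hypothetical.

```r
# Hypothetical helpers illustrating the doy rule: February 29 is always
# day 60, so non-leap years skip from 59 (Feb 28) to 61 (Mar 1).
is_leap <- function(year) {
  (year %% 4 == 0 & year %% 100 != 0) | year %% 400 == 0
}

doy_aligned <- function(dates) {
  lt <- as.POSIXlt(as.Date(dates))
  yday <- lt$yday + 1L          # ordinary day-of-year (1-based)
  year <- lt$year + 1900L
  # In non-leap years, shift every day from March 1 onwards by one
  shift <- !is_leap(year) & yday >= 60L
  yday + as.integer(shift)
}

doy_aligned(c("1983-02-28", "1983-03-01", "1984-02-29", "1984-03-01", "1983-12-31"))
# 59 61 60 61 366
```

With this numbering, a given calendar date always maps to the same doy value regardless of whether the year is a leap year.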
The x (or t) column is a series of dates of class Date, while y (or temp) is the measured variable. This time series will be uninterrupted and continuous daily values between the first and last dates of the input data.
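To make the idea of an uninterrupted daily series concrete, the following base R sketch pads an ordered but gappy series with NA rows. It is a minimal illustration under the assumption that the date column is of class Date; the helper pad_daily is hypothetical and is not how the package itself is implemented.

```r
# Hypothetical helper: insert missing dates and fill the response with NA.
pad_daily <- function(df, x = "t", y = "temp") {
  # Complete daily sequence between the first and last observed dates
  full <- data.frame(seq(min(df[[x]]), max(df[[x]]), by = "day"))
  names(full) <- x
  # Left join: dates absent from df receive NA in the response column
  out <- merge(full, df[, c(x, y)], by = x, all.x = TRUE)
  out[order(out[[x]]), ]
}

gappy <- data.frame(
  t = as.Date(c("1982-01-01", "1982-01-02", "1982-01-05")),
  temp = c(18.2, 18.4, 18.9)
)
padded <- pad_daily(gappy)
padded   # five rows; temp is NA on 1982-01-03 and 1982-01-04
```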
This function reads in daily data with the time vector specified as either POSIXct or Date (e.g. "1982-01-01 02:00:00" or "1982-01-01").
It is up to the user to calculate daily data from sub-daily measurements. Leap years are automatically accommodated by this function.
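One possible way to derive daily data from sub-daily measurements is to take daily means, sketched here in base R. The example data, the choice of the mean as the daily statistic, and the UTC time zone are all assumptions made for illustration; other statistics or time zones may suit a given data set better.

```r
# Hypothetical sub-daily record (three readings over two days)
hourly <- data.frame(
  t = as.POSIXct(c("1982-01-01 02:00:00", "1982-01-01 14:00:00",
                   "1982-01-02 02:00:00"), tz = "UTC"),
  temp = c(18.0, 20.0, 19.0)
)

# Collapse timestamps to calendar dates, then average within each date
hourly$date <- as.Date(hourly$t, tz = "UTC")
daily <- aggregate(temp ~ date, data = hourly, FUN = mean)
names(daily) <- c("t", "temp")
daily   # one row per day, with t of class Date
```

Note that the formula interface of aggregate drops rows with NA in temp by default, so days with no valid readings disappear from the result rather than appearing as NA.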
This function can handle some missing days, but this is not a licence to actually use these data for the detection of anomalous thermal events. Hobday et al. (2016) recommend gaps of no more than 3 days, which may be adjusted by setting the maxPadLength argument of the ts2clm function. The longer and more frequent the gaps become, the lower the fidelity of the annual climatology and threshold that can be calculated, which will have repercussions not only for the accuracy with which the event metrics can be determined, but also for the number of events that can be detected.
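Before relying on a series with gaps, it may help to check how long the longest run of consecutive NAs is. The following base R sketch does this with run-length encoding; the helper longest_na_run and the example values are hypothetical, and the 3-day limit is the recommendation cited above.

```r
# Hypothetical helper: length of the longest run of consecutive NAs
longest_na_run <- function(v) {
  runs <- rle(is.na(v))
  if (!any(runs$values)) return(0L)
  max(runs$lengths[runs$values])
}

temp <- c(18.1, NA, NA, 18.4, NA, NA, NA, NA, 18.9)
longest_na_run(temp)        # 4
longest_na_run(temp) <= 3   # FALSE: this series exceeds the 3-day limit
```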
The original make_whole tests to see if some rows are duplicated, or if replicate temperature measurements are present per day. In make_whole_fast (this function) this has been disabled; also, the latter function lacks the facility to check if the time series is complete and regular (i.e. no missing values in the date vector). Effectively, we now only set up the day-of-year (doy) vector in make_whole_fast.
Should the user be concerned about the potential for repeated measurements or worry that the time series is irregular, we suggest that the necessary checks and fixes are implemented prior to feeding the time series to ts2clm via make_whole_fast, or that make_whole is used instead. For very large gridded temperature records the 'fast' version probably makes a measurable difference in speed, but using make_whole instead might prevent detect_event from failing should some grid cells contain missing rows or duplicated values. When using the fast algorithm, we assume that the user has done all the necessary work beforehand to ensure that the time vector is regular and without repeated measurements.
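The kind of pre-checks the user is expected to run before using the fast version might look like the base R sketch below. The helper check_daily is hypothetical and only reports problems; it does not fix them, and it is not part of the package.

```r
# Hypothetical helper: report whether a date vector is ordered,
# free of duplicates, and regular (exactly one day between rows).
check_daily <- function(dates) {
  dates <- as.Date(dates)
  step <- as.numeric(diff(dates))   # day differences between rows
  list(
    ordered    = all(step >= 0),
    duplicated = any(duplicated(dates)),
    regular    = all(step == 1)
  )
}

ok  <- check_daily(c("1982-01-01", "1982-01-02", "1982-01-03"))
bad <- check_daily(c("1982-01-01", "1982-01-01", "1982-01-04"))
str(ok)    # ordered TRUE, duplicated FALSE, regular TRUE
str(bad)   # ordered TRUE, duplicated TRUE,  regular FALSE
```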
It is recommended that a climatology period of at least 30 years is specified in order to capture any decadal thermal periodicities.