bt_smoother: Bootstrap the Smoother

Description

Fits an autoregressive model to the residuals of the smoother and bootstraps the errors of that model. The measurements are then reconstructed by adding the bootstrapped errors, the autoregressive model, and the smoother. Recomputing the smoother on these reconstructed measurements yields the bootstrapped smoother, which can later be used to construct simultaneous confidence bounds. See Details.

Usage

bt_smoother(
  data,
  smoother,
  resample_method,
  smoother_pts,
  resid,
  bt_tot_rep,
  ...
)

Value

A data frame containing the bootstrap repetitions of the smoother. The columns are subject identifier, time point, value, and the bootstrap repetition the value corresponds to.

Arguments

data

A data frame in long format containing the data for which events are to be detected. This means that each measurement corresponds to a row and the columns are (in order): source (the device or person from which the data was collected), point in time, and measurement value. If custom detection bounds are chosen, the following two columns must be added: lower detection bound and upper detection bound.

The source is expected to be a string; the time points are expected to be integers; the measurements and detection bounds are expected to be numerical. The detection bounds are given as absolute values in the same unit as the measurements, and each bound is expected to be identical for all rows of the same source.

To detect a one-sided change (e.g. raise an event when the confidence bounds drop below a threshold), the upper or lower detection bound can be set to Inf or -Inf, respectively.
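As an illustration of the expected layout, the following sketch builds a small long-format data frame with custom detection bounds. The column names and values are assumptions made for the example; the description above only fixes the order and types of the columns.

```r
# Hypothetical example data: two sources measured at integer time points.
set.seed(1)
data <- data.frame(
  source = rep(c("device_a", "device_b"), each = 5),
  time_point = rep(1:5, times = 2),
  value = round(rnorm(10, mean = 50, sd = 5), 1),
  stringsAsFactors = FALSE
)

# Optional custom detection bounds, constant within each source.
# An upper bound of Inf means only drops below the lower bound are of interest.
data$lower_bound <- ifelse(data$source == "device_a", 40, 42)
data$upper_bound <- Inf

str(data)
```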

smoother

A string specifying which smoother is to be used. Use mov_med for the moving median. When using the moving median, the parameter med_win specifies the size of the window over which the moving median is taken. Defaults to the moving median.

resample_method

A string that determines how to resample the errors (epsilon) of the autoregressive model for the bootstrap. Default is all, meaning that the error at a given time point is resampled from the errors at all time points. past resamples only from errors at time points prior to the one being resampled. window resamples from the errors inside the window given by resample_win.
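The three options can be pictured with a small base R sketch. It is not the package's implementation: the helper name resample_eps, the handling of an empty pool, and the reading of resample_win as offsets relative to the current time point are assumptions made for illustration.

```r
# Draw one bootstrap error for every time point from a vector of AR errors.
# `eps` and `time_point` are aligned element by element.
resample_eps <- function(eps, time_point,
                         method = c("all", "past", "window"),
                         resample_win = c(-14, 14)) {
  method <- match.arg(method)
  vapply(seq_along(eps), function(i) {
    pool <- switch(method,
      all = seq_along(eps),
      past = which(time_point < time_point[i]),
      window = which(time_point >= time_point[i] + resample_win[1] &
                       time_point <= time_point[i] + resample_win[2])
    )
    # Assumption: fall back to the point's own error when the pool is empty
    # (e.g. the first time point under "past").
    if (length(pool) == 0) pool <- i
    # Sample an index of the pool; sample(x, 1) on a length-one numeric
    # vector would draw from 1:x instead.
    eps[pool[sample.int(length(pool), 1)]]
  }, numeric(1))
}

set.seed(42)
eps <- rnorm(30)
eps_window <- resample_eps(eps, time_point = 1:30, method = "window",
                           resample_win = c(-3, 3))
eps_past <- resample_eps(eps, time_point = 1:30, method = "past")
```

Sampling indices rather than values keeps the sketch safe when a pool contains a single element.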

smoother_pts

A data frame containing the smoother with columns time_point and value. Preferably the output of one of the smoother functions within this package.

resid

A vector of the same length as the number of rows of data containing the difference between the smoother and the measurements. Preferably the output of smoother_resid.

bt_tot_rep

The number of iterations for the bootstrap computation. Because of run time, it is recommended to keep this number below 500. Defaults to 100.

...

Additional parameters to be given to the function. Possible parameters for the model are order and min_pts_in_win. For the moving median, a med_win is required. When resampling from window, a resample_win may be given.

The parameter min_pts_in_win defines the minimal number of measurements required to be in the time window for the median to be calculated. Defaults to 1.

If the parameter order is given, that number will be the (maximal) order of the autoregressive model. If no order is given, it will be determined using the Akaike information criterion.
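The order selection described here can be sketched with ar() from the stats package, which chooses the order by AIC up to order.max. Whether bt_smoother calls ar() internally is an assumption; the simulated series below merely stands in for the residuals of a smoother.

```r
set.seed(123)
# Simulated stand-in for smoother residuals: an AR(2) process.
eta <- as.numeric(arima.sim(model = list(ar = c(0.6, -0.3)), n = 300))

# No order given: the order is chosen by AIC up to ar()'s default maximum.
fit_aic <- ar(eta, aic = TRUE)

# Order given: treated as the maximal order, AIC picks among 0, ..., 3.
fit_max <- ar(eta, aic = TRUE, order.max = 3)

fit_aic$order       # order selected without an upper limit supplied
fit_max$order       # order selected with the maximal order set to 3
fit_max$ar          # estimated coefficients phi_1, ..., phi_p
head(fit_max$resid) # AR errors; the first p entries are NA
```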

When the moving median is used as the smoother, med_win is expected. If no med_win is given, it will default to c(-42, 42).
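To make the roles of med_win and min_pts_in_win concrete, here is a minimal moving median in base R. The function name mov_med_sketch, the inclusive window limits, and the choice to return NA when too few points fall in the window are assumptions of this sketch, not documented behaviour of the package.

```r
# Moving median over a window given relative to each time point,
# e.g. med_win = c(-42, 42) covers 42 time units on either side.
mov_med_sketch <- function(time_point, value,
                           med_win = c(-42, 42), min_pts_in_win = 1) {
  vapply(time_point, function(tp) {
    in_win <- time_point >= tp + med_win[1] & time_point <= tp + med_win[2]
    # Assumption: return NA when the window holds too few measurements.
    if (sum(in_win) < min_pts_in_win) NA_real_ else median(value[in_win])
  }, numeric(1))
}

tp  <- c(1, 2, 3, 10, 11)
val <- c(5, 7, 6, 20, 22)

mov_med_sketch(tp, val, med_win = c(-1, 1))
# 6.0 6.0 6.5 21.0 21.0

mov_med_sketch(tp, val, med_win = c(-1, 1), min_pts_in_win = 3)
# NA 6 NA NA NA
```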

When resampling from window, the size of the resampling window can be chosen with resample_win by giving a window such as c(-14, 14).

Details

An autoregressive (AR) model is used for the residuals of the smoother: $$Y(t) = S(t) + \eta(t)$$ $$\eta(t) = \sum^{p}_{j = 1} \phi_j \eta(t - j) + \epsilon(t)$$ where $t$ is the point in time, $Y(t)$ the data point, $S(t)$ a smoother, $\eta(t)$ the residual of the smoother, $p$ the order of the AR model, $\phi_j$ the coefficients of the AR model, and $\epsilon(t)$ the error of the AR model.

The bootstrap procedure is as follows:

1. Compute the smoother $S(t)$.
2. Compute the residuals $\eta(t_i) = Y(t_i) - S(t_i)$.
3. Fit an AR(p) model to $\eta(t_i)$ to obtain the coefficients $\phi_1, \dots, \phi_p$ and the residuals $\epsilon(t_i) = \eta(t_i) - \sum^{p}_{j = 1} \phi_j \eta(t_{i-j})$.
4. Resample $\epsilon^*(t_i)$ from $\epsilon(t_{p+1}), \dots, \epsilon(t_n)$ to obtain $$Y^*(t_i) = S(t_i) + \eta^*(t_i),$$ where $$\eta^*(t_i) = \sum^{p}_{j=1} \phi_j \eta^*(t_{i-j}) + \epsilon^*(t_i).$$
5. Compute $S^*(\cdot) = g(Y^*(t_1), \dots, Y^*(t_n))$ where $g$ is the function with which the smoother is calculated.
6. Repeat steps 4 and 5 bt_tot_rep times.
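The procedure can be sketched for a single series in base R as follows. This is an illustration under stated assumptions, not the package's code: the function name sieve_bootstrap_sketch is hypothetical, runmed() stands in for the smoother, errors are resampled from all time points, the recursion for the bootstrapped residuals starts from zero without burn-in, and the result is a wide matrix rather than the long data frame returned by bt_smoother.

```r
sieve_bootstrap_sketch <- function(y, smooth_fun, bt_tot_rep = 100,
                                   max_order = NULL) {
  n <- length(y)
  s <- smooth_fun(y)                         # step 1: smoother S(t)
  eta <- y - s                               # step 2: residuals eta(t)

  # Step 3: AR(p) fit, order chosen by AIC (up to max_order if given).
  fit <- ar(eta, aic = TRUE, order.max = max_order)
  phi <- as.numeric(fit$ar)
  p <- length(phi)
  eps <- if (p > 0) {
    lagged <- embed(eta, p + 1)              # columns: eta(t_i), eta(t_{i-1}), ...
    as.numeric(lagged[, 1] - lagged[, -1, drop = FALSE] %*% phi)
  } else {
    eta
  }
  eps <- eps - mean(eps)                     # assumption: centre the errors

  # Step 6: repeat steps 4 and 5 bt_tot_rep times.
  replicate(bt_tot_rep, {
    # Step 4: resample errors and rebuild the residuals recursively.
    eps_star <- sample(eps, n, replace = TRUE)
    eta_star <- numeric(n)
    for (i in seq_len(n)) {
      ar_part <- 0
      for (j in seq_len(min(p, i - 1))) {
        ar_part <- ar_part + phi[j] * eta_star[i - j]
      }
      eta_star[i] <- ar_part + eps_star[i]
    }
    smooth_fun(s + eta_star)                 # step 5: smoother of Y*(t)
  })
}

set.seed(2024)
tp <- 1:200
y <- sin(tp / 20) + as.numeric(arima.sim(list(ar = 0.5), n = 200, sd = 0.2))
smooth_fun <- function(y) as.numeric(runmed(y, k = 21))

boot <- sieve_bootstrap_sketch(y, smooth_fun, bt_tot_rep = 50)
dim(boot)  # 200 rows (time points) by 50 columns (bootstrap repetitions)
```

Pointwise quantiles over the columns of boot give a rough picture of the variability of the smoother; simultaneous bounds require a further calibration step that is outside this sketch.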

References

Bühlmann, P. (1998). Sieve Bootstrap for Smoothing in Nonstationary Time Series. The Annals of Statistics, 26(1), 48-83.