IDetect (version 0.1.0)

ht_ID_cplm: Apply the Isolate-Detect methodology for multiple change-point detection in a continuous, piecewise-linear vector with non-Gaussian noise

Description

Using the Isolate-Detect methodology, this function estimates the number and locations of multiple change-points in the noisy, continuous, piecewise-linear input vector x, with noise that is not normally distributed. It also gives the estimated signal, as well as the solution path defined in sol_path_cplm (see Details for the relevant literature reference).

Usage

ht_ID_cplm(x, s.ht = 3, q_ht = 300, ht_thr_id = 1.4, ht_th_ic_id = 1.25,
  p_thr = 1, p_ic = 3)

Arguments

x

A numeric vector containing the data in which you would like to find change-points.

s.ht

A positive integer with default value equal to 3. It sets the number of observations that are averaged together when the given data sequence is pre-averaged. For more information see Details.

q_ht

A positive integer with default value equal to 300. If the length of x is less than or equal to q_ht, then no pre-averaging takes place.

ht_thr_id

A positive real number with default value equal to 1.4. It is used to define the threshold, if the thresholding approach (described in cplm_th) is to be followed.

ht_th_ic_id

A positive real number with default value equal to 1.25. It is used only if the model-selection-based Isolate-Detect method is to be followed, where it defines the threshold value for the first step (change-point overestimation) of the model selection approach described in cplm_ic. It is applied to the new data, which are obtained after taking average values on x.

p_thr

A positive integer with default value equal to 1. It is used only when the threshold-based approach (described in cplm_th) is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.

p_ic

A positive integer with default value equal to 3. It is used only when the information-criterion-based approach (described in cplm_ic) is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.

Value

A list with the following components:

cpt

A vector with the detected change-points.

no_cpt

The number of change-points detected.

fit

A numeric vector with the estimated continuous piecewise-linear signal.

Details

First, this function calls normalise to create a new data sequence, \(\tilde{x}\), by taking averages of observations in x. It then applies ID_cplm to \(\tilde{x}\) to obtain the change-points \(\tilde{r}_1, \tilde{r}_2, \ldots, \tilde{r}_{\hat{N}}\), in increasing order. To recover the locations of the change-points on the original scale with, on average, the highest accuracy, we define $$\hat{r}_k = (\tilde{r}_k - 1)\,\texttt{s.ht} + \lfloor \texttt{s.ht}/2 + 0.5 \rfloor, \quad k = 1, 2, \ldots, \hat{N}.$$ More details can be found in "Detecting multiple generalized change-points by isolating single ones", Anastasiou and Fryzlewicz (2018), preprint.
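
The pre-averaging and back-mapping steps can be illustrated with a minimal sketch. The helpers pre_average, maybe_pre_average and map_back are hypothetical names introduced here for illustration; they are not part of IDetect and do not reproduce the internals of normalise. The sketch assumes that pre-averaging means taking means over consecutive, non-overlapping blocks of s.ht observations and that an incomplete trailing block is dropped.

```r
# Hypothetical helper (not the package's normalise()): average x over
# consecutive, non-overlapping blocks of length s.
pre_average <- function(x, s = 3) {
  n_blocks <- floor(length(x) / s)
  # Assumption of this sketch: an incomplete trailing block is discarded.
  blocks <- matrix(x[seq_len(n_blocks * s)], nrow = s)
  colMeans(blocks)
}

# Hypothetical helper: skip pre-averaging for short inputs, mirroring the
# documented role of q_ht.
maybe_pre_average <- function(x, s = 3, q = 300) {
  if (length(x) <= q) x else pre_average(x, s)
}

# Map change-point indices found on the averaged sequence back to the
# original scale, using the formula given in Details.
map_back <- function(r_tilde, s = 3) {
  (r_tilde - 1) * s + floor(s / 2 + 0.5)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9)
x_tilde <- pre_average(x, s = 3)            # block means
r_hat <- map_back(c(1, 2, 3), s = 3)        # original-scale indices
x_short <- maybe_pre_average(x, s = 3, q = 300)  # length(x) <= q: unchanged
```

With s = 3 the mapped index is the middle observation of each block, which is why the formula adds \(\lfloor \texttt{s.ht}/2 + 0.5 \rfloor\) rather than pointing at the first observation of the block.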

See Also

ID_cplm and normalise, which are functions that are used in ht_ID_cplm. In addition, see ht_ID_pcm for the case of piecewise-constant mean signals.

Examples

# NOT RUN {
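library(IDetect)

# One change-point: slope +1 up to observation 2000, then slope -1
single.cpt <- c(seq(0, 1999, 1), seq(1998, -1, -1))
# Add Student-t noise with 5 degrees of freedom (non-Gaussian)
single.cpt.student <- single.cpt + rt(4000, df = 5)
cpt.single <- ht_ID_cplm(single.cpt.student)

# Three change-points: the slope alternates between +2 and -2 every 2000 observations
three.cpt <- c(seq(0, 3998, 2), seq(3996, -2, -2), seq(0, 3998, 2), seq(3996, -2, -2))
three.cpt.student <- three.cpt + rt(8000, df = 5)
cpt.three <- ht_ID_cplm(three.cpt.student)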
# }
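
As a rough way to inspect the result for the single change-point example, the sketch below compares the detected locations with the true kink of the noiseless signal. It relies only on the cpt and no_cpt components listed under Value; the seed and the variable names true_cpt and res are choices made for this illustration, and the call to the package is skipped when IDetect is not installed.

```r
set.seed(1)  # arbitrary seed, used only to make this sketch reproducible

# Noiseless signal from the example: slope +1, then slope -1
single.cpt <- c(seq(0, 1999, 1), seq(1998, -1, -1))
# The slope changes at the peak of the noiseless signal
true_cpt <- which.max(single.cpt)
single.cpt.student <- single.cpt + rt(4000, df = 5)

if (requireNamespace("IDetect", quietly = TRUE)) {
  res <- IDetect::ht_ID_cplm(single.cpt.student)
  print(res$no_cpt)                # number of detected change-points
  print(abs(res$cpt - true_cpt))   # distance of each estimate from the true kink
}
```

Because the noise is random, the estimated locations will generally differ slightly from true_cpt and will vary with the seed.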