This is the main, general function of the package. It employs more specialised functions in
order to estimate the number and locations of multiple change-points in the noisy, piecewise-constant
or continuous, piecewise-linear input vector xd. The noise may be Gaussian or
heavy-tailed. The approach is a hybrid of the thresholding approach
(explained in pcm_th and cplm_th) and the information criterion approach
(explained in pcm_ic and cplm_ic), and it estimates the change-points
taking both into account. In addition to the number and locations of the estimated
change-points, ID returns the estimated signal, as well as the solution path.
For more information and the relevant literature reference, see Details.
ID(xd, th.cons = 1, th.cons_lin = 1.4, th.ic = 0.9, th.ic.lin = 1.25,
   lambda = 3, lambda.ic = 10, contrast = c("mean", "slope"), ht = FALSE,
   scale = 3)

xd: A numeric vector containing the data in which you would like to find change-points.

th.cons: A positive real number with default value equal to 1. It is
used to define the threshold, if the thresholding approach (explained in pcm_th)
is to be followed to detect the change-points in the scenario of piecewise-constant signals.

th.cons_lin: A positive real number with default value equal to 1.4. It is
used to define the threshold, if the thresholding approach (explained in cplm_th)
is to be followed to detect the change-points in the scenario of continuous, piecewise-linear signals.

th.ic: A positive real number with default value equal to 0.9. It is
useful only if the model selection based Isolate-Detect method (described in
pcm_ic) is to be followed for the scenario of piecewise-constant signals.
It is used to define the threshold value that will be used at the first step (change-point
overestimation) of the model selection approach.

th.ic.lin: A positive real number with default value equal to 1.25. It is
useful only if the model selection based Isolate-Detect method (described in
cplm_ic) is to be followed for the scenario of continuous, piecewise-linear signals.
It is used to define the threshold value that will be used at the first step (change-point
overestimation) of the model selection approach.

lambda: A positive integer with default value equal to 3. It is used only when the threshold based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.

lambda.ic: A positive integer with default value equal to 10. It is used only when the information criterion based approach is to be followed and it defines the distance between two consecutive end- or start-points of the right- or left-expanding intervals, respectively.

contrast: A character string, which defines the type of the contrast function to
be used in the Isolate-Detect algorithm. If contrast = "mean", then the algorithm
looks for changes in a piecewise-constant signal. If contrast = "slope",
then the algorithm looks for changes in a continuous, piecewise-linear signal.

ht: A logical variable with default value equal to FALSE. If FALSE,
the noise is assumed to follow the Gaussian distribution. If TRUE, then the
noise is assumed to follow a distribution that has tails heavier than those of the
Gaussian distribution.

scale: A positive integer number with default value equal to 3. It is
used to define the way we pre-average the given data sequence only if
ht = TRUE. See the Details in ht_ID_pcm for more information on
how we pre-average.
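
To make the role of lambda and lambda.ic more concrete, the following base R sketch lists the end-points of right-expanding intervals and the start-points of left-expanding intervals for a sequence of length n. The helper name expanding_points and the boundary convention (the last interval always covers the whole sequence) are assumptions made for illustration; they are not taken from the package source.

```r
# Hypothetical helper (not part of the package): end-points of right-expanding
# intervals [1, e] and start-points of left-expanding intervals [s, n],
# spaced lambda apart. The boundary handling is an illustrative assumption.
expanding_points <- function(n, lambda) {
  stopifnot(n >= 1, lambda >= 1, lambda <= n)
  right_ends <- seq(lambda, n, by = lambda)
  # Make sure the final right-expanding interval reaches the last observation.
  if (right_ends[length(right_ends)] != n) right_ends <- c(right_ends, n)
  left_starts <- seq(n - lambda + 1, 1, by = -lambda)
  # Make sure the final left-expanding interval reaches the first observation.
  if (left_starts[length(left_starts)] != 1) left_starts <- c(left_starts, 1)
  list(right_ends = right_ends, left_starts = left_starts)
}

pts <- expanding_points(n = 10, lambda = 3)
pts$right_ends   # end-points e of the intervals [1, e]
pts$left_starts  # start-points s of the intervals [s, 10]
```

A smaller lambda gives a finer grid of intervals, at the cost of examining more of them.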
A list with the following components:

cpt: A vector with the detected change-points.
no_cpt: The number of change-points detected.
fit: A numeric vector with the estimated signal.
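
As an illustration of how the cpt and fit components can relate in the piecewise-constant case, the base R sketch below rebuilds a fitted signal from given change-point locations by averaging the data within each segment. It assumes that a change-point at position b closes the segment ending at b, and that the fit is the segment-wise sample mean. The helper piecewise_mean_fit is hypothetical and may differ from what the package computes internally.

```r
# Hypothetical helper (not part of the package): piecewise-constant fit
# obtained by averaging x between consecutive change-points.
piecewise_mean_fit <- function(x, cpt) {
  stopifnot(is.numeric(x), all(cpt >= 1), all(cpt < length(x)))
  # Segment label of each observation: 1 up to the first change-point,
  # 2 up to the second one, and so on.
  segment <- findInterval(seq_along(x), sort(cpt), left.open = TRUE) + 1
  # ave() replaces every observation by the mean of its own segment.
  ave(x, segment)
}

x <- c(4.1, 3.9, 4.0, 0.2, -0.2, 0.0)
fit <- piecewise_mean_fit(x, cpt = 3)
fit
```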
The data points provided in xd are assumed to follow $$X_t = f_t + \sigma\epsilon_t, \quad t = 1, 2, \ldots, T,$$
where \(T\) is the total length of the data sequence, \(X_t\) are the observed
data, \(f_t\) is a one-dimensional, deterministic signal with abrupt structural
changes at certain points, and \(\epsilon_t\) are independent and identically
distributed random variables with mean zero and variance one. In this function,
the following scenarios for \(f_t\) are implemented.
Piecewise-constant signal with Gaussian noise. Use contrast = "mean" and ht = FALSE here.
Piecewise-constant signal with heavy-tailed noise. Use contrast = "mean" and ht = TRUE here.
Continuous, piecewise-linear signal with Gaussian noise. Use contrast = "slope" and ht = FALSE here.
Continuous, piecewise-linear signal with heavy-tailed noise. Use contrast = "slope" and ht = TRUE here.
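
The four scenarios above can be simulated in base R as follows. This is only a sketch of the data-generating model: the sequence length, the signal shapes, the seed, and the choice of a Student-t distribution with 5 degrees of freedom as the heavy-tailed noise are illustrative assumptions. The comments indicate which arguments would be passed to ID for each data set.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
n <- 600
sigma <- 1

# Piecewise-constant signal with a single change-point at t = 300.
f_const <- c(rep(4, 300), rep(0, 300))
# Continuous, piecewise-linear signal whose slope changes sign at t = 300.
f_lin <- c(seq(1, 300), seq(299, 0, by = -1))

# Gaussian noise, and Student-t noise with 5 degrees of freedom rescaled to
# unit variance (the variance of a t_5 variable is 5 / 3).
eps_gauss <- rnorm(n)
eps_heavy <- rt(n, df = 5) / sqrt(5 / 3)

x_const_gauss <- f_const + sigma * eps_gauss  # contrast = "mean",  ht = FALSE
x_const_heavy <- f_const + sigma * eps_heavy  # contrast = "mean",  ht = TRUE
x_lin_gauss   <- f_lin   + sigma * eps_gauss  # contrast = "slope", ht = FALSE
x_lin_heavy   <- f_lin   + sigma * eps_heavy  # contrast = "slope", ht = TRUE
```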
When ht = FALSE, the function first detects the change-points using
win_pcm_th (for a piecewise-constant signal) or win_cplm_th
(for a continuous, piecewise-linear signal). If the estimated number of change-points
is greater than 100, this result is returned and the procedure stops. Otherwise, ID proceeds
to detect the change-points using pcm_ic (for a piecewise-constant signal)
or cplm_ic (for a continuous, piecewise-linear signal), and that result is
returned.
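
The control flow just described can be sketched in base R as below. The function hybrid_detect and its arguments are hypothetical: detect_th and detect_ic stand in for the thresholding and information criterion detectors, and are passed in as plain functions so that the sketch runs without the package. It illustrates the decision rule only, not the package's implementation.

```r
# Hypothetical sketch of the hybrid rule: run the thresholding detector first
# and fall back on the information criterion detector unless more than
# max_cpt change-points were found.
hybrid_detect <- function(x, detect_th, detect_ic, max_cpt = 100) {
  cpt_th <- detect_th(x)
  if (length(cpt_th) > max_cpt) {
    # Too many change-points: return the thresholding result and stop.
    return(list(cpt = cpt_th, method = "threshold"))
  }
  list(cpt = detect_ic(x), method = "information criterion")
}

# Toy stand-ins for the two detectors (they ignore the data).
few_cpts  <- function(x) 50
many_cpts <- function(x) seq_len(150)
ic_cpts   <- function(x) 48

x <- rnorm(200)
res_few  <- hybrid_detect(x, detect_th = few_cpts,  detect_ic = ic_cpts)
res_many <- hybrid_detect(x, detect_th = many_cpts, detect_ic = ic_cpts)
res_few$method
res_many$method
```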
When ht = TRUE, the given data sequence is first pre-averaged using normalise;
the same procedure as for ht = FALSE is then applied to the resulting
sequence.
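
One common way to pre-average is to replace the data by the means of consecutive, non-overlapping blocks of scale observations, which brings the noise closer to Gaussian. The base R sketch below assumes that form of pre-averaging; pre_average is a hypothetical helper written for illustration and is not the package's normalise function, whose exact behaviour is documented in ht_ID_pcm.

```r
# Hypothetical helper (not the package's normalise): average consecutive,
# non-overlapping blocks of `scale` observations. The last block may be
# shorter when length(x) is not a multiple of `scale`.
pre_average <- function(x, scale = 3) {
  stopifnot(is.numeric(x), length(x) >= 1, scale >= 1)
  # Block label of each observation: 1, 1, 1, 2, 2, 2, ... for scale = 3.
  block <- ceiling(seq_along(x) / scale)
  as.numeric(tapply(x, block, mean))
}

averaged <- pre_average(c(1, 2, 3, 4, 5, 6, 7), scale = 3)
averaged
```

Under this assumption, a change-point located in the pre-averaged sequence corresponds to a block of scale observations in the original data, so locations need to be mapped back to the original time scale.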
More details can be found in "Detecting multiple generalized change-points by isolating single ones",
Anastasiou and Fryzlewicz (2018), preprint.
See also ID_pcm, ID_cplm, ht_ID_pcm, and
ht_ID_cplm, which are the functions employed
in ID, depending on the scenario imposed by the input arguments.
# NOT RUN {
library(IDetect)

# Piecewise-constant signal with a single change-point at t = 3000,
# observed with Gaussian and with Student-t (heavy-tailed) noise.
single.cpt.mean <- c(rep(4, 3000), rep(0, 3000))
single.cpt.mean.normal <- single.cpt.mean + rnorm(6000)
single.cpt.mean.student <- single.cpt.mean + rt(6000, df = 5)
cpt.single.mean.normal <- ID(single.cpt.mean.normal)
cpt.single.mean.student <- ID(single.cpt.mean.student, ht = TRUE)

# Continuous, piecewise-linear signal with a single change in slope at t = 2000,
# observed with Gaussian and with Student-t (heavy-tailed) noise.
single.cpt.slope <- c(seq(0, 1999, 1), seq(1998, -1, -1))
single.cpt.slope.normal <- single.cpt.slope + rnorm(4000)
single.cpt.slope.student <- single.cpt.slope + rt(4000, df = 5)
cpt.single.slope.normal <- ID(single.cpt.slope.normal, contrast = "slope")
cpt.single.slope.student <- ID(single.cpt.slope.student, contrast = "slope", ht = TRUE)
# }