Computes the multiscale regression estimator, see (3.1) in the vignette, and allows for confidence statements, see section 3 in the vignette. It implements the estimators SMUCE and HSMUCE as well as their confidence intervals and bands.
If q == NULL
a Monte-Carlo simulation is required for computing critical values. Since a Monte-Carlo simulation lasts potentially much longer (up to several hours or days if the number of observations is in the millions) than the main calculations, this package saves them by default in the workspace and on the file system such that a second call requiring the same Monte-Carlo simulation will be much faster. For more details, in particular to which arguments the Monte-Carlo simulations are specific, see Section Storing of Monte-Carlo simulations below. Progress of a Monte-Carlo simulation can be reported by the argument messages
and the saving can be controlled by the argument option
, both can be specified in …
and are explained in monteCarloSimulation
and critVal
, respectively.
stepFit(y, q = NULL, alpha = NULL, x = 1:length(y), x0 = 2 * x[1] - x[2],
family = NULL, intervalSystem = NULL, lengths = NULL, confband = FALSE,
jumpint = confband, ...)
a numeric vector containing the observations
either NULL
, then the vector of critical values at level alpha
will be computed from a Monte-Carlo simulation, or a numeric giving the global quantile or a numeric vector giving the vector of critical values. Either q
or alpha
must be given. Otherwise, alpha == 0.5
is chosen with a warning
. This argument will be passed to critVal
to obtain the needed critical values. Additional parameters for the computation of q
can be specified in …
, for more details see the documentation of critVal
. Please note that by default the Monte-Carlo simulation will be saved in the workspace and on the file system, for more details see Section Storing of Monte-Carlo simulations below
a probability, i.e. a single numeric between 0 and 1, giving the significance level. Its choice is a trade-off between data fit and parsimony of the estimator. In other words, this argument balances the risks of missing change-points and detecting additional artefacts. For more details on this choice see (Frick et al., 2014, section 4) and (Pein et al., 2017, section 3.4). Either q
or alpha
must be given. Otherwise, alpha == 0.5
is chosen with a warning
a numeric vector of the same length as y
containing the corresponding sample points
a single numeric giving the last unobserved sample point directly before sampling started
a string specifying the assumed parametric family, for more details see parametricFamily, currently "gauss"
, "hsmuce"
and "mDependentPS"
are supported. By default (NULL
) "gauss"
is assumed
a string giving the used interval system, either "all"
for all intervals, "dyaLen"
for all intervals of dyadic length or "dyaPar"
for the dyadic partition, for more details see intervalSystem. By default (NULL
) the default interval system of the specified parametric family will be used, which one this will be is described in parametricFamily
an integer vector giving the set of lengths, i.e. only intervals of these lengths will be considered. Note that not all lengths are possible for all interval systems and for all parametric families, see intervalSystem and parametricFamily, respectively, to see which ones are allowed. By default (NULL
) all lengths that are possible for the specified intervalSystem
and for the specified parametric family will be used
single logical
, indicates if a confidence band for the piecewise-continuous function should be computed
single logical
, indicates if confidence sets for change-points should be computed
there are two groups of further arguments:
further parameters of the parametric family. Depending on argument family
some might be required, but others might be optional, please see parametricFamily for more details,
further parameters that will be passed to critVal
. critVal
will be called automatically with the number of observations n = length(y)
, the arguments family
, intervalSystem
, lengths
, q
and output
set. For these arguments no user interaction is required and possible, all other arguments of critVal
can be passed additionally
An object of class stepfit
that contains the fit. If jumpint == TRUE
function jumpint
allows to extract the 1 - alpha
confidence interval for the jumps. If confband == TRUE
function confband
allows to extract the 1 - alpha
confidence band.
If q == NULL
a Monte-Carlo simulation is required for computing critical values. Since a Monte-Carlo simulation lasts potentially much longer (up to several hours or days if the number of observations is in the millions) than the main calculations, this package offers multiple possibilities for saving and loading the simulations. Progress of a simulation can be reported by the argument messages
which can be specified in …
and is explained in the documentation of monteCarloSimulation
. Each Monte-Carlo simulation is specific to the number of observations, the parametric family (including certain parameters, see parametricFamily) and the interval system, and for simulations of class "MCSimulationMaximum"
, additionally, to the set of lengths and the used penalty. Monte-Carlo simulations can also be performed for a (slightly) larger number of observations nq
in …
and explained in the documentation of critVal
, which avoids extensive resimulations for only a little bit varying number of observations. Simulations can either be saved in the workspace in the variable critValStepRTab
or persistently on the file system for which the package R.cache
is used. Moreover, storing in and loading from variables and RDS files is supported. Finally, a pre-simulated collection of simulations can be accessed by installing the package stepRdata
available from http://www.stochastik.math.uni-goettingen.de/stepRdata_1.0-0.tar.gz. The simulation, saving and loading can be controlled by the argument option
which can be specified in …
and is explained in the documentation of critVal
. By default simulations will be saved in the workspace and on the file system. For more details and for how simulation can be removed see Section Simulating, saving and loading of Monte-Carlo simulations in critVal
.
Frick, K., Munk, A., Sieling, H. (2014) Multiscale change-point inference. With discussion and rejoinder by the authors. Journal of the Royal Statistical Society, Series B 76(3), 495--580.
Pein, F., Sieling, H., Munk, A. (2017) Heterogeneous change point inference. Journal of the Royal Statistical Society, Series B, 79(4), 1207--1227.
critVal
, penalty
, parametricFamily
, intervalSystem
, monteCarloSimulation
# NOT RUN {
# generate random observations
y <- c(rnorm(50), rnorm(50, 1))
x <- seq(0.01, 1, 0.01)
plot(x, y, pch = 16, col = "grey30", ylim = c(-3, 4))
# computation of SMUCE and its confidence statements
fit <- stepFit(y, x = x, alpha = 0.5, jumpint = TRUE, confband = TRUE)
lines(fit, lwd = 3, col = "red", lty = "22")
# confidence intervals for the change-point locations
points(jumpint(fit), col = "red")
# confidence band
lines(confband(fit), lty = "22", col = "darkred", lwd = 2)
# higher significance level for larger detection power, but less confidence
stepFit(y, x = x, alpha = 0.99, jumpint = TRUE, confband = TRUE)
# smaller significance level for the small risk that the number of
# change-points is overestimated with probability not more than 5%,
# but smaller detection power
stepFit(y, x = x, alpha = 0.05, jumpint = TRUE, confband = TRUE)
# }
# NOT RUN {
# different interval system, lengths, penalty and given parameter sd
stepFit(y, x = x, alpha = 0.5, intervalSystem = "dyaLen",
lengths = c(1L, 2L, 4L, 8L), penalty = "weights",
weights = c(0.4, 0.3, 0.2, 0.1), sd = 0.5,
jumpint = TRUE, confband = TRUE)
# }
# NOT RUN {
# with given q
identical(stepFit(y, x = x, q = critVal(100L, alpha = 0.5),
jumpint = TRUE, confband = TRUE), fit)
identical(stepFit(y, x = x, q = critVal(100L, alpha = 0.5, output = "value"),
jumpint = TRUE, confband = TRUE), fit)
# }
# NOT RUN {
# the above calls saved and (attempted to) load Monte-Carlo simulations and
# simulated them for nq = 128 observations
# in the following call no saving, no loading and simulation for n = 100
# observations is required, progress of the simulation will be reported
stepFit(y, x = x, alpha = 0.5, jumpint = TRUE, confband = TRUE,
messages = 1000L, options = list(simulation = "vector",
load = list(), save = list()))
# with given stat to compute q
stat <- monteCarloSimulation(n = 128L)
identical(stepFit(y, x = x, alpha = 0.5, stat = stat,
jumpint = TRUE, confband = TRUE),
stepFit(y, x = x, alpha = 0.5, jumpint = TRUE, confband = TRUE,
options = list(load = list())))
# }
Run the code above in your browser using DataLab