This function selects the number of breakpoints of the segmented relationship according to an information criterion (BIC or AIC) or sequential hypothesis testing.
selgmented(olm, seg.Z, alpha = 0.05, type = c("score", "davies", "bic", "aic"),
control = seg.control(), refit=TRUE, stop.if=6, return.fit = TRUE,
bonferroni = FALSE, Kmax = 2, msg = TRUE, plot.ic = FALSE, th = NULL)
The returned object depends on the argument return.fit. If FALSE, the returned object is a list with some information on the compared models (e.g. the BIC values); otherwise it is a classical segmented object with the component selection.psi, which includes the aforementioned information. See segmented for details.
olm: A starting lm or glm object, or a simple numerical vector representing the response variable.
seg.Z: A one-sided formula for the segmented variable. Only one term can be included, and it can be omitted if olm includes just one covariate.
alpha: The fixed type I error probability.
type: Which criterion should be used? Options score and davies carry out sequential hypothesis testing with no more than 2 breakpoints (Kmax=2). Alternatively, the number of breakpoints can be selected via the BIC (or AIC) with virtually no upper bound for Kmax.
control: See seg.control.
refit: If TRUE, the final selected model is re-fitted using the arguments in control, typically with bootstrap restarting. Set refit=FALSE to speed up computation (possibly at the cost of accepting near-optimal estimates). Ignored if type='score' or type='davies'.
stop.if: An integer. If the last stop.if fits yield increasing AIC/BIC values, the search is interrupted. Ignored if type='score' or type='davies'.
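One reading of this stopping rule can be sketched in base R. The helper should_stop is hypothetical (it is not part of the segmented package), and the sketch assumes that "increasing" means each of the last stop.if fits has a strictly higher criterion value than the fit before it; the package may implement the check differently.

```r
# Hypothetical helper, not part of 'segmented': decide whether the search
# over the number of breakpoints should be interrupted.
# ic      : information criterion values for 0, 1, 2, ... breakpoints, in order
# stop.if : number of consecutive increases that triggers the stop (assumption)
should_stop <- function(ic, stop.if = 6) {
  # Not enough fits yet to observe stop.if consecutive increases
  if (length(ic) <= stop.if) return(FALSE)
  # Look at the last stop.if fits plus the one preceding them
  recent <- tail(ic, stop.if + 1)
  # Stop only if every step in that window went up
  all(diff(recent) > 0)
}

should_stop(c(100, 90, 91, 92, 93), stop.if = 3)  # three increases in a row
should_stop(c(100, 90, 91, 89, 93), stop.if = 3)  # a decrease breaks the run
```

A smaller stop.if ends the search sooner but risks missing a later improvement in the criterion.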
return.fit: If TRUE, the fitted model (with the number of breakpoints selected according to type) is returned.
bonferroni: If TRUE, the Bonferroni correction is employed, i.e. alpha/Kmax is always taken as the threshold value for rejection. If FALSE, alpha is used in the second level of hypothesis testing.
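The thresholds implied by this description can be written out as a small base R sketch. The helper level_thresholds is hypothetical, and the sketch assumes that the first level of testing uses alpha/Kmax in both cases (inferred from the word "always" above); only the second level changes with bonferroni.

```r
# Hypothetical helper, not part of 'segmented': rejection thresholds for the
# two levels of sequential testing (Kmax = 2 with the Score or Davies test).
level_thresholds <- function(alpha = 0.05, Kmax = 2, bonferroni = FALSE) {
  first <- alpha / Kmax                       # assumed for the first level
  second <- if (bonferroni) alpha / Kmax else alpha
  c(first = first, second = second)
}

level_thresholds(0.05, 2, bonferroni = TRUE)   # both levels use alpha/Kmax
level_thresholds(0.05, 2, bonferroni = FALSE)  # second level uses alpha
```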
Kmax: The maximum number of breakpoints being tested. If type='bic' or type='aic', any integer value can be specified; otherwise at most Kmax=2 breakpoints can be tested via the Score or Davies statistics.
msg: If FALSE, the final fit is returned silently with the selected number of breakpoints; otherwise a message with information about the selection procedure (e.g. the BIC values) is printed.
plot.ic: If TRUE, the information criterion values are plotted against the number of breakpoints. Ignored if type='score' or type='davies'.
th: When a large number of breakpoints is being tested, two estimated breakpoints may end up too close to each other, so that only one can be retained. Thus, if the difference between two breakpoints is less than or equal to th, one (the first) is deleted. Ignored if type='score' or type='davies'. Of course, th depends on the x scale: integers, like 5 or 10, are appropriate if the covariate is the observation index. The default (NULL) means th=diff(range(x))/100.
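As an illustration of the th rule, the following base R sketch computes the default threshold and drops the first of any two breakpoints that are too close. The helper drop_close_breakpoints is hypothetical and only mirrors the description above; it is not the package's internal code.

```r
# Hypothetical helper, not part of 'segmented': remove the first of any two
# adjacent breakpoints whose distance is less than or equal to th.
drop_close_breakpoints <- function(psi, x, th = NULL) {
  # Default threshold as described: one hundredth of the covariate range
  if (is.null(th)) th <- diff(range(x)) / 100
  psi <- sort(psi)
  # Keep a breakpoint only if the next one is farther away than th;
  # the last breakpoint is always kept
  keep <- c(diff(psi) > th, TRUE)
  psi[keep]
}

x <- 1:100                                   # covariate is the observation index
drop_close_breakpoints(c(10, 10.5, 40), x)   # default th = 0.99 here
drop_close_breakpoints(c(10, 14, 40), x, th = 5)
```

With x running over 1 to 100 the default threshold is 0.99, so an integer such as 5 merges breakpoints more aggressively.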
Vito Muggeo
The function calls segmented, pscore.test or davies.test as appropriate to select the 'optimal' number of breakpoints 0,1,...,Kmax. If type='bic' or 'aic', the procedure stops if the last stop.if fits have increasing values of the information criterion.
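A minimal usage sketch follows, assuming the segmented package is installed. The simulated data and the object names (dat, fit.lm, fit.score, sel.bic) are illustrative only; the arguments are those listed in the usage above.

```r
library(segmented)

# Simulated data with two true breakpoints (at x = 35 and x = 70)
set.seed(12)
x <- 1:100
y <- 2 + 1.5 * pmax(x - 35, 0) - 1.5 * pmax(x - 70, 0) + rnorm(100, 0, 2)
dat <- data.frame(x = x, y = y)

# Starting linear model; seg.Z can be omitted because there is one covariate
fit.lm <- lm(y ~ x, data = dat)

# Sequential hypothesis testing via the Score test (at most 2 breakpoints)
fit.score <- selgmented(fit.lm, type = "score", msg = FALSE)

# BIC-based selection with up to 3 breakpoints; return only the comparison
# information and skip the final refit to save time
sel.bic <- selgmented(fit.lm, type = "bic", Kmax = 3, refit = FALSE,
                      return.fit = FALSE, msg = FALSE)
```

Setting msg = TRUE (the default) would additionally print the information about the selection procedure.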
Muggeo V (2020). Selecting number of breakpoints in segmented regression: implementation in the R package segmented. https://www.researchgate.net/publication/343737604
segmented, pscore.test, davies.test