
cpss (version 0.0.2)

cpss.lm: Detecting changes in linear models

Description

Detecting changes in linear models

Usage

cpss.lm(
  formula,
  data = NULL,
  algorithm = "BS",
  dist_min = floor(log(n)),
  ncps_max = ceiling(n^0.4),
  pelt_pen_val = NULL,
  pelt_K = 0,
  wbs_nintervals = 500,
  criterion = "CV",
  times = 2
)

Value

cpss.lm returns an object of the S4 class "cpss", which collects the data and information required for further change-point analyses and summaries. See cpss.custom.
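
In practice, the fitted object is typically inspected with the generics used in the Examples section below; a minimal sketch, assuming a fitted object res as constructed there:

summary(res)   # estimated change-point locations
est <- coef(res)
est$coef       # segment-wise regression coefficients
est$sigma      # segment-wise error standard deviations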

Arguments

formula

a formula object describing the change-point model to be fitted.

data

an optional data frame, list or environment containing the variables in the model.

algorithm

a character string specifying the change-point search algorithm; one of four state-of-the-art candidates, "SN" (segment neighborhood), "BS" (binary segmentation), "WBS" (wild binary segmentation), and "PELT" (pruned exact linear time). Illustrative calls combining these options with the other arguments are sketched after this list.

dist_min

an integer indicating the minimum distance between two successive candidate change-points, with a default value of floor(log(n)).

ncps_max

an integer indicating the maximum number of change-points searched for, with a default value of ceiling(n^0.4).

pelt_pen_val

a numeric vector specifying the collection of candidate values of the penalty if the "PELT" algorithm is used.

pelt_K

a numeric value used to adjust the pruning tactic; it is usually taken to be 0 when the negative log-likelihood is used as the cost. More details can be found in Killick et al. (2012).

wbs_nintervals

an integer indicating the number of random intervals drawn in the "WBS" algorithm, with a default value of 500.

criterion

a character string indicating which model selection criterion is used: "cross-validation" ("CV") or "multiple-splitting" ("MS").

times

an integer indicating how many sample splits should be performed; if the "CV" criterion is used, it should be set to 2.
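
To illustrate how these arguments combine, the following sketch calls cpss.lm with the "PELT" and "WBS" searches and the "MS" criterion. It assumes the simulated y, x, and n from the Examples section; the penalty grid and the number of splits are illustrative choices, not recommendations.

# PELT search over an illustrative grid of candidate penalty values
res_pelt <- cpss.lm(
  formula = y ~ x,
  algorithm = "PELT",
  pelt_pen_val = c(log(n), 2 * log(n)),
  pelt_K = 0
)
# wild binary segmentation combined with the multiple-splitting criterion
res_wbs <- cpss.lm(
  formula = y ~ x,
  algorithm = "WBS",
  wbs_nintervals = 500,
  criterion = "MS",
  times = 10
)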

References

Killick, R., Fearnhead, P., and Eckley, I. A. (2012). Optimal Detection of Changepoints With a Linear Computational Cost. Journal of the American Statistical Association, 107(500):1590–1598.

See Also

cpss.glm

Examples

library("cpss")
set.seed(666)
n <- 400
tau <- c(80, 200, 300)
tau_ext <- c(0, tau, n)
be <- list(c(0, 1), c(1, 0.5), c(0, 1), c(-1, 0.5))
seg_len <- diff(c(0, tau, n))
x <- rnorm(n)
mu <- lapply(seq(1, length(tau) + 1), function(k) {
  be[[k]][1] + be[[k]][2] * x[(tau_ext[k] + 1):tau_ext[k + 1]]
})
mu <- do.call(c, mu)
sig <- unlist(lapply(seq(1, length(tau) + 1), function(k) {
  rep(be[[k]][2], seg_len[k])
}))
y <- rnorm(n, mu, sig)
res <- cpss.lm(
  formula = y ~ x,
  algorithm = "BS",
  dist_min = 5, ncps_max = 10
)
summary(res)
# 80  202  291
coef(res)
# $coef
#             [,1]      [,2]        [,3]       [,4]
# [1,] -0.00188792 1.0457718 -0.03963209 -0.9444813
# [2,]  0.91061557 0.6291965  1.20694409  0.4410036
#
# $sigma
# [1] 0.8732233 0.4753216 0.9566516 0.4782329
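
The call above picks up y and x from the workspace; an equivalent call can supply them through the data argument instead (sim_dat is a hypothetical data frame name used only for illustration):

sim_dat <- data.frame(y = y, x = x)
res_df <- cpss.lm(
  formula = y ~ x,
  data = sim_dat,
  algorithm = "BS",
  dist_min = 5, ncps_max = 10
)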
