anomaly (version 4.0.2)

scapa.mv: Online detection of multivariate anomalous segments and points using SMVCAPA.

Description

This function implements SMVCAPA from Fisch et al. (2019) in an as-if-online way. It detects potentially lagged collective anomalies as well as point anomalies in streaming data. The runtime scales linearly (up to logarithmic factors) in ncol(x), max_lag, and max_seg_len. This version of capa.mv defaults to transform = tierney, which uses sequential estimates to transform the data prior to analysis. It also returns an S4 class that allows the results to be postprocessed as if the data had been analysed in an online fashion.

Usage

scapa.mv(
  x,
  beta = NULL,
  beta_tilde = NULL,
  type = "meanvar",
  min_seg_len = 10,
  max_seg_len = Inf,
  max_lag = 0,
  transform = tierney
)

Arguments

x

A numeric matrix with n rows and p columns containing the data which is to be inspected. The time series data classes ts, xts, and zoo are also supported.

beta

A numeric vector of length p, giving the marginal penalties. If type = "meanvar", or if type = "mean" and max_lag > 0, it defaults to the penalty regime 2' described in Fisch, Eckley and Fearnhead (2019). If type = "mean" and max_lag = 0, it defaults to the pointwise minimum of the penalty regimes 1, 2, and 3 in Fisch, Eckley and Fearnhead (2019).

beta_tilde

A numeric constant indicating the penalty for adding an additional point anomaly. It defaults to 3 log(np), where n and p are the number of rows and columns of x.
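
As a quick illustration of the default point-anomaly penalty, the following base R sketch evaluates 3 log(np) for a data matrix. It assumes that log denotes the natural logarithm. The matrix dimensions and variable names are chosen purely for illustration and are not part of the package.

```r
# Illustrative data: n = 500 observations of p = 4 components (hypothetical)
set.seed(1)
x <- matrix(rnorm(500 * 4), nrow = 500, ncol = 4)

n <- nrow(x)
p <- ncol(x)

# Default penalty for a point anomaly as described above,
# assuming the natural logarithm is intended
beta_tilde_default <- 3 * log(n * p)
beta_tilde_default
```

As with any penalised approach, supplying a larger beta_tilde than this default should make point anomalies harder to flag, while a smaller value should make the method more sensitive to them.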

type

A string indicating which type of deviations from the baseline are considered. Can be "meanvar" for collective anomalies characterised by joint changes in mean and variance (the default), "mean" for collective anomalies characterised by changes in mean only, or "robustmean" for collective anomalies characterised by changes in mean only which can be polluted by outliers.

min_seg_len

An integer indicating the minimum length of epidemic changes. It must be at least 2 and defaults to 10.

max_seg_len

An integer indicating the maximum length of epidemic changes. It must be at least min_seg_len and defaults to Inf.

max_lag

A non-negative integer indicating the maximum start or end lag. Default value is 0.

transform

A function used to transform the data prior to analysis by scapa.mv. This can, for example, be used to compensate for the effects of autocorrelation in the data. Importantly, the untransformed data remains available for postprocessing results obtained using scapa.mv. The package includes one such function, tierney (the default), but a user-defined (ideally sequential) function can be supplied instead.
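
A user-defined sequential transform could look roughly like the base R sketch below. It is not the tierney method. It is a simple running standardisation in which each row is centred and scaled using only the observations seen up to that row. The function name sequential_standardise is hypothetical, and the sketch assumes that a transform receives the data as a numeric matrix and must return a numeric matrix of the same dimensions.

```r
# Hypothetical user-defined transform (not part of the anomaly package):
# standardise each observation using the running mean and running sample
# standard deviation of its column, computed from rows 1..i only.
sequential_standardise <- function(X) {
  X <- as.matrix(X)
  k <- seq_len(nrow(X))

  # Running first and second moments per column; dividing by k recycles
  # down the rows, so row i is divided by i
  running_mean <- apply(X, 2, cumsum) / k
  running_sq   <- apply(X^2, 2, cumsum) / k

  # Running sample variance (denominator i - 1), clipped at zero
  running_var <- pmax(running_sq - running_mean^2, 0) * k / pmax(k - 1, 1)

  Z <- (X - running_mean) / sqrt(running_var)
  Z[!is.finite(Z)] <- 0  # e.g. the first row, where the variance is zero
  Z
}

# Small demonstration on simulated data
set.seed(42)
demo_x <- cbind(rnorm(200, mean = 5, sd = 2), rnorm(200, mean = -1, sd = 0.5))
demo_z <- sequential_standardise(demo_x)
dim(demo_z)
```

Under the stated assumption about the interface, such a function would be supplied as scapa.mv(x, transform = sequential_standardise). Because each row depends only on earlier rows, the transformed values do not change as new data arrive, which is the property the "ideally sequential" recommendation refers to.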

Value

An S4 class of type scapa.mv.class.

References

2019MVCAPAanomaly

alex2020realanomaly

Examples

# NOT RUN {
library(anomaly)

### generate some multivariate data

set.seed(2018)
x1 = rnorm(500)
x2 = rnorm(500)
x3 = rnorm(500)
x4 = rnorm(500)

### Add two (lagged) collective anomalies

x1[151:200] = x1[151:200]+2
x2[171:200] = x2[171:200]+2
x3[161:190] = x3[161:190]-3

x1[351:390] = x1[351:390]+2
x3[351:400] = x3[351:400]-3
x4[371:400] = x4[371:400]+2

### Add point anomalies

x4[451] = x4[451]*max(1,abs(1/x4[451]))*5
x4[100] = x4[100]*max(1,abs(1/x4[100]))*5
x2[50] = x2[50]*max(1,abs(1/x2[50]))*5

my_x = cbind(x1,x2,x3,x4)

### Now apply SMVCAPA in an as-if-online way

res<-scapa.mv(my_x,max_lag=20,type="mean")

### Examine the output at different times and see how the results are updated:

plot(res,epoch=155)
plot(res,epoch=170)
plot(res,epoch=210)

# }
