Implements the BARD (Bayesian Abnormal Region Detector) procedure of Bardwell and Fearnhead (2017). BARD is a fully Bayesian inference procedure which is able to give measures of uncertainty about the number and location of anomalous regions. It uses negative binomial prior distributions on the lengths of anomalous and non-anomalous regions as well as a uniform prior for the means of anomalous regions. Inference is conducted by solving a set of recursions. To reduce computational and storage costs a resampling step is included.
bard(
x,
p_N = 1/(nrow(x) + 1),
p_A = 5/nrow(x),
k_N = 1,
k_A = (5 * p_A)/(1 - p_A),
pi_N = 0.9,
paffected = 0.05,
lower = 2 * sqrt(log(nrow(x))/nrow(x)),
upper = max(x),
alpha = 1e-04,
h = 0.25
)
An instance of the S4 object of type .bard.class
containing the data x
, procedure parameter values, and the results.
A numeric matrix with n rows and p columns containing the data which is to be inspected. The time series data classes ts, xts, and zoo are also supported.
Hyper-parameter of the negative binomial distribution for the length of non-anomalous segments (probability of success). Defaults to \(\frac{1}{n+1}.\)
Hyper-parameter of the negative binomial distribution for the length of anomalous segments (probability of success). Defaults to \(\frac{5}{n}.\)
Hyper-parameter of the negative binomial distribution for the length of non-anomalous segments (size). Defaults to 1.
Hyper-parameter of the negative binomial distribution for the length of anomalous segments (size). Defaults to \(\frac{5p_A}{1- p_A}.\)
Probability that an anomalous segment is followed by a non-anomalous segment. Defaults to 0.9.
Proportion of the variates believed to be affected by any given anomalous segment. Defaults to 5%. This parameter is relatively robust to being mis-specified and is studied empirically in Section 5.1 of bardwell2017;textualanomaly.
The lower limit of the the prior uniform distribution for the mean of an anomalous segment \(\mu\). Defaults to \(2\sqrt{\frac{\log(n)}{n}}.\)
The upper limit of the prior uniform distribution for the mean of an anomalous segment \(\mu\). Defaults to the largest value of x.
Threshold used to control the resampling in the approximation of the posterior distribution at each time step. A sensible default is 1e-4. Decreasing alpha increases the accuracy of the posterior distribution but also increases the computational complexity of the algorithm.
The step size in the numerical integration used to find the marginal likelihood.
The quadrature points are located from lower
to upper
in steps of h
. Defaults to 0.25.
Decreasing this parameter increases the accuracy of the calculation for the marginal likelihood but increases computational complexity.
This function gives certain default hyper-parameters for the two segment length distributions. We chose these to be quite flexible for a range of problems. For non-anomalous segments a geometric distribution was selected having an average segment length of \(n\) with the standard deviation being of the same order. For anomalous segments we chose parameters that gave an average length of 5 and a variance of \(n\). These may not be suitable for all problems and the user is encouraged to tune these parameters.
bardwell2017anomaly
JSS-anomaly-paper-finalanomaly
sampler
library(anomaly)
data(simulated)
# run bard
bard.res<-bard(sim.data, alpha = 1e-3, h = 0.5)
sampler.res<-sampler(bard.res)
collective_anomalies(sampler.res)
# \donttest{
plot(sampler.res,marginals=TRUE)
# }
Run the code above in your browser using DataLab