AnomalyDetection (version 1.0)

AnomalyDetectionTs: Anomaly Detection Using Seasonal Hybrid ESD Test

Description

A technique for detecting anomalies in seasonal univariate time series where the input is a series of <timestamp, count> pairs.

Usage

AnomalyDetectionTs(x, max_anoms = 0.1, direction = "pos", alpha = 0.05,
  only_last = NULL, threshold = "None", e_value = F, longterm = F,
  plot = F, y_log = F, xlabel = "", ylabel = "count", title = NULL)

Arguments

x

Time series as a two column data frame where the first column consists of the timestamps and the second column consists of the observations.
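As an illustration of the expected shape, the sketch below builds such a two-column data frame in base R. The data are simulated and the column names are arbitrary; that timestamps stored as POSIXct are accepted is an assumption here, not something this page states.

```r
# Hypothetical hourly counts for one week; the values are simulated, not real data.
set.seed(42)
timestamps <- seq(from = as.POSIXct("2024-01-01 00:00:00", tz = "UTC"),
                  by = "hour", length.out = 7 * 24)
counts <- rpois(length(timestamps), lambda = 100)

# First column: timestamps; second column: observations.
x <- data.frame(timestamp = timestamps, count = counts)
str(x)
```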

max_anoms

Maximum number of anomalies that S-H-ESD will detect, expressed as a fraction of the data (for example, 0.1 for 10% of the observations).

direction

Directionality of the anomalies to be detected. Options are: 'pos' | 'neg' | 'both'.

alpha

The level of statistical significance with which to accept or reject anomalies.
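To make the roles of max_anoms and alpha more concrete, here is a minimal sketch of the plain generalized ESD test from Rosner (1983), written in base R. It is not the package's implementation: it uses the mean and standard deviation, is two-sided only, and skips the seasonal decomposition that S-H-ESD applies first. The function name generalized_esd and the truncation of max_anoms to a count are assumptions made for illustration.

```r
# Plain generalized ESD (Rosner, 1983); illustrative only, not the S-H-ESD code.
generalized_esd <- function(values, max_anoms = 0.1, alpha = 0.05) {
  n <- length(values)
  k <- max(1, floor(max_anoms * n))   # upper bound on the number of outliers (assumed truncation)
  remaining <- values
  idx <- seq_len(n)                   # original positions of the remaining values
  removed <- integer(k)
  n_out <- 0
  for (i in seq_len(k)) {
    s <- sd(remaining)
    if (s == 0) break                 # no spread left, nothing more to test
    dev <- abs(remaining - mean(remaining))
    j <- which.max(dev)
    r_i <- dev[j] / s                 # test statistic for the i-th candidate
    # Critical value for the i-th step at significance level alpha.
    p <- 1 - alpha / (2 * (n - i + 1))
    t_crit <- qt(p, df = n - i - 1)
    lambda_i <- (n - i) * t_crit / sqrt((n - i - 1 + t_crit^2) * (n - i + 1))
    removed[i] <- idx[j]
    remaining <- remaining[-j]
    idx <- idx[-j]
    if (r_i > lambda_i) n_out <- i    # outlier count is the largest i with r_i > lambda_i
  }
  sort(removed[seq_len(n_out)])
}

# Simulated series with two injected extreme values at positions 20 and 70.
set.seed(7)
values <- rnorm(100, mean = 50, sd = 2)
values[c(20, 70)] <- c(90, 5)
found <- generalized_esd(values, max_anoms = 0.1, alpha = 0.05)
print(found)
```

A smaller alpha raises the critical values and so flags fewer points, while max_anoms only caps how many candidates are tested.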

only_last

Find and report anomalies only within the last day or hour of the time series. Options are: NULL | 'day' | 'hr'.

threshold

Only report positive-going anomalies above the specified threshold. Options are: 'None' | 'med_max' | 'p95' | 'p99'.

e_value

Add an additional column to the anoms output containing the expected value.

longterm

Increase anomaly detection efficacy for time series that are longer than a month. See Details below.

plot

A flag indicating whether a plot showing both the time series and the estimated anomalies, indicated by circles, should also be returned.

y_log

Apply log scaling to the y-axis. This helps with viewing plots that have extremely large positive anomalies relative to the rest of the data.

xlabel

X-axis label to be added to the output plot.

ylabel

Y-axis label to be added to the output plot.

title

Title for the output plot.

Value

The returned value is a list with the following components.

anoms

Data frame containing timestamps, values, and optionally expected values.

plot

A graphical object if plotting was requested by the user. The plot contains the estimated anomalies annotated on the input time series.

One can save anoms to a file in the following fashion: write.csv(<return list name>[["anoms"]], file=<filename>)

One can save plot to a file in the following fashion: ggsave(<filename>, plot=<return list name>[["plot"]])
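The following sketch shows the write.csv pattern end to end without calling the package. The list res is a hand-built stand-in for the returned value; its column names and values are assumptions for illustration only.

```r
# Stand-in for the list returned by AnomalyDetectionTs (hypothetical values).
res <- list(
  anoms = data.frame(
    timestamp = as.POSIXct(c("2024-01-02 03:00:00", "2024-01-05 14:00:00"), tz = "UTC"),
    anoms = c(250, 310)
  ),
  plot = NULL
)

# Write the anomalies to a temporary CSV file.
out_file <- tempfile(fileext = ".csv")
write.csv(res[["anoms"]], file = out_file, row.names = FALSE)

# Read the file back to confirm the round trip.
saved <- read.csv(out_file)
print(saved)
```

Passing row.names = FALSE keeps R's automatic row index out of the file, which is usually what downstream tools expect.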

Details

longterm

This option should be set when the input time series is longer than a month. The option enables the approach described in Vallis, Hochenbaum, and Kejariwal (2014).

threshold

Filter all negative anomalies and those anomalies whose magnitude is smaller than one of the specified thresholds, which include: the median of the daily max values (med_max), the 95th percentile of the daily max values (p95), and the 99th percentile of the daily max values (p99).
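The three threshold statistics can be sketched in base R as follows. This only illustrates the definitions given above on simulated data; how the package groups observations into days or computes percentiles internally is not stated here, so the use of as.Date and quantile is an assumption.

```r
# Simulated three days of hourly counts (hypothetical data).
set.seed(1)
ts <- seq(as.POSIXct("2024-01-01", tz = "UTC"), by = "hour", length.out = 72)
df <- data.frame(timestamp = ts, count = rpois(72, lambda = 50))

# Maximum count observed on each calendar day.
daily_max <- as.numeric(tapply(df$count, as.Date(df$timestamp, tz = "UTC"), max))

# Candidate thresholds derived from the daily maxima.
thresholds <- c(
  med_max = median(daily_max),
  p95 = unname(quantile(daily_max, 0.95)),
  p99 = unname(quantile(daily_max, 0.99))
)
print(thresholds)
```

Because all three values are derived from daily maxima, med_max is the most permissive filter and p99 the strictest.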

References

Vallis, O., Hochenbaum, J. and Kejariwal, A., (2014) "A Novel Technique for Long-Term Anomaly Detection in the Cloud", 6th USENIX Workshop on Hot Topics in Cloud Computing, Philadelphia, PA.

Rosner, B., (May 1983), "Percentage Points for a Generalized ESD Many-Outlier Procedure", Technometrics, 25(2), pp. 165-172.

See Also

AnomalyDetectionVec

Examples

# NOT RUN {
library(AnomalyDetection)
data(raw_data)
# Detect anomalies in both directions, capped at 2% of the observations:
AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', plot=TRUE)
# To detect only the anomalies on the last day, run the following:
AnomalyDetectionTs(raw_data, max_anoms=0.02, direction='both', only_last="day", plot=TRUE)
# }