washeR (version 0.1.3)

wash.out: Outlier detection for single or grouped time series

Description

This function flags anomalies (optionally with a graphical visualization) when a single time series shows a 'jump', or when that 'jump' differs too much from the jumps of similar time series in the same group.

Usage

wash.out(
  dati,
  graph = FALSE,
  linear_analysis = FALSE,
  val_test_limit = 5,
  save_out = FALSE,
  out_out = "out.csv",
  pdf_out = "out.pdf",
  r_out = 3,
  c_out = 2,
  first_line = 1,
  pace_line = 6
)

Value

Data frame of possible outliers, one triad per row. Each output record has the fields rows, time.2, series, y1, y2, y3, test(AV), AV, n, median(AV), mad(AV), madindex(AV), where time.2 is the time at the center of the triad y1, y2, y3; test(AV) is the value compared with val_test_limit (default 5) to detect an outlier; n is the number of observations in the group ....
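
The field names median(AV) and mad(AV) suggest a robust test: an index computed on each triad is compared with the median of that index and scaled by its median absolute deviation. This page does not define AV, so the sketch below uses a hypothetical jump index (the distance of the triad center from the mean of its two neighbours) purely to illustrate the idea. The functions jump_index and robust_test are illustrative names, not part of washeR.

```r
# Illustrative only: this jump index is an assumption, NOT washeR's AV formula.
jump_index <- function(y) {
  stopifnot(is.numeric(y), length(y) >= 3)
  n  <- length(y)
  y1 <- y[1:(n - 2)]
  y2 <- y[2:(n - 1)]
  y3 <- y[3:n]
  # Distance of the center of each triad from the mean of its neighbours
  y2 - (y1 + y3) / 2
}

robust_test <- function(av) {
  s <- mad(av)  # median absolute deviation (base R scales it by 1.4826)
  if (s == 0) stop("MAD is zero: the index has too little variability to scale")
  # Distance from the median, expressed in MAD units
  abs(av - median(av)) / s
}

y    <- c(10, 12, 11, 14, 13, 40, 15, 17, 16, 19, 18, 21)
av   <- jump_index(y)
test <- robust_test(av)

# Triad i is centered on observation i + 1; 5 mirrors the default val_test_limit
flagged <- which(test > 5) + 1L
flagged
```

With a threshold of 5, only the triad centered on the sixth observation (the value 40) is flagged. The triads on either side of it also score high, which is why a larger jump can flag several adjacent triads.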

Arguments

dati

data frame (grouped time series: phenomenon+date+group+values) or vector (single time series)

graph

logical value for graphical analysis (default=FALSE)

linear_analysis

logical value for linear analysis (default=FALSE)

val_test_limit

value for outlier detection sensitivity (default=5; max=10)

save_out

logical value for saving detected outliers (default=FALSE)

out_out

a character file name for saving outliers in csv form, delimited with ';' and using ',' as decimal separator (default="out.csv")

pdf_out

a character file name for saving the graphical analysis in a pdf file (default="out.pdf")

r_out

number of rows in the layout of graphs (default=3)

c_out

number of columns in the layout of graphs (default=2)

first_line

value for first dotted line in graphic analysis (default=1)

pace_line

value for pace in dotted line in graphic analysis (default=6)
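
The csv format described for out_out (';' as delimiter, ',' as decimal separator) is the convention used by base R's write.csv2 and read.csv2. As a minimal sketch, assuming the saved file follows that convention, it can be read back as shown below. The table fake_out and its column names are made up for illustration and are not the function's real output.

```r
# Hypothetical outlier table: column names and values are invented for this sketch.
fake_out <- data.frame(series = c("a", "b"), test.AV = c(5.2, 7.75))

path <- tempfile(fileext = ".csv")
write.csv2(fake_out, path, row.names = FALSE)  # ';' separators, ',' decimals

readLines(path)           # shows the raw semicolon-delimited text
back <- read.csv2(path)   # reads the same convention back into a data frame
str(back)

unlink(path)              # remove the temporary file
```

Reading such a file with read.csv instead would treat every line as a single column, because read.csv expects ',' as the delimiter and '.' as the decimal separator.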

Examples

## we can start with data without outliers but structured with co-movement between groups
data("dati")
## 1st column: phenomenon
## 2nd column: time, written as ordered numbers or strings
## 3rd column: group classification variable
## 4th column: values
str(dati)
#######################################
## a data frame without any outlier
#######################################
out=wash.out(dati)
out   ## empty data frame
length(out[,1])  ## no rows
## we can add two outliers
####  time=3 temperature value=0
dati[99,4]=  0
## ... and then for "rain" phenomenon!
####  time=3 rain value=37
dati[118,4]=  37
#######################################
##   data.frame with 2 fresh outliers
#######################################
out=wash.out(dati)
## each record is a triad (three consecutive terms) of a time series
## let's take a look at the anomalous time series
out
## ... the same but we save results in a file....
## If we don't specify a name, out.csv  is the default
out=wash.out(dati,save_out=TRUE,out_out="tabel_out.csv")
out
## raise val_test_limit from 5 to 10 (its maximum) to capture
##       only particularly anomalous outliers
out=wash.out(dati, val_test_limit = 10)
out
## graph=TRUE saves the plots to a pdf file (out.pdf by default)
out=wash.out(dati, val_test_limit = 10, graph=TRUE)
out
## besides the usual analysis by groups, we can also run the analysis
## of each single time series on its own (linear_analysis):
## outliers are saved in two csv files (out.csv and linout.csv)
## and graphs in two pdf files (out.pdf and linout.pdf)
out=wash.out(dati,val_test_limit=5,save_out=TRUE,linear_analysis=TRUE,graph=TRUE)
out
## out returns only the linear analysis...
## ... in this case we lose the co-movement information and we run the risk
##     of finding too much variance in a single time series
##     and of flagging outliers that are not very plausible
##########################################################
##  single time series analysis
##########################################################
data(ts)
str(ts)
sts= ts$dati
plot(sts,type="b",pch=20,col="red")
## a time series with some variability and an increasing trend
## sts is a vector, so linear analysis is the default
out=wash.out(sts)
out
## we find no outlier
out=wash.out(sts,val_test_limit=5,linear_analysis=TRUE,graph=TRUE)
out
## no outlier
## we can add a moderate outlier
sts[5]=sts[5]*2
plot(sts,type="b",pch=20,col="red")
out=wash.out(sts,val_test_limit=5)
out
## the test value is slightly above 5
out=wash.out(sts,val_test_limit=5,save_out=TRUE,graph=TRUE)
out
data(ts)
sts= ts$dati
sts[5]=sts[5]*3
## multiplying by 3 instead of 2 plants a more pronounced outlier
plot(sts,type="b",pch=20,col="blue")
out=wash.out(sts,val_test_limit=5,save_out=TRUE,graph=TRUE)
out
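## the washer procedure identifies three triads of outlying values
## clean up: remove every csv and pdf file in the working directory (portable)
unlink(list.files(pattern = "\\.(csv|pdf)$"))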
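
## To run the grouped analysis on your own data, the input has to follow the
## four-column layout described above for dati. The sketch below builds such a
## data frame from scratch. The name my_dati, the column names, and the values
## are hypothetical; the example comments above describe the columns by
## position only.

```r
times  <- 1:12
groups <- c("north", "center", "south")

# A shared seasonal shape plus a group-specific level:
# the three series co-move by construction
shape <- 10 + 5 * sin(2 * pi * times / 12)

my_dati <- data.frame(
  phenomenon = "temperature",                            # 1st column
  time       = rep(times, times = length(groups)),       # 2nd column
  group      = rep(groups, each = length(times)),        # 3rd column
  value      = rep(shape, times = length(groups)) +
               rep(c(0, 2, 4), each = length(times))     # 4th column
)
str(my_dati)

# With the washeR package installed, the call would then be:
# out <- washeR::wash.out(my_dati)
```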
