despike(x, reference=c("median","smooth", "trim"), n=4, k=7, min, max,
        replace=c("reference", "NA"))
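x: a vector of time-series values to be despiked.
reference: the type of reference time series used to detect spikes, one of "median", "smooth" or "trim".
n: the limit on differences between x and the reference time series, expressed as a multiple of the standard deviation of that difference; used for reference="median" or reference="smooth".
k: the length of the running median used with reference="median", and ignored for other values of reference.
min: the minimum non-spike value of x, used with reference="trim".
max: the maximum non-spike value of x, used with reference="trim".
replace: what to do with spike values, with "reference" indicating to replace them with the reference time series, and "NA" indicating to replace them with NA.

Spikes are identified with respect to a reference time series, and are replaced with the reference value or with NA, according to the value of replace.

For reference="median", the first step is to linearly interpolate across any gaps, i.e. any points at which is.na(x) is true. Then the reference time series is constructed using runmed as a running median of k elements. Then, the standard deviation of the difference between x and the reference is calculated. Any x values that differ from the reference by more than n times this standard deviation are considered to be spikes. If replace="reference", these x values are replaced with the reference series, and the resultant time series is returned. If replace="NA", the spikes are replaced with NA in the returned time series.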
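The steps just described for reference="median" can be sketched in base R. This is only an illustration under stated assumptions, not the package's own code: the helper name despike_median_sketch is hypothetical, and the use of approx() with rule=2 to fill gaps at the ends of the series is a guess at reasonable behaviour.

```r
# Hypothetical helper, not the package implementation: a base-R sketch of the
# reference="median" steps described above.
despike_median_sketch <- function(x, n = 4, k = 7, replace = c("reference", "NA")) {
    replace <- match.arg(replace)
    i <- seq_along(x)
    ok <- !is.na(x)
    # Step 1: linearly interpolate across gaps (assumption: rule = 2 extends
    # the nearest value into any gap at either end).
    filled <- approx(i[ok], x[ok], xout = i, rule = 2)$y
    # Step 2: reference series as a running median of k elements (k must be odd).
    ref <- as.numeric(runmed(filled, k = k))
    # Step 3: standard deviation of the difference from the reference.
    s <- sd(filled - ref)
    # Step 4: values more than n standard deviations from the reference are spikes.
    spike <- abs(filled - ref) > n * s
    out <- filled
    out[spike] <- if (replace == "reference") ref[spike] else NA
    out
}

# A smooth signal with one artificial spike at index 25.
y <- sin(seq(0, 2 * pi, length.out = 50))
y[25] <- 10
cleaned <- despike_median_sketch(y)                  # spike replaced by reference
flagged <- despike_median_sketch(y, replace = "NA")  # spike replaced by NA
```

Because the threshold is n times the standard deviation of the residual, a single large spike inflates that standard deviation; lowering n makes the test more sensitive.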
For reference="smooth", the processing is the same as for "median", except that smooth is used to calculate the reference time series.
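As a rough illustration of the "smooth" variant, the sketch below builds the reference with stats::smooth() (Tukey's running-median smoother, default kind "3RS3R") and then applies the same threshold test as for "median". The helper name despike_smooth_sketch is hypothetical, and filling NA values before smoothing is an assumption made here because smooth() does not accept missing values.

```r
# Hypothetical sketch in base R, not the package source: the reference series
# comes from stats::smooth() instead of runmed().
despike_smooth_sketch <- function(x, n = 4, replace = c("reference", "NA")) {
    replace <- match.arg(replace)
    i <- seq_along(x)
    ok <- !is.na(x)
    # smooth() rejects NA values, so gaps are interpolated first (assumption).
    filled <- approx(i[ok], x[ok], xout = i, rule = 2)$y
    ref <- as.numeric(smooth(filled))
    # Same test as for "median": more than n standard deviations from reference.
    spike <- abs(filled - ref) > n * sd(filled - ref)
    out <- filled
    out[spike] <- if (replace == "reference") ref[spike] else NA
    out
}

y <- sin(seq(0, 2 * pi, length.out = 50))
y[25] <- 10
spike_index <- which(is.na(despike_smooth_sketch(y, replace = "NA")))
```

Note that k has no role in this variant, since the window lengths are fixed by the smoother itself.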
For reference="trim", the reference time series is constructed by linear interpolation across any regions in which x<min or x>max. In this case, the value of n is ignored, and the return value either uses the reference time series for spikes, or NA, according to the value of replace.
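The "trim" rule is simple enough to sketch directly in base R. The helper name despike_trim_sketch is hypothetical and this is not the package's own code; leaving pre-existing NA values untouched is an assumption, since the text above does not say how they are handled.

```r
# Hypothetical sketch, not the package source: values outside [min, max] are
# spikes, and the reference is a linear interpolation across them.
despike_trim_sketch <- function(x, min, max, replace = c("reference", "NA")) {
    replace <- match.arg(replace)
    i <- seq_along(x)
    # A spike is any non-missing value below min or above max; n is not used.
    spike <- !is.na(x) & (x < min | x > max)
    good <- !is.na(x) & !spike
    # Reference series: interpolate linearly from the in-range values.
    ref <- approx(i[good], x[good], xout = i, rule = 2)$y
    out <- x
    out[spike] <- if (replace == "reference") ref[spike] else NA
    out
}

y <- c(0, 1, 2, 30, 4, 5, -20, 7)
trimmed <- despike_trim_sketch(y, min = -3, max = 10)
# trimmed is 0 1 2 3 4 5 6 7: both out-of-range values are interpolated over.
trimmed_na <- despike_trim_sketch(y, min = -3, max = 10, replace = "NA")
```

Unlike the other two methods, this one needs no statistics of the series, so it suits data with known physical bounds.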
library(oce) # provides despike()
n <- 50
x <- 1:n
y <- rnorm(n=n)
y[n/2] <- 10 # 10 standard deviations
plot(x, y, type='l')
# Overlay the result of each despiking method on the raw series
lines(x, despike(y), col='red')
lines(x, despike(y, reference="smooth"), col='darkgreen')
lines(x, despike(y, reference="trim", min=-3, max=3), col='blue')
legend("topright", lwd=1, col=c("black", "red", "darkgreen", "blue"),
       legend=c("raw", "median", "smooth", "trim"))