oce (version 0.1-81)

despike: Remove spikes from a time series

Description

Remove spikes from a time series

Usage

despike(x, method=1, n=4, k=7, physical.range)

Arguments

x
a vector
method
number indicating the method; see Details.
n
number of standard-deviation increments to tolerate
k
length of running median used in algorithm
physical.range
optional two-element vector holding the smallest and largest physically-realistic values. (For example, for water temperature, one might use c(-3,101).)

Value

  • A new vector that is identical to the original one, except that spikes and unphysical values are replaced with NA.

Details

The method identifies spikes by statistical deviation from a smoothed form of the series.

The first step is to construct a gapless, physically-realistic series that has no missing values and no values outside the physical range (if physical.range is given). Any missing or out-of-range values are replaced with the overall median.

The next step is to create a smoothed version of this series. If method=1, a running median is used, calculated with runmed, with the running length given by k. If method=2, smooth is used.
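As a rough illustration of the two smoothers named above, the following sketch applies runmed and smooth (both from base R's stats package) to a small made-up series containing one spike. The vector y and the names sm1 and sm2 are assumptions for illustration only; the sketch does not show how despike calls these functions internally.

```r
# Made-up series that alternates between 1 and 2, with a spike at position 7
y <- c(1, 2, 1, 2, 1, 2, 10, 2, 1, 2, 1, 2, 1)

# Running median with window length k = 7 (the default k in the Usage line)
sm1 <- as.numeric(runmed(y, k = 7))

# Tukey's running-median smoother with its default settings
sm2 <- as.numeric(smooth(y))

# Compare the original series with the two smoothed versions
print(rbind(y = y, runmed = sm1, smooth = sm2))
```

Both smoothers pull the isolated spike back toward its neighbours, which is what makes the difference between the series and its smoothed form large at that point.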

Finally, the difference between this smoothed series and the gapless, physically-realistic series is calculated. Any spots at which this difference exceeds its mean value by n standard deviations are flagged as spikes.
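The three steps described above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the oce source code: the function name despike_sketch is hypothetical, the fill value is assumed to be the median of the valid values, and "exceeds its mean value by n standard deviations" is assumed to mean an absolute deviation in either direction.

```r
# Hypothetical helper (not part of oce) that follows the steps in Details.
despike_sketch <- function(x, method = 1, n = 4, k = 7, physical.range = NULL) {
  # Step 1: mark missing and (optionally) out-of-range values,
  # then fill them with the median of the remaining values (assumption).
  bad <- is.na(x)
  if (!is.null(physical.range)) {
    bad <- bad | (!is.na(x) & (x < physical.range[1] | x > physical.range[2]))
  }
  filled <- x
  filled[bad] <- median(x[!bad])

  # Step 2: smooth the filled series with a running median or with smooth().
  smoothed <- if (method == 1) {
    as.numeric(runmed(filled, k = k))
  } else {
    as.numeric(smooth(filled))
  }

  # Step 3: flag points whose difference from the smoothed series lies more
  # than n standard deviations from the mean difference (assumed two-sided).
  d <- filled - smoothed
  spike <- abs(d - mean(d)) > n * sd(d)

  # Return the original series with spikes and unphysical values set to NA.
  out <- x
  out[bad | spike] <- NA
  out
}

# Usage: a random series with one large spike in the middle
set.seed(1)
y <- rnorm(100)
y[50] <- 10
yy <- despike_sketch(y)
which(is.na(yy))
```

Note that a single large spike inflates the standard deviation of the difference series, so a large n can leave small spikes unflagged.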

Examples

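library(oce)
n <- 100
x <- 1:n
y <- rnorm(n=n)
y[n/2] <- 10                    # insert a spike of about 10 standard deviations
yy <- despike(y)                # spikes are replaced with NA
plot(x, y, type='l')
spike <- is.na(yy)
points(x[spike], y[spike], col="red", cex=3)   # circle the flagged points in red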