diss.CID: Complexity-Invariant Distance Measure For Time Series

Description

Computes the Euclidean distance between two time series, corrected by a factor derived from the complexity estimate of each series.

Usage

diss.CID(x, y)

Arguments

x: Numeric vector containing the first of the two time series.

y: Numeric vector containing the second of the two time series.

Value

The computed dissimilarity.

Details

This distance is defined as $$CID(x,y) = ED(x,y) \times CF(x,y)$$ where $ED(x,y)$ is the Euclidean distance and $CF(x,y)$ is a complexity correction factor defined as: $$ CF(x,y) = \frac{\max(CE(x), CE(y))}{\min(CE(x), CE(y))} $$ and $CE(x)$ is a complexity estimate of a time series $x$.

Because $CF(x,y) \geq 1$, diss.CID increases the distance between series with different complexities. If the series have the same complexity estimate, the distance degenerates to the Euclidean distance.

The complexity estimate is defined in diss.CID as: $$ CE(x) = \sqrt{ \sum_{t=1}^{n-1} (x_{t+1} - x_t)^2 } $$ where $n$ is the length of the series.
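To make the formulas concrete, the following is a minimal base-R sketch that computes the same quantities by hand. The helper names `ce` and `cid_manual` are hypothetical and are not part of TSclust; the sketch assumes two series of equal length, neither of which is constant (a constant series has a complexity estimate of zero, which would make the correction factor undefined).

```r
# Hypothetical helpers for illustration only; not the TSclust implementation.

# Complexity estimate CE(x): root of the summed squared successive differences.
ce <- function(x) sqrt(sum(diff(x)^2))

# CID(x, y) = ED(x, y) * CF(x, y), following the definitions above.
cid_manual <- function(x, y) {
  stopifnot(length(x) == length(y))
  ed <- sqrt(sum((x - y)^2))                    # Euclidean distance ED(x, y)
  cf <- max(ce(x), ce(y)) / min(ce(x), ce(y))   # correction factor CF(x, y)
  ed * cf
}

x <- c(0, 1, 0, 1, 0)   # CE(x) = sqrt(4)  = 2
y <- c(0, 2, 0, 2, 0)   # CE(y) = sqrt(16) = 4
cid_manual(x, y)        # ED = sqrt(2), CF = 2, so CID = 2 * sqrt(2)
```

Here the Euclidean distance alone would be $\sqrt{2}$; the correction factor of 2 doubles it because `y` is twice as complex as `x` under this estimate.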

References

Batista, G. E., Wang, X., & Keogh, E. J. (2011). A Complexity-Invariant Distance Measure for Time Series. In SDM (Vol. 31, p. 32).

Montero, P., & Vilar, J. A. (2014). TSclust: An R Package for Time Series Clustering. Journal of Statistical Software, 62(1), 1-43. http://www.jstatsoft.org/v62/i01/.

Examples

# NOT RUN {
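library(TSclust)

n <- 100
# generate sample series: white noise and a Wiener process (random walk)
x <- rnorm(n)
y <- cumsum(rnorm(n))

# CID dissimilarity between the two series
diss.CID(x, y)

# pairwise CID dissimilarities for a set of series stored as matrix rows
z <- rnorm(n)
w <- cumsum(rnorm(n))
series <- rbind(x, y, z, w)
diss(series, "CID")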


# }
