# diss.PRED

##### Dissimilarity Measure Based on Nonparametric Forecast

Computes the dissimilarity between two time series as the L1 distance between the kernel estimators of their forecast densities at a pre-specified horizon.

##### Usage

```
diss.PRED(x, y, h, B=500, logarithm.x=FALSE, logarithm.y=FALSE,
differences.x=0, differences.y=0, plot=FALSE)
```

##### Arguments

- x
Numeric vector containing the first of the two time series.

- y
Numeric vector containing the second of the two time series.

- h
The horizon of interest, i.e. the number of steps ahead at which the prediction is evaluated.

- B
The number of bootstrap resamples.

- logarithm.x
Boolean. Specifies whether to transform series x by taking logarithms. When using the `diss` wrapper, use the `logarithms` argument instead. See details.

- logarithm.y
Boolean. Specifies whether to transform series y by taking logarithms. When using the `diss` wrapper, use the `logarithms` argument instead. See details.

- differences.x
Specifies the number of differences to apply to series x. When using the `diss` wrapper, use the `differences` argument instead. See details.

- differences.y
Specifies the number of differences to apply to series y. When using the `diss` wrapper, use the `differences` argument instead. See details.

- plot
If `TRUE`, plot the resulting forecast densities.

##### Details

The dissimilarity between the time series `x` and `y` is given by $$ d(x,y) = \int | f_{x,h}(u) - f_{y,h}(u) | \, du $$ where \(f_{x,h}\) and \(f_{y,h}\) are kernel density estimators of the forecast densities h steps ahead of `x` and `y`, respectively. The horizon of interest h is pre-specified by the user.
The kernel density estimators are based on B bootstrap replicates obtained with a resampling procedure that mimics the generating processes, which are assumed to follow an arbitrary autoregressive structure (parametric or non-parametric). The procedure is described in full in Vilar et al. (2010). Because of the bootstrap resampling, this function is computationally expensive.
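The idea can be sketched in base R. The snippet below is a simplified illustration only, not the procedure of Vilar et al. (2010) and not the code used by `diss.PRED`: it assumes a parametric AR(1) model fitted with `ar()`, a plain residual bootstrap, and trapezoidal integration on a common grid. The helper names `boot_forecast` and `l1_density_dist` are hypothetical.

```r
# Simplified sketch (assumptions: AR(1) model, residual bootstrap).
# boot_forecast() and l1_density_dist() are hypothetical helper names.

# Draw B bootstrap replicates of the h-steps-ahead value of series x
boot_forecast <- function(x, h, B = 500) {
  fit <- ar(x, order.max = 1, aic = FALSE, method = "yule-walker")
  phi <- fit$ar[1]                      # estimated AR(1) coefficient
  mu  <- fit$x.mean                     # estimated series mean
  res <- fit$resid[!is.na(fit$resid)]   # drop the leading NA residual
  res <- res - mean(res)                # centre the residuals
  replicate(B, {
    last <- x[length(x)]
    for (k in seq_len(h)) {
      # iterate the fitted recursion with a resampled residual
      last <- mu + phi * (last - mu) + sample(res, 1)
    }
    last
  })
}

# L1 distance between the kernel density estimates of two samples
l1_density_dist <- function(fx, fy, n = 512) {
  pad <- 3 * max(bw.nrd0(fx), bw.nrd0(fy))
  lo <- min(fx, fy) - pad
  hi <- max(fx, fy) + pad
  # evaluate both estimates on the same grid so they can be subtracted
  dx <- density(fx, from = lo, to = hi, n = n)
  dy <- density(fy, from = lo, to = hi, n = n)
  gap <- abs(dx$y - dy$y)
  step <- dx$x[2] - dx$x[1]
  sum((gap[-1] + gap[-n]) / 2) * step   # trapezoidal rule
}

set.seed(1)
x <- arima.sim(list(ar = 0.5), n = 200)
y <- arima.sim(list(ar = 0.5), n = 200) + 5
d <- l1_density_dist(boot_forecast(x, h = 3), boot_forecast(y, h = 3))
```

Since both densities integrate to one, a distance computed this way lies between 0 (identical forecast densities) and 2 (forecast densities with disjoint support).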

The procedure uses a bootstrap method that requires stationary time series. To support a wider range of time series, the method allows some transformations to be applied to the series before the bootstrap resampling. These transformations are inverted before the densities are calculated. The transformations allowed are logarithm and differencing.
The parameters `logarithm.x`, `logarithm.y`, `differences.x` and `differences.y` can be specified for this purpose.
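As a rough illustration of what these transformations and their inversion amount to, the base R sketch below applies a logarithm followed by differencing and then undoes both. It is not the package's internal code: `to_stationary` and `from_stationary` are hypothetical helper names, and the initial values needed to invert the differencing are stored explicitly.

```r
# Hypothetical helpers illustrating log + differencing and their inverses.

to_stationary <- function(x, logarithm = FALSE, differences = 0) {
  z <- if (logarithm) log(x) else x
  init <- numeric(0)
  for (k in seq_len(differences)) {
    init <- c(init, z[1])   # remember the first value at each differencing level
    z <- diff(z)
  }
  list(series = z, logarithm = logarithm, init = init)
}

from_stationary <- function(tr) {
  z <- tr$series
  # undo the differencing levels in reverse order
  for (k in rev(seq_along(tr$init))) {
    z <- diffinv(z, xi = tr$init[k])
  }
  if (tr$logarithm) exp(z) else z
}

x <- c(2, 3, 5, 8, 13, 21)   # strictly positive, so the logarithm is defined
tr <- to_stationary(x, logarithm = TRUE, differences = 2)
stopifnot(isTRUE(all.equal(from_stationary(tr), x)))
```

The logarithm is only defined for strictly positive series, which is why a series with non-positive values has to be shifted before it is log-transformed.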

If using the `diss` function with "PRED" `method`, the argument `logarithms` must be used instead of `logarithm.x` and `logarithm.y`. `logarithms` is a boolean vector specifying whether the logarithm transform should be taken for each one of the `series`. The argument `differences`, a numeric vector specifying the number of differences to apply to each one of the `series`, is used instead of `differences.x` and `differences.y`. The plot is also different, showing all the densities in the same plot.

##### Value

`diss.PRED` returns a list with the following components.

- The computed distance.

- A 2-column matrix (`dens.x`) with the forecast density of series `x`. The first column is the base (x) and the second column is the value (y) of the density.

- A 2-column matrix (`dens.y`) with the forecast density of series `y`. The first column is the base (x) and the second column is the value (y) of the density.

When used from the `diss` wrapper function, it returns a list with the following components.

- A `dist` object with the pairwise L1 distances between series.

- A list of 2-column matrices containing the densities of each series, in the same format as `dens.x` or `dens.y` of `diss.PRED`.
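A `dist` object can be passed directly to standard clustering functions. The sketch below does not call TSclust: it assumes three simulated samples stand in for bootstrap forecast replicates, builds the pairwise L1 distances between their kernel density estimates by hand, and feeds the result to `hclust`. The helper `l1_kde` and the sample names `s1`, `s2`, `s3` are hypothetical.

```r
# Hypothetical helper: L1 distance between two kernel density estimates
l1_kde <- function(a, b, n = 512) {
  rng <- range(a, b) + c(-1, 1) * 3 * max(bw.nrd0(a), bw.nrd0(b))
  da <- density(a, from = rng[1], to = rng[2], n = n)$y
  db <- density(b, from = rng[1], to = rng[2], n = n)$y
  step <- diff(rng) / (n - 1)
  sum(abs(da - db)) * step   # Riemann-sum approximation of the integral
}

set.seed(1)
# Simulated stand-ins for forecast replicates of three series
reps <- list(s1 = rnorm(300), s2 = rnorm(300, mean = 0.2), s3 = rnorm(300, mean = 4))

# Fill a symmetric matrix of pairwise distances
k <- length(reps)
D <- matrix(0, k, k, dimnames = list(names(reps), names(reps)))
for (i in seq_len(k - 1)) {
  for (j in (i + 1):k) {
    D[i, j] <- D[j, i] <- l1_kde(reps[[i]], reps[[j]])
  }
}

d <- as.dist(D)                       # same class as the wrapper's distance output
hc <- hclust(d, method = "average")   # hierarchical clustering on the distances
groups <- cutree(hc, k = 2)           # cut the tree into two clusters
```

Average linkage is an arbitrary choice here; any method that accepts a `dist` object, such as `pam` from the cluster package, can be used in the same way.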

##### References

Alonso, A.M., Berrendero, J.R., Hernandez, A. and Justel, A. (2006) Time series clustering based on forecast densities. *Comput. Statist. Data Anal.*, **51**, 762--776.

Vilar, J.A., Alonso, A.M. and Vilar, J.M. (2010) Non-linear time series clustering based on non-parametric forecast densities. *Comput. Statist. Data Anal.*, **54 (11)**, 2850--2865.

Montero, P. and Vilar, J.A. (2014) TSclust: An R Package for Time Series Clustering. *Journal of Statistical Software*, **62 (1)**, 1--43. http://www.jstatsoft.org/v62/i01/.

##### Examples

```
# NOT RUN {
library(TSclust)
set.seed(123)  # make the simulated series reproducible

x <- rnorm(100)
x <- x + abs(min(x)) + 1  # shift so all values are positive, as the logarithm transform requires
y <- rnorm(100)
z <- sin(seq(0, pi, length.out = 100))

## Compute the distance between x and y, differencing x once
diss.PRED(x, y, h = 6, logarithm.x = FALSE, logarithm.y = FALSE,
          differences.x = 1, differences.y = 0)

## Compute pairwise distances for use with clustering functions such as pam or hclust
diss(rbind(x, y, z), METHOD = "PRED", h = 3, B = 200,
     logarithms = c(TRUE, FALSE, FALSE), differences = c(1, 1, 2))
# }
```

*Documentation reproduced from package TSclust, version 1.2.4, License: GPL-2*