Bagidis (version 1.0)

semimetric.BAGIDIS: Computing the BAGIDIS semidistance between series

Description

Functions for computing the Bagidis semidistance between series of measurements.

Usage

semimetric.BAGIDIS(DATA1, DATA2 = DATA1, p = 2, wk = NULL, Param = 0.5, wdw = min(ncol(DATA1), 30), Evol = FALSE, Overlap = wdw - 1, method = c('TS', 'BD'))
semimetric.BAGIDIS.TS(DATA1, DATA2 = DATA1, p = 2, wk = NULL, Param = 0.5, wdw = min(ncol(DATA1), 30), Evol = FALSE, Overlap = wdw - 1)
semimetric.BAGIDIS.BD(Details1, Breakpoints1, Details2 = Details1, Breakpoints2 = Breakpoints1, NbSubseries = 1, p = 2, wk = NULL, Param = 0.5, Evol = FALSE)
BAGIDIS.dist(BUUHWE.out.1, BUUHWE.out.2, p = 2, wk = NULL, Param = 0.5)
BAGIDIS.dist.BD(Details1, Breakpoints1, Details2, Breakpoints2, p = 2, wk = NULL, Param = 0.5)

Arguments

DATA1
matrix containing the series to be compared, row by row, with the rows of DATA2. The semidistance is computed between each row of DATA1 and every row of DATA2. If only DATA1 is provided, DATA2 is taken to be equal to DATA1. We must have ncol(DATA1) = ncol(DATA2).
DATA2
Optional. Matrix containing the series to be compared, row by row, with the rows of DATA1. The semidistance is computed between each row of DATA1 and every row of DATA2. If only DATA1 is provided, DATA2 is taken to be equal to DATA1. We must have ncol(DATA1) = ncol(DATA2).
p
the kind of norm to be used for computing the partial distance in the B-D plane. Must be numeric or Inf.
wk
a vector of weights of length ncol(DATA1)-1. If not provided, wk = log(N+1-(1:N))/log(N+1) with N= ncol(DATA1)-1.
Param
the balance parameter between the differences along the breakpoint axis and along the detail axis. Param must be in [0, 1]. Param = 1 means that only breakpoint differences are taken into account; Param = 0 means that only detail differences are taken into account.
wdw
When distances are measured between "long" series, it can be advantageous to use a windowed semimetric. wdw encodes the length of the window within which the semimetric is computed between the subseries. By default, there is no windowing if the length of the series (= ncol(DATA1)) is smaller than or equal to 30, and a window length of 30 is used otherwise.
Evol
Logical. In case a windowing is applied, should the matrices of local (windowed) dissimilarities be returned? Default is FALSE.
Overlap
In case a windowing of the series is applied, Overlap determines how the subseries overlap each other. By default, a one-step-sliding distance is computed.
method
either 'TS' (default) or 'BD': the method for computing the matrix of semidistances in case of multiple series. Results are identical. 'TS' recomputes the BUUHWE transform for each pairwise comparison; 'BD' computes all signatures beforehand and stores them before computing the distances. 'TS' requires more time, 'BD' requires more storage. With method 'TS', computation time is affected by the number of rows in DATA1 and DATA2: if nrow(DATA1) > nrow(DATA2), the number of operations increases; conversely, if nrow(DATA2) > nrow(DATA1), memory usage increases.
BUUHWE.out.1
BUUHWE expansion of a series, as obtained from function BUUHWE.
BUUHWE.out.2
BUUHWE expansion of a second series, as obtained from function BUUHWE.
Details1
matrix containing the details of series out of a dataset DATA1 containing a set of series of identical length.
Breakpoints1
matrix containing the breakpoints of series out of a dataset DATA1 containing a set of series of identical length.
Details2
matrix containing the details of series out of a dataset DATA2 containing a set of series of the same length as those in DATA1.
Breakpoints2
matrix containing the breakpoints of series out of a dataset DATA2 containing a set of series of the same length as those in DATA1.
NbSubseries
in case an evolving (windowed) semidistance must be computed, NbSubseries gives the number of data measurements in a windowed segment.
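
The default weights stated for wk can be reproduced in base R. The following is a minimal sketch of that formula only, not the package's own code; default_wk is a hypothetical helper name introduced here for illustration.

```r
# Hypothetical helper (not part of the Bagidis package): default weights
# as described for `wk`, for series of length n_col = ncol(DATA1).
default_wk <- function(n_col) {
  N <- n_col - 1                      # one weight per rank 1..N
  log(N + 1 - (1:N)) / log(N + 1)
}

wk <- default_wk(10)
length(wk)     # 9 weights for series of length 10
round(wk, 3)   # weights decrease with the rank
```

With this formula the weights decrease with the rank, and the last one is log(1)/log(N+1) = 0, so the last rank does not contribute when the default weights are used.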

Value

  • dissimilarity.matrix Matrix of semidistances between the nrow(DATA1) series of DATA1 and the nrow(DATA2) series of DATA2. Dimensions: nrow(DATA1) x nrow(DATA2).
  • dissimilarity.evol Array of local matrices of semidistances between the windowed series of DATA1 and DATA2. Dimensions: nrow(DATA1) x nrow(DATA2) x Nb_subseries. Nb_subseries is determined by the three quantities ncol(DATA1) (the length of the series), wdw and Overlap.
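
To get a feel for how the window length and the overlap could determine the number of local matrices, here is a rough sketch. It assumes that each window has length wdw and that consecutive windows are shifted by wdw - Overlap measurements; this rule is an assumption made for illustration and has not been checked against the package source. window_starts is a hypothetical helper.

```r
# Hypothetical helper (not part of the Bagidis package).
# ASSUMPTION: windows of length wdw, shifted by (wdw - Overlap) each time.
window_starts <- function(n, wdw = min(n, 30), Overlap = wdw - 1) {
  stopifnot(wdw >= 1, wdw <= n, Overlap >= 0, Overlap < wdw)
  seq(1, n - wdw + 1, by = wdw - Overlap)
}

window_starts(30, wdw = 15, Overlap = 0)   # two non-overlapping windows
length(window_starts(40))                  # default settings: one-step sliding
length(window_starts(10))                  # short series: a single window
```

Under this assumption, the default Overlap = wdw - 1 gives a shift of one measurement, which matches the one-step-sliding behaviour described for Overlap.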

Details

Function semimetric.BAGIDIS computes the Bagidis semidistance between curves. If several curves are provided, it returns a matrix of semidistances. The function is an interface for either semimetric.BAGIDIS.TS or semimetric.BAGIDIS.BD, depending on the value of the parameter method.
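
Since the two methods are documented as giving identical results, a quick comparison on a small dataset may look as follows. This is only a sketch: it runs the package calls only if Bagidis is installed, and it compares the returned objects as a whole without assuming anything about their internal structure.

```r
# Small toy datasets: 2 and 3 series of length 10.
A <- rbind(1:10, 2:11)
B <- rbind(1:10, 1:10, 2:11)

# The package calls are executed only if Bagidis is available.
if (requireNamespace("Bagidis", quietly = TRUE)) {
  library(Bagidis)
  d_ts <- semimetric.BAGIDIS(A, B, method = "TS")  # transform recomputed per pair
  d_bd <- semimetric.BAGIDIS(A, B, method = "BD")  # signatures computed beforehand
  print(all.equal(d_ts, d_bd))
}
```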

Function BAGIDIS.dist computes the BAGIDIS semidistance between two series, encoded through their BUUHWE expansion obtained from function BUUHWE. Function BAGIDIS.dist.BD computes the BAGIDIS semidistance between two series, encoded through their breakpoints and details obtained from functions Breakpoints and Details.

See BAGIDIS-package for an overview of the BAGIDIS methodology and References for details, in particular Timmermans (2012), Chapter 1, and Timmermans and von Sachs (2010).
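
As a reading aid, the roles of p, wk and Param can be illustrated with a small base-R sketch. It assumes that the semidistance is a weighted sum, over the ranks, of a p-norm of the breakpoint and detail differences, balanced by Param, as the argument descriptions suggest. The helper bagidis_sketch is hypothetical, takes breakpoints and details as plain vectors with one entry per weight, handles finite p only, and may differ from the package's implementation.

```r
# Hypothetical helper (not part of the Bagidis package).
# b1, b2: breakpoints of the two series; d1, d2: their details.
# ASSUMPTION: all four vectors have one entry per weight in wk.
bagidis_sketch <- function(b1, d1, b2, d2, p = 2, wk = NULL, Param = 0.5) {
  stopifnot(length(b1) == length(b2), length(d1) == length(d2),
            length(b1) == length(d1), is.finite(p), p > 0,
            Param >= 0, Param <= 1)
  N <- length(b1)
  # Default weights as described for the argument wk
  if (is.null(wk)) wk <- log(N + 1 - (1:N)) / log(N + 1)
  # Partial distance at each rank, balancing breakpoints and details
  partial <- (Param * abs(b1 - b2)^p + (1 - Param) * abs(d1 - d2)^p)^(1 / p)
  sum(wk * partial)
}

b1 <- c(1, 2, 3); d1 <- c(0.5, -1, 2)
b2 <- c(2, 2, 3)
bagidis_sketch(b1, d1, b2, d1, Param = 1)  # only breakpoint differences count
bagidis_sketch(b1, d1, b2, d1, Param = 0)  # only detail differences count
```

In this toy case the two series differ only in their first breakpoint, so the call with Param = 0 ignores the difference entirely, which is the behaviour described for Param.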

References

The main references are

  • Timmermans C., 2012, Bases Giving Distances. A new paradigm for investigating functional data with applications for spectroscopy. PhD thesis, Universite catholique de Louvain. http://hdl.handle.net/2078.1/112451
  • Timmermans C. and von Sachs R., 2015, A novel semi-distance for investigating dissimilarities of curves with sharp local patterns, Journal of Statistical Planning and Inference, 160, 35-50. http://hdl.handle.net/2078.1/154928
  • Fryzlewicz P. and Timmermans C., 2015, SHAH: Shape Adaptive Haar wavelets for image processing. Journal of Computational and Graphical Statistics. (accepted - published online 27 May 2015) http://stats.lse.ac.uk/fryzlewicz/shah/shah.pdf
  • Timmermans C., Delsol L. and von Sachs R., 2013, Using BAGIDIS in nonparametric functional data analysis: predicting from curves with sharp local features, Journal of Multivariate Analysis, 115, p. 421-444. http://hdl.handle.net/2078.1/118369

Other references include

  • Girardi M. and Sweldens W., 1997, A new class of unbalanced Haar wavelets that form an unconditional basis for Lp on general measure spaces, J. Fourier Anal. Appl., 3, 457-474.
  • Fryzlewicz P., 2007, Unbalanced Haar Technique for Non Parametric Function Estimation, Journal of the American Statistical Association, 102, 1318-1327.
  • Timmermans C. and von Sachs R., 2010, BAGIDIS, a new method for statistical analysis of differences between curves with sharp patterns (ISBA Discussion Paper 2010/30). http://hdl.handle.net/2078.1/91090
  • Timmermans C. and Fryzlewicz P., 2012, SHAH: Shape-Adaptive Haar Wavelet Transform For Images With Application To Classification (ISBA Discussion Paper 2012/15). http://hdl.handle.net/2078.1/110529

The function BUUHWE_2D in this package is similar to the function uh.bu.2d (copyrighted Fryzlewicz 2014) in the package "shah_code", available on the webpage of Piotr Fryzlewicz: http://stats.lse.ac.uk/fryzlewicz/shah/shah_code.R , which accompanies the paper Fryzlewicz and Timmermans (2015).

See Also

BUUHWE, semimetric.BAGIDIS_2D.

Examples

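library(Bagidis)

# Semidistances among the rows of a single matrix
x <- 1:10
y <- 2:11
A <- rbind(x, y)
semimetric.BAGIDIS(A)

# Semidistances between the rows of A and the rows of B
B <- rbind(x, x, y)
semimetric.BAGIDIS(A, B)

# Windowed semidistance: windows of length 15 with no overlap,
# also returning the local (windowed) dissimilarity matrices
x <- 1:30
y <- 1:30
A <- rbind(x, y)
B <- rbind(x, x, y)
semimetric.BAGIDIS(A, B, wdw = 15, Evol = TRUE, Overlap = 0)

# Semidistance between two series, from their BUUHWE expansions
x <- c(1, 7, 3, 0, -2, 6, 4, 0, 2)
y <- c(1, 7, 5, 5, -2, 1, 4, 0, 2)
BAGIDIS.dist(BUUHWE(x), BUUHWE(y))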
