kma.compare: kma.compare runs kma with different numbers of clusters and different warping methods.

Description

In kma.compare the user can specify multiple values for n.clust and warping.method. kma.compare runs the K-Mean Alignment algorithm (the kma function) for every pair of specified values of n.clust and warping.method.

Usage

kma.compare(x, y0 = NULL, y1 = NULL, n.clust = c(1, 2),
warping.method = c("NOalignment", "shift", "dilation", "affine"),
similarity.method = "d1.pearson", center.method = "k-means", seeds = NULL,
optim.method = "L-BFGS-B", span = 0.15, t.max = 0.1, m.max = 0.1, n.out = NULL,
tol = 0.01, fence = TRUE, iter.max = 100, show.iter = 0, plot.graph = 0, 
nstart = 2, return.all = FALSE)

Arguments

x

matrix n.func X grid.size or vector grid.size: the abscissa values where each function is evaluated. n.func: number of functions in the dataset. grid.size: maximal number of abscissa values where each function is evaluated. The abscissa points may be unevenly spaced and they may differ from function to function. x can also be a vector of length grid.size. In this case, x will be used as the abscissa grid for all functions.

y0

matrix n.func X grid.size or array n.func X grid.size X d: evaluations of the set of original functions on the abscissa grid x. n.func: number of functions in the dataset. grid.size: maximal number of abscissa values where each function is evaluated. d: (only if the sample is multidimensional) number of function components, i.e. each function is a d-dimensional curve. Default value of y0 is NULL. The parameter y0 must be provided if the chosen similarity.method is based on the original functions.

y1

matrix n.func X grid.size or array n.func X grid.size X d: evaluations of the first derivatives of the original functions on the abscissa grid x. Default value of y1 is NULL. The parameter y1 must be provided if the chosen similarity.method is based on the first derivatives of the original functions.

n.clust

vector: n.clust contains the numbers of clusters with which kma.compare runs kma function. Default value is c(1,2). See details.

warping.method

vector: warping.method contains the types of alignment with which kma.compare runs kma function. See details.

similarity.method

character: required similarity measure. Possible choices are: 'd0.pearson', 'd1.pearson', 'd0.L2', 'd1.L2', 'd0.L2.centered', 'd1.L2.centered'. Default value is 'd1.pearson'. See kma.similarity for details.

center.method

character: type of clustering method to be used. Possible choices are: 'k-means' and 'k-medoids'. Default value is 'k-means'.

seeds

vector max(n.clust) or matrix nstart X n.clust: indexes of the functions to be used as initial centers. If it is a matrix, each row contains the indexes of the initial centers of one of the nstart initializations; if not all the values of seeds are provided, the ones not introduced are randomly chosen among the n.func original functions. If seeds=NULL all the centers are randomly chosen. Default value of seeds is NULL.

optim.method

character: optimization method chosen to find the best warping functions at each iteration. Possible choices are: 'L-BFGS-B' and 'SANN'. See optim function for details. Default method is 'L-BFGS-B'.

span

scalar: the span to be used for the loess procedure in the center estimation step when center.method='k-means'. Default value is 0.15. If center.method='k-medoids' value of span is ignored.

t.max

scalar: t.max controls the maximal allowed shift, at each iteration, in the alignment procedure with respect to the range of curve domains. t.max must be such that 0 < t.max < 1 (e.g., t.max=0.1 means that the shift is bounded, at each iteration, between -0.1*range(x) and +0.1*range(x)). Default value is 0.1. If warping.method='dilation' the value of t.max is ignored.

m.max

scalar: m.max controls the maximal allowed dilation, at each iteration, in the alignment procedure. m.max must be such that 0 < m.max < 1 (e.g., m.max=0.1 means that the dilation is bounded, at each iteration, between 1-0.1 and 1+0.1). Default value is 0.1. If warping.method='shift' the value of m.max is ignored.

n.out

scalar: the desired length of the abscissa for computation of the similarity indexes and the centers. Default value is round(1.1*grid.size).

tol

scalar: the algorithm stops when the increment of similarity of each function with respect to the corresponding center is lower than tol. Default value is 0.01.

fence

boolean: if fence=TRUE a control is activated at the end of each iteration. The aim of the control is to avoid shift/dilation outliers with respect to their computed distributions. If fence=TRUE the running time can increase considerably. Default value of fence is TRUE.

iter.max

scalar: maximum number of iterations in the K-Mean Alignment cycle. Default value is 100.

show.iter

boolean: if show.iter=TRUE kma shows the current iteration of the algorithm. Default value is FALSE.

plot.graph

boolean: if plot.graph=TRUE, kma.compare plots a graphic with the means of similarity indexes as ordinate and the number of clusters as abscissa. Default value is FALSE.

nstart

scalar: number of initializations with different seeds. Default value is 2, as shown in Usage.

return.all

boolean: if return.all=TRUE the results of all the nstart initializations are returned; the output is a list of length nstart. If return.all=FALSE only the best result is provided (the one with the highest mean similarity if similarity.method is 'd0.pearson' or 'd1.pearson', or the one with the lowest mean similarity if similarity.method is 'd0.L2', 'd1.L2', 'd0.L2.centered' or 'd1.L2.centered'). Default value is FALSE.
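
The bounds implied by t.max and m.max, and the default n.out, can be worked out by hand. The following is a minimal base-R sketch, not code from the package: it assumes that "range(x)" in the t.max description means the width of the domain (max minus min), and the grid x and the variable names are made up for illustration.

```r
# Illustrative abscissa grid shared by all functions (hypothetical data).
x <- seq(0, 10, length.out = 101)

t.max <- 0.1  # maximal shift, as a fraction of the domain width
m.max <- 0.1  # maximal deviation of the dilation from 1

# Assumption: "range(x)" in the argument description is the domain width.
domain.width <- diff(range(x))

# Per-iteration bounds for the shift and dilation parameters.
shift.bounds    <- c(-t.max, t.max) * domain.width
dilation.bounds <- c(1 - m.max, 1 + m.max)

# Documented default for n.out when it is left as NULL.
grid.size     <- length(x)
n.out.default <- round(1.1 * grid.size)

print(shift.bounds)
print(dilation.bounds)
print(n.out.default)
```

With these values, the shift at each iteration stays within one tenth of the domain width in either direction, and the dilation stays within ten percent of the identity.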

Value

Result.NOalignment: list of outputs of the kma function with warping.method='NOalignment'. The sublist Result.NOalignment[[k]] corresponds to the results when the number of clusters is n.clust[k]. Note that if 'NOalignment' is not chosen as warping.method, then Result.NOalignment will be NULL.
Result.shift: list of outputs of the kma function with warping.method='shift'. The sublist Result.shift[[k]] corresponds to the results when the number of clusters is n.clust[k]. Note that if 'shift' is not chosen as warping.method, then Result.shift will be NULL.
Result.dilation: list of outputs of the kma function with warping.method='dilation'. The sublist Result.dilation[[k]] corresponds to the results when the number of clusters is n.clust[k]. Note that if 'dilation' is not chosen as warping.method, then Result.dilation will be NULL.
Result.affine: list of outputs of the kma function with warping.method='affine'. The sublist Result.affine[[k]] corresponds to the results when the number of clusters is n.clust[k]. Note that if 'affine' is not chosen as warping.method, then Result.affine will be NULL.
n.clust: as input.
mean.similarity.NOalignment: vector: mean similarity indexes of curves after running the kma function with all elements of n.clust and warping.method='NOalignment'. mean.similarity.NOalignment contains the ordinates of the black curve ("without alignment" in the legend) of the output graphic of the kma.compare function (if plot.graph=1).
mean.similarity.shift: vector: mean similarity indexes of curves after running the kma function with all elements of n.clust and warping.method='shift'. mean.similarity.shift contains the ordinates of the blue curve ("shift" in the legend) of the output graphic of the kma.compare function (if plot.graph=1).
mean.similarity.dilation: vector: mean similarity indexes of curves after running the kma function with all elements of n.clust and warping.method='dilation'. mean.similarity.dilation contains the ordinates of the green curve ("dilation" in the legend) of the output graphic of the kma.compare function (if plot.graph=1).
mean.similarity.affine: vector: mean similarity indexes of curves after running the kma function with all elements of n.clust and warping.method='affine'. mean.similarity.affine contains the ordinates of the orange curve ("affine" in the legend) of the output graphic of the kma.compare function (if plot.graph=1).
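
The mean.similarity.* components listed above can be gathered into one table for inspection. The sketch below uses only base R and a hand-built stand-in for the returned list: the component names follow this page, but the numeric values, the object name res, and the helper variables are hypothetical.

```r
# Hypothetical stand-in for a kma.compare result in which only
# 'shift' and 'affine' were requested (the other components are NULL).
res <- list(
  n.clust = 1:3,
  mean.similarity.NOalignment = NULL,
  mean.similarity.shift       = c(0.80, 0.88, 0.90),
  mean.similarity.dilation    = NULL,
  mean.similarity.affine      = c(0.85, 0.95, 0.96)
)

# Keep only the mean.similarity.* components that were actually computed.
sim.names <- grep("^mean\\.similarity\\.", names(res), value = TRUE)
computed  <- Filter(Negate(is.null), res[sim.names])

# One row per (warping method, number of clusters) combination.
similarity.table <- do.call(rbind, lapply(names(computed), function(nm) {
  data.frame(
    warping.method  = sub("^mean\\.similarity\\.", "", nm),
    n.clust         = res$n.clust,
    mean.similarity = computed[[nm]],
    stringsAsFactors = FALSE
  )
}))

print(similarity.table)
```

Each row holds one point of the graphic produced when plot.graph=1, so the table can be read the same way: one curve per warping method, with the number of clusters on the abscissa.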

Details

Example of use: if n.clust=c(1,2,3) and warping.method=c('shift','affine'), kma.compare runs kma function with number of clusters equal to 1, 2 and 3 using warping.method='shift' and warping.method='affine'.
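
The set of runs in this example can be enumerated with base R. This is only an illustration of which combinations are covered; it does not call kma, and the object name runs is made up here.

```r
# Values from the example in Details.
n.clust        <- c(1, 2, 3)
warping.method <- c("shift", "affine")

# Every (n.clust, warping.method) pair corresponds to one kma run.
runs <- expand.grid(
  n.clust        = n.clust,
  warping.method = warping.method,
  stringsAsFactors = FALSE
)

print(runs)
print(nrow(runs))  # number of kma runs, before accounting for nstart
```

The number of runs grows as length(n.clust) times length(warping.method), and each run is itself repeated for the nstart initializations, so large grids can take a long time.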

References

Sangalli, L.M., Secchi, P., Vantini, S., Vitelli, V., 2010. "K-mean alignment for curve clustering". Computational Statistics and Data Analysis, 54, 1219-1233.

Sangalli, L.M., Secchi, P., Vantini, S., 2014. "Analysis of AneuRisk65 data: K-mean Alignment". Electronic Journal of Statistics, Special Section on "Statistics of Time Warpings and Phase Variations", Vol. 8, No. 2, 1891-1904.

Examples

data(kma.data)

x <- kma.data$x # abscissas
y0 <- kma.data$y0 # evaluations of original functions
y1 <- kma.data$y1 # evaluations of original function first derivatives

## Not run: 
# # Plot of original functions
# matplot(t(x),t(y0), type='l', xlab='x', ylab='orig.func')
# title ('Original functions')
# 
# # Plot of original function first derivatives
# matplot(t(x),t(y1), type='l', xlab='x', ylab='orig.deriv')
# title ('Original function first derivatives')
# 
# 
# # Example: results of kma function with 3 different 
# # numbers of clusters (1,2,3) combined with the 'affine'
# # alignment method and considering 'd1.pearson' as
# # similarity.method.
# kma.compare_example <- kma.compare (
#   x=x, y0=y0, y1=y1, n.clust = 1:3, 
#   warping.method = c('affine'), 
#   similarity.method = 'd1.pearson',
#   center.method = 'k-means', 
#   seeds = c(1,21,30),
#   plot.graph=1)
# 
# names (kma.compare_example)
# 
# # To see results for kma function with n.clust=2 
# # and warping.method='affine'.
# kma.show.results (kma.compare_example$Result.affine[[2]])
# 
# 
# # Labels assigned to each function for the 
# # kma function with n.clust=2 and warping.method='affine'.
# kma.compare_example$Result.affine[[2]]$labels
# ## End(Not run)