LPStimeSeries (version 1.0-5)

tunelearnPattern: Tune Parameters of LPS for Time Series Classification

Description

tunelearnPattern implements parameter selection for LPS in time series classification problems. LPS requires setting the segment length (when it is not chosen at random) and the tree depth. Given training time series and alternative parameter settings, the parameter combination that minimizes the cross-validation error rate is returned.

Usage

tunelearnPattern(x, y, unlabeledx=NULL, nfolds=5,
                 segmentlevels=c(0.25, 0.5, 0.75), random.split=0,
                 mindepth=4, maxdepth=8, depthstep=2, ntreeTry=25,
                 target.diff=TRUE, segment.diff=TRUE, ...)
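For illustration, a minimal sketch of a call with the candidate grids written out explicitly (the values shown are the documented defaults; GunPoint is the example data set shipped with the package, as used in the Examples below):

library(LPStimeSeries)
data(GunPoint)
## evaluate segment levels 0.25/0.5/0.75 and depths 4, 6, 8 with 5-fold CV,
## training 25 trees per fold
tuned <- tunelearnPattern(GunPoint$trainseries, GunPoint$trainclass,
                          nfolds=5, segmentlevels=c(0.25, 0.5, 0.75),
                          mindepth=4, maxdepth=8, depthstep=2, ntreeTry=25)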

Arguments

x
time series database as a matrix in UCR format: rows are univariate time series and columns are time observations.
y
labels for the time series given by x
unlabeledx
unlabeled time series data set. Reserved for future use, since LPS can also benefit from unlabeled data.
nfolds
number of cross-validation folds for parameter evaluation.
segmentlevels
alternative segment length levels to be evaluated, provided as a numeric vector.
random.split
Type of split. If set to zero (0), splits are generated based on the decrease in SSE of the target segment. A setting of one (1) generates the split value randomly between the minimum and maximum values. A setting of two (2) generates a kd-tree type of split (i.e., the median of the values at each node is chosen as the split).
mindepth
minimum depth level to be evaluated.
maxdepth
maximum depth level to be evaluated.
depthstep
step size to determine the depth levels between mindepth and maxdepth to be evaluated.
ntreeTry
number of trees to be trained for each fold.
target.diff
Can the target segment be a difference feature?
segment.diff
Can predictor segments be difference features?
...
optional parameters to be passed to the low-level function learnPattern. The candidate grid evaluated from the settings above is illustrated in the sketch after this list.
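The candidate depths are seq(mindepth, maxdepth, by=depthstep), and the params matrix described under Value suggests that every depth is evaluated together with every segment level. A small illustration of the default grid (an assumption about the layout, not package code):

depths    <- seq(4, 8, by=2)                      ## mindepth=4, maxdepth=8, depthstep=2
seglevels <- c(0.25, 0.5, 0.75)                   ## default segmentlevels
expand.grid(segmentlevel=seglevels, depth=depths) ## 3 x 3 = 9 candidate combinations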

Value

A list with the following components:
params
evaluated parameter combinations as a matrix where rows are parameter combinations and columns are the settings. The first and second columns are the evaluated segment length level and depth, respectively.
errors
cross-validation error rate for each parameter combination.
best.error
the minimum cross-validation error rate obtained.
best.seg
the segment length level that provides the minimum cross-validation error.
best.depth
the depth level that provides the minimum cross-validation error.
random.split
split type used for learning patterns.
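The returned components can be combined to inspect the full error surface rather than only the optimum. A small sketch, assuming a fitted object named tuned as in the Examples below:

cv <- data.frame(segmentlevel=tuned$params[,1],
                 depth=tuned$params[,2],
                 error=tuned$errors)
cv[order(cv$error), ]   ## all evaluated combinations, best first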

References

Baydogan, M. G. (2013), "Learned Pattern Similarity", Homepage: http://www.mustafabaydogan.com/learned-pattern-similarity-lps.html.

Breiman, L. (2001), "Random Forests", Machine Learning 45(1), 5-32.

See Also

learnPattern, computeSimilarity

Examples

data(GunPoint)
set.seed(71)

## Tune segment length level and depth on GunPoint training series
tuned=tunelearnPattern(GunPoint$trainseries,GunPoint$trainclass)
print(tuned$best.error)
print(tuned$best.seg)
print(tuned$best.depth)

## Use tuned parameters to learn patterns
ensemble=learnPattern(GunPoint$trainseries,segment.factor=tuned$best.seg,
					  maxdepth=tuned$best.depth)

## Find the similarity between test and training series based on the learned model
similarity=computeSimilarity(ensemble,GunPoint$testseries,GunPoint$trainseries)

## Find the index of 1 nearest neighbor (1NN) training series for each test series
NearestNeighbor=apply(similarity,1,which.min)

## Predicted class for each test series
predicted=GunPoint$trainclass[NearestNeighbor]

## Compute the percentage of accurate predictions
accuracy=sum(predicted==GunPoint$testclass)/nrow(GunPoint$testseries)
print(100*accuracy)
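
## A confusion matrix gives a fuller picture than the overall accuracy
## (not part of the original example; table() is base R)
print(table(predicted=predicted, actual=GunPoint$testclass))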
