LPStimeSeries (version 1.0-5)

tunelearnPattern: Tune Parameters of LPS for Time Series Classification

Description

tunelearnPattern implements parameter selection for LPS in time series classification problems. LPS requires setting the segment length (when it is not chosen at random) and the tree depth. Given training time series and alternative parameter settings, the parameter combination that minimizes the cross-validation error rate is returned.

Usage

tunelearnPattern(x, y, unlabeledx=NULL, nfolds=5,
                 segmentlevels=c(0.25, 0.5, 0.75), random.split=0,
                 mindepth=4, maxdepth=8, depthstep=2, ntreeTry=25,
                 target.diff=TRUE, segment.diff=TRUE, ...)
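For illustration, a minimal sketch of a call with the candidate grids written out explicitly (the values shown are the documented defaults; GunPoint is the example data set shipped with the package, as used in the Examples below):

library(LPStimeSeries)
data(GunPoint)
## evaluate segment levels 0.25/0.5/0.75 and depths 4, 6, 8 with 5-fold CV,
## training 25 trees per fold
tuned <- tunelearnPattern(GunPoint$trainseries, GunPoint$trainclass,
                          nfolds=5, segmentlevels=c(0.25, 0.5, 0.75),
                          mindepth=4, maxdepth=8, depthstep=2, ntreeTry=25)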

Arguments

x
time series database as a matrix in UCR format: rows are univariate time series and columns are time observations.
y
labels for the time series given by x
unlabeledx
unlabeled time series data set. Reserved for future use, since LPS can also benefit from unlabeled data.
nfolds
number of cross-validation folds for parameter evaluation.
segmentlevels
alternative segment length levels to be evaluated, provided as a numeric vector.
random.split
Type of split. If set to zero (0), splits are generated based on the decrease in SSE of the target segment. A setting of one (1) generates the split value randomly between the minimum and maximum values. A setting of two (2) generates a kd-tree type of split (i.e., the median of the values at each node is chosen as the split).
mindepth
minimum depth level to be evaluated.
maxdepth
maximum depth level to be evaluated.
depthstep
step size to determine the depth levels between mindepth and maxdepth to be evaluated.
ntreeTry
number of trees to be trained for each fold.
target.diff
Can the target segment be a difference feature?
segment.diff
Can predictor segments be difference features?
...
optional parameters to be passed to the low-level function learnPattern. The candidate grid evaluated from the settings above is illustrated in the sketch after this list.
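The candidate depths are seq(mindepth, maxdepth, by=depthstep), and the params matrix described under Value suggests that every depth is evaluated together with every segment level. A small illustration of the default grid (an assumption about the layout, not package code):

depths    <- seq(4, 8, by=2)                      ## mindepth=4, maxdepth=8, depthstep=2
seglevels <- c(0.25, 0.5, 0.75)                   ## default segmentlevels
expand.grid(segmentlevel=seglevels, depth=depths) ## 3 x 3 = 9 candidate combinations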

Value

A list with the following components:
params
evaluated parameter combinations as a matrix where rows are parameter combinations and columns are the settings. The first and second columns are the evaluated segment length level and depth, respectively.
errors
cross-validation error rate for each parameter combination.
best.error
the minimum cross-validation error rate obtained.
best.seg
the segment length level that provides the minimum cross-validation error.
best.depth
the depth level that provides the minimum cross-validation error.
random.split
split type used for learning patterns.
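The returned components can be combined to inspect the full error surface rather than only the optimum. A small sketch, assuming a fitted object named tuned as in the Examples below:

cv <- data.frame(segmentlevel=tuned$params[,1],
                 depth=tuned$params[,2],
                 error=tuned$errors)
cv[order(cv$error), ]   ## all evaluated combinations, best first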

References

Baydogan, M. G. (2013), "Learned Pattern Similarity", Homepage: http://www.mustafabaydogan.com/learned-pattern-similarity-lps.html.

Breiman, L. (2001), "Random Forests", Machine Learning 45(1), 5-32.

See Also

learnPattern, computeSimilarity

Examples

data(GunPoint)
set.seed(71)

## Tune segment length level and depth on GunPoint training series
tuned=tunelearnPattern(GunPoint$trainseries,GunPoint$trainclass)
print(tuned$best.error)
print(tuned$best.seg)
print(tuned$best.depth)

## Use tuned parameters to learn patterns
ensemble=learnPattern(GunPoint$trainseries,segment.factor=tuned$best.seg,
					  maxdepth=tuned$best.depth)

## Find the similarity between test and training series based on the learned model
similarity=computeSimilarity(ensemble,GunPoint$testseries,GunPoint$trainseries)

## Find the index of 1 nearest neighbor (1NN) training series for each test series
NearestNeighbor=apply(similarity,1,which.min)

## Predicted class for each test series
predicted=GunPoint$trainclass[NearestNeighbor]

## Compute the percentage of accurate predictions
accuracy=sum(predicted==GunPoint$testclass)/nrow(GunPoint$testseries)
print(100*accuracy)
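
## A confusion matrix gives a fuller picture than the overall accuracy
## (not part of the original example; table() is base R)
print(table(predicted=predicted, actual=GunPoint$testclass))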
