tsmp (version 0.3.1)

salient_subsequences: Retrieve salient subsequences from a dataset

Description

To allow a meaningful visualization with Multidimensional Scaling (MDS), this function retrieves the most relevant subsequences using the Minimum Description Length (MDL) framework.

Usage

salient_subsequences(.mp, data, n_bits = 8, n_cand = 10,
  exclusion_zone = NULL, verbose = 2)

Arguments

.mp

a TSMP object of class MatrixProfile.

data

the data used to build the Matrix Profile, if not embedded.

n_bits

an int. Number of bits for MDL discretization. (Default is 8).

n_cand

an int. Number of candidates considered when picking the subsequence in each iteration. (Default is 10).

exclusion_zone

if a number, it will be used instead of the embedded value. (Default is NULL).

verbose

an int. See details. (Default is 2).
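
To make the role of n_bits more concrete, here is a minimal sketch of the kind of min-max discretization commonly used in the MDL time series literature, which maps a series onto 2^n_bits integer levels. The helper name discretize is hypothetical and not part of tsmp, and whether tsmp uses exactly this formula internally is an assumption.

```r
# Hypothetical helper, not part of tsmp: map a numeric vector onto
# 2^n_bits integer levels (1 .. 2^n_bits) by min-max scaling.
discretize <- function(x, n_bits = 8) {
  rng <- range(x)
  # A constant series has no spread; assign every point to level 1.
  if (rng[1] == rng[2]) {
    return(rep(1L, length(x)))
  }
  as.integer(round((x - rng[1]) / (rng[2] - rng[1]) * (2^n_bits - 1)) + 1)
}

x <- sin(seq(0, 2 * pi, length.out = 50))
d <- discretize(x, n_bits = 8)
range(d)  # smallest and largest level actually used
```

With fewer bits the series is represented more coarsely, so the description length of each subsequence shrinks but fine detail is lost.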

Value

Returns the input .mp object with a new element named salient. It contains: indexes, a vector with the starting position of each subsequence; idx_bit_size, a vector with the associated bit size for each iteration; and bits, the value passed as n_bits.
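
As a rough illustration of how the starting positions can be used, the sketch below cuts fixed-length subsequences out of a series. The series, the window size and the starts vector are made-up stand-ins (starts plays the role of the indexes field described above); none of them is real tsmp output.

```r
# Toy series and hypothetical starting positions; in practice the
# positions would come from the salient element of the returned object.
set.seed(1)
series <- cumsum(rnorm(200))
window_size <- 30
starts <- c(12, 95, 150)

# One column per subsequence, window_size rows each.
subs <- sapply(starts, function(s) series[s:(s + window_size - 1)])
dim(subs)
```

Each column of subs can then be plotted or compared against the others.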

Details

verbose changes how much information is printed by this function; 0 means nothing, 1 means text, 2 means text and sound.

References

  • Yeh CCM, Van Herle H, Keogh E. Matrix profile III: The matrix profile allows visualization of salient subsequences in massive time series. Proc - IEEE Int Conf Data Mining, ICDM. 2017;579–88.

  • Hu B, Rakthanmanon T, Hao Y, Evans S, Lonardi S, Keogh E. Discovering the Intrinsic Cardinality and Dimensionality of Time Series Using MDL. In: 2011 IEEE 11th International Conference on Data Mining. IEEE; 2011. p. 1086–91.

Website: https://sites.google.com/site/salientsubs/

Examples

# NOT RUN {
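library(tsmp)
data <- mp_toy_data$data[, 1]  # first column of the bundled toy dataset
mp <- tsmp(data, window_size = 30, verbose = 0)  # build the Matrix Profile
mps <- salient_subsequences(mp, data, verbose = 0)  # result is stored under mps$salient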

# }