ctsfeatures (version 1.2.2)

calculate_subfeatures: Computes several subfeatures associated with a categorical time series

Description

calculate_subfeatures computes several subfeatures associated with a categorical time series, or between a categorical and a real-valued time series.

Usage

calculate_subfeatures(series, n_series, lag = 1, type = NULL)

Value

The corresponding subfeature

Arguments

series

An object of type tsibble (see R package tsibble), whose column named Value contains the values of the corresponding CTS. This column must be of class factor and its levels must be determined by the range of the CTS.

n_series

A real-valued time series.

lag

The considered lag (default is 1).

type

String indicating the subfeature one wishes to compute.
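
As an illustration of the input format described for series, the following is a minimal sketch of building a tsibble whose Value column is a factor. The column name Time, the integer index, and the toy values are assumptions made for this example; the package documentation only specifies the Value column.

```r
library(tsibble)

# Hypothetical toy CTS with range {a, b, c}; the values are illustrative only.
values <- c("a", "b", "b", "c", "a", "c", "b", "a")

# The Value column must be a factor whose levels cover the range of the CTS.
series <- tsibble(
  Time  = seq_along(values),                       # assumed index column name
  Value = factor(values, levels = c("a", "b", "c")),
  index = Time
)

is.factor(series$Value)  # checks the class requirement on Value
levels(series$Value)     # lists the range of the CTS
```

An object built this way could then be passed as the series argument, provided the rest of its layout matches what the package expects.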

Author

Ángel López-Oriona, José A. Vilar

Details

Let \(\overline{X}_t=\{\overline{X}_1,\ldots, \overline{X}_T\}\) be a CTS of length \(T\) with range \(\mathcal{V}=\{1, 2, \ldots, r\}\). Denote by \(\widehat{p}_i\) the natural estimate of the marginal probability of the \(i\)th category, and by \(\widehat{p}_{ij}(l)\) the natural estimate of the joint probability of categories \(i\) and \(j\) at lag \(l\), \(i,j=1, \ldots, r\). Let also \(\overline{Z}_t=\{\overline{Z}_1,\ldots, \overline{Z}_T\}\) be a real-valued time series of length \(T\). The function computes the following subfeatures, depending on the argument type:

  • If type=entropy, the function computes the subfeatures associated with the estimated entropy, \(\widehat{p}_i\ln(\widehat{p}_i)\), \(i=1,2, \ldots,r\).

  • If type=gk_tau, the function computes the subfeatures associated with the estimated Goodman and Kruskal's tau, \(\frac{\widehat{p}_{ij}(l)^2}{\widehat{p}_j}\), \(i,j=1,2, \ldots,r\).

  • If type=gk_lambda, the function computes the subfeatures associated with the estimated Goodman and Kruskal's lambda, \(\max_i\widehat{p}_{ij}(l)\), \(j=1,2, \ldots,r\).

  • If type=uncertainty_coefficient, the function computes the subfeatures associated with the estimated uncertainty coefficient, \(\widehat{p}_{ij}(l)\ln\Big(\frac{\widehat{p}_{ij}(l)}{\widehat{p}_i\widehat{p}_j}\Big)\), \(i,j=1,2, \ldots,r\).

  • If type=pearson_measure, the function computes the subfeatures associated with the estimated Pearson measure, \(\frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}\), \(i,j=1,2, \ldots,r\).

  • If type=phi2_measure, the function computes the subfeatures associated with the estimated Phi2 measure, \(\frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}\), \(i,j=1,2, \ldots,r\).

  • If type=sakoda_measure, the function computes the subfeatures associated with the estimated Sakoda measure, \(\frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}\), \(i,j=1,2, \ldots,r\).

  • If type=cramers_vi, the function computes the subfeatures associated with the estimated Cramer's vi, \(\frac{(\widehat{p}_{ij}(l)-\widehat{p}_i\widehat{p}_j)^2}{\widehat{p}_i\widehat{p}_j}\), \(i,j=1,2, \ldots,r\).

  • If type=cohens_kappa, the function computes the subfeatures associated with the estimated Cohen's kappa, \(\widehat{p}_{ii}(l)-\widehat{p}_i^2\), \(i=1,2, \ldots,r\).

  • If type=total_correlation, the function computes the subfeatures associated with the total correlation, \(\widehat{\psi}_{ij}(l)\), \(i,j=1,2, \ldots,r\) (see type='total_correlation' in the function calculate_features).

  • If type=total_mixed_correlation_1, the function computes the subfeatures associated with the total mixed l-correlation, \(\widehat{\psi}_{i}(l)\), \(i=1,2, \ldots,r\) (see type='total_mixed_correlation_1' in the function calculate_features).

  • If type=total_mixed_correlation_2, the function computes the subfeatures associated with the total mixed q-correlation, \(\int_{0}^{1}\widehat{\psi}^\rho_{i}(l)^2d\rho\), \(i=1,2, \ldots,r\) (see type='total_mixed_correlation_2' in the function calculate_features).
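
The probability-based subfeatures above can be sketched in base R as follows. This is not the package's implementation: the toy series, the convention \(\widehat{p}_{ij}(l)\approx\) relative frequency of \((X_t=i, X_{t-l}=j)\), and the normalization by \(T-l\) are assumptions made for illustration.

```r
# Hypothetical toy CTS with range {1, 2, 3}; not taken from the package.
x <- factor(c(1, 2, 2, 3, 1, 2, 3, 3, 1, 2), levels = 1:3)
l <- 1                 # lag
n <- length(x)         # series length T

# Marginal estimates p_i: relative frequency of each category.
p <- as.numeric(table(x)) / n

# Joint estimates p_ij(l): relative frequency of (X_t = i, X_{t-l} = j),
# normalized by the number of available pairs (an assumed convention).
x_t   <- x[(l + 1):n]
x_lag <- x[1:(n - l)]
p_joint <- unclass(table(x_t, x_lag)) / (n - l)

# Product of marginals p_i * p_j, used by several measures.
p_prod <- outer(p, p)

# Entropy terms p_i * ln(p_i); a zero probability contributes 0.
entropy_terms <- ifelse(p > 0, p * log(p), 0)

# Terms shared by the Pearson, Phi2, Sakoda and Cramer's vi subfeatures.
pearson_terms <- (p_joint - p_prod)^2 / p_prod

# Cohen's kappa terms p_ii(l) - p_i^2.
kappa_terms <- diag(p_joint) - p^2
```

Here entropy_terms and kappa_terms have one entry per category, while pearson_terms is an \(r \times r\) matrix with one entry per pair \((i,j)\).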

References

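Weiß, C. H. and Göb, R. (2008). Measuring serial dependence in categorical time series. AStA Advances in Statistical Analysis, 92, 71-89.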

Examples

library(ctsfeatures)

# First series in the dataset GeneticSequences
sequence_1 <- GeneticSequences[which(GeneticSequences$Series == 1), ]

# Subfeatures associated with the uncertainty coefficient
suc <- calculate_subfeatures(series = sequence_1, type = 'uncertainty_coefficient')

# Subfeatures associated with Cramer's vi
scv <- calculate_subfeatures(series = sequence_1, type = 'cramers_vi')
