simplex: Perform univariate forecasting

Description

simplex uses time delay embedding on a single time series to generate an attractor reconstruction, and then applies the simplex projection algorithm to make forecasts.

s_map is similar to simplex, but uses the S-map algorithm to make forecasts.
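
The two steps named above (time delay embedding, then simplex projection) can be illustrated with a minimal base-R sketch. This is not the package's implementation: the helpers `make_embedding` and `simplex_forecast` are hypothetical names, and the lag step of 1, the E + 1 neighbours, and the exponential weights scaled by the nearest distance are assumptions of the sketch.

```r
# Hypothetical helper: row t holds x[t], x[t-1], ..., x[t-(E-1)].
# Rows without a full history contain NA.
make_embedding <- function(x, E) {
  n <- length(x)
  emb <- sapply(0:(E - 1), function(k) c(rep(NA, k), x[seq_len(n - k)]))
  matrix(emb, nrow = n, ncol = E)
}

# Hypothetical helper: forecast x[t + tp] from the E + 1 nearest
# neighbours of row t in the reconstructed state space.
simplex_forecast <- function(x, t, E = 2, tp = 1) {
  n <- length(x)
  emb <- make_embedding(x, E)
  # candidates: complete rows whose future value is known, excluding t itself
  cand <- setdiff(which(stats::complete.cases(emb) & seq_len(n) + tp <= n), t)
  # Euclidean distance from each candidate row to the target row
  d <- sqrt(rowSums(sweep(emb[cand, , drop = FALSE], 2, emb[t, ])^2))
  nn <- order(d)[seq_len(E + 1)]
  # weights decay exponentially with distance relative to the closest neighbour
  w <- exp(-d[nn] / max(d[nn][1], 1e-12))
  sum(w * x[cand[nn] + tp]) / sum(w)
}

x <- sin(0.3 * (1:200))
simplex_forecast(x, t = 150, E = 2, tp = 1)  # forecast of x[151]
x[151]                                       # observed value, for comparison
```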

Usage

simplex(time_series, lib = NULL, pred = NULL, norm = 2, E = 1:10, 
    tau = -1, tp = 1, num_neighbors = "e+1", stats_only = TRUE, 
    exclusion_radius = NULL, epsilon = NULL, silent = TRUE)
s_map(time_series, lib = NULL, pred = NULL, norm = 2, E = 1, 
    tau = -1, tp = 1, num_neighbors = 0, theta = NULL, stats_only = TRUE, 
    exclusion_radius = NULL, epsilon = NULL, silent = TRUE,
    save_smap_coefficients = FALSE)

Arguments

time_series

either a vector to be used as the time series, or a data.frame or matrix with at least 2 columns (in which case the first column will be used as the time index, and the second column as the time series)

lib

a 2-column matrix, data.frame, 2-element vector or string of row index pairs, where each pair specifies the first and last *rows* of the time series to create the library. If not specified, all available rows are used

pred

(same format as lib), but specifying the sections of the time series to forecast. If not specified, set equal to lib

norm

the distance measure to use. See 'Details'

E

the embedding dimensions to use for time delay embedding

tau

the time-delay offset to use for time delay embedding

tp

the prediction horizon (how far ahead to forecast)

num_neighbors

the number of nearest neighbors to use. Note that the default value will change depending on the method selected. (any of "e+1", "E+1", "e + 1", "E + 1" will set this parameter to E+1.)

stats_only

specify whether to output just the forecast statistics or the raw predictions for each run

exclusion_radius

excludes vectors from the search space of nearest neighbors if their *time index* is within exclusion_radius (NULL turns this option off)

epsilon

Deprecated.

silent

prevents warning messages from being printed to the R console

theta

the nonlinear tuning parameter (theta is only used by s_map)

save_smap_coefficients

specifies whether to include the s_map coefficients with the output
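
As an illustration of the row-pair format used by lib and pred, and of the effect of exclusion_radius, here is a minimal base-R sketch. The helpers `expand_ranges` and `eligible_neighbors` are hypothetical and not part of the package. The sketch assumes that a vector is excluded when its time index differs from the target's by at most the radius; the inclusive boundary is a guess.

```r
# Hypothetical helper: expand (first, last) row pairs into row indices.
# Accepts a 2-element vector, a 2-column matrix, or a 2-column data.frame.
expand_ranges <- function(ranges) {
  ranges <- matrix(as.matrix(ranges), ncol = 2)
  sort(unique(unlist(Map(seq, ranges[, 1], ranges[, 2]))))
}

# Hypothetical helper: library rows still eligible as neighbours of `target`
# after dropping the target itself and rows too close to it in time.
eligible_neighbors <- function(lib_rows, time, target, radius = NULL) {
  keep <- lib_rows != target
  if (!is.null(radius)) {
    # assumption: "within exclusion_radius" means |time difference| <= radius
    keep <- keep & abs(time[lib_rows] - time[target]) > radius
  }
  lib_rows[keep]
}

# two library segments: rows 1-5 and rows 11-15
lib_rows <- expand_ranges(rbind(c(1, 5), c(11, 15)))
eligible_neighbors(lib_rows, time = 1:20, target = 4, radius = 2)
```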

Value

For simplex, if stats_only = TRUE: a data.frame with components for the parameters and forecast statistics:

E	embedding dimension
tau	embedding time offset
tp	prediction horizon
nn	number of neighbors
num_pred	number of predictions
rho	correlation coefficient between observations and predictions
mae	mean absolute error
rmse	root mean square error
perc	percent correct sign
p_val	p-value that rho is significantly greater than 0 using Fisher's z-transformation
const_pred_rho	same as `rho`, but for the constant predictor
const_pred_mae	same as `mae`, but for the constant predictor
const_pred_rmse	same as `rmse`, but for the constant predictor
const_pred_perc	same as `perc`, but for the constant predictor
const_p_val	same as `p_val`, but for the constant predictor
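
Several of the statistics in the table above can be reproduced from paired observations and predictions. The following base-R sketch uses a hypothetical helper `forecast_stats`. It assumes a one-sided Fisher z-test with standard error 1 / sqrt(n - 3), and that the constant predictor forecasts x[t + tp] as x[t]. It omits perc, whose exact definition is not spelled out here. The package's own computation may differ in detail.

```r
# Hypothetical helper: forecast statistics from observations and predictions.
forecast_stats <- function(obs, pred) {
  ok <- is.finite(obs) & is.finite(pred)   # drop pairs with missing values
  obs <- obs[ok]
  pred <- pred[ok]
  n <- length(obs)
  rho <- cor(obs, pred)
  # one-sided test of rho > 0: atanh(rho) is approximately N(0, 1 / (n - 3))
  p_val <- pnorm(atanh(rho) * sqrt(n - 3), lower.tail = FALSE)
  data.frame(
    num_pred = n,
    rho = rho,
    mae = mean(abs(obs - pred)),
    rmse = sqrt(mean((obs - pred)^2)),
    p_val = p_val
  )
}

# constant predictor (assumed form): the forecast for t + tp is the value at t
x <- sin(0.3 * (1:100))
tp <- 1
forecast_stats(obs = x[(1 + tp):100], pred = x[1:(100 - tp)])
```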

For simplex, if stats_only = FALSE: a named list with data.frame "stats" specified above, and named list "model_output":

model_output

named list with data.frames for each model. Columns include the time index, observations, predictions, and estimated prediction variance

For s_map, if stats_only = TRUE, the same data.frame as for simplex, but with additional column:

theta

the nonlinear tuning parameter

For s_map, if save_smap_coefficients = TRUE, a named list with data.frame "stats" specified above and the following list items:

smap_coefficients	data.frame with columns for the s-map coefficients
smap_coefficient_covariances	list of covariance matrices for the s-map coefficients

For s_map, if stats_only = FALSE, a named list with data.frame "stats" specified above, and named list "model_output":

model_output

named list with data.frames for each model. Columns include the time index, observations, predictions, and estimated prediction variance

Details

simplex is typically run over a range of embedding dimensions to find the optimal embedding dimension for the data. Thus, the default parameters are set so that passing a time series as the only argument will run over E = 1:10 (embedding dimension), using leave-one-out cross-validation over the whole time series, and returning just the forecast statistics.
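
One way to pick an embedding dimension from the returned statistics is to take the row with the highest rho. The sketch below uses a hypothetical helper `best_embedding` and a toy data.frame whose values are invented purely for illustration; it assumes only the `E` and `rho` columns described under 'Value'.

```r
# Hypothetical helper: return the E (and its rho) with the highest forecast skill.
best_embedding <- function(stats) {
  stats[which.max(stats$rho), c("E", "rho")]
}

# Toy stand-in for simplex() output; these numbers are made up.
toy_stats <- data.frame(E = 1:5, rho = c(0.62, 0.81, 0.88, 0.86, 0.80))
best_embedding(toy_stats)

# With the package loaded, the helper could be applied to real output, e.g.
# best_embedding(simplex(ts))
```

Other criteria (for example the lowest mae or rmse) could be substituted by changing the column used for selection.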

s_map is typically run with a fixed embedding dimension and varying theta to test for nonlinear dynamics in the data. Thus, the default parameters are set so that passing a time series as the only argument will run over a default list of thetas (0, 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 0.5, 0.75, 1.0, 1.5, 2, 3, 4, 6, and 8), using E = 1, leave-one-out cross-validation over the whole time series, and returning just the forecast statistics.
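
The role of theta can be illustrated with a minimal base-R sketch of a locally weighted linear forecast. This is not the package's code: `smap_forecast` is a hypothetical helper, and the weighting exp(-theta * d / mean(d)), the intercept term, and the least-squares solve via `qr.solve` are assumptions of the sketch. With theta = 0 all library points are weighted equally (a global linear model); larger theta gives more weight to nearby points.

```r
# Hypothetical helper: forecast x[t + tp] with a distance-weighted linear map.
smap_forecast <- function(x, t, E = 2, tp = 1, theta = 1) {
  n <- length(x)
  # lagged coordinates: column k + 1 holds x shifted back by k steps
  emb <- sapply(0:(E - 1), function(k) c(rep(NA, k), x[seq_len(n - k)]))
  emb <- matrix(emb, nrow = n, ncol = E)
  # candidates: complete rows whose future value is known, excluding t itself
  cand <- setdiff(which(stats::complete.cases(emb) & seq_len(n) + tp <= n), t)
  d <- sqrt(rowSums(sweep(emb[cand, , drop = FALSE], 2, emb[t, ])^2))
  w <- exp(-theta * d / mean(d))
  # weighted least squares: scale each row (and its target) by sqrt(weight)
  A <- cbind(1, emb[cand, , drop = FALSE]) * sqrt(w)
  b <- x[cand + tp] * sqrt(w)
  coefs <- qr.solve(A, b)
  sum(coefs * c(1, emb[t, ]))
}

x <- sin(0.3 * (1:200))
smap_forecast(x, t = 150, E = 2, tp = 1, theta = 0)  # global linear fit
smap_forecast(x, t = 150, E = 2, tp = 1, theta = 2)  # locally weighted fit
```

Comparing forecast skill across theta values, as s_map does over its default list, indicates whether local (nonlinear) fits outperform the global linear one.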

norm = 2 (only option currently available) uses the "L2 norm", Euclidean distance: $$\mathrm{distance}(a, b) := \sqrt{\sum_i (a_i - b_i)^2}$$

Examples

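# NOT RUN {
library(rEDM)

# univariate series from the package's example dataset
ts <- block_3sp$x_t

# simplex projection: library = rows 1-100, predictions = rows 101-190
simplex(ts, lib = c(1, 100), pred = c(101, 190))

# return the raw predictions as well as the forecast statistics
simplex(ts, stats_only = FALSE)

# S-map with E = 2 over the default list of theta values
s_map(ts, E = 2)

# S-map with a single theta, also returning the S-map coefficients
s_map(ts, E = 2, theta = 1, save_smap_coefficients = TRUE)
# }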