
rEDM (version 0.4.7)

simplex: Perform univariate forecasting

Description

simplex uses time delay embedding on a single time series to generate an attractor reconstruction, and then applies the simplex projection algorithm to make forecasts.

s_map is similar to simplex, but uses the S-map algorithm to make forecasts.
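To make "time delay embedding" concrete, here is a minimal base-R sketch of how lagged coordinate vectors could be built from a single series. The helper name `make_embedding` and the column ordering (most recent value first) are assumptions for illustration, not the package's internal code.

```r
# Hypothetical helper (not part of rEDM): build lagged coordinate vectors.
make_embedding <- function(x, E, tau = 1) {
  n <- length(x)
  first <- (E - 1) * tau + 1            # earliest time with a full set of lags
  stopifnot(first <= n)
  rows <- first:n
  # Column j + 1 holds x lagged by j * tau steps.
  block <- sapply(0:(E - 1), function(j) x[rows - j * tau])
  block <- matrix(block, nrow = length(rows))   # keep matrix shape in edge cases
  colnames(block) <- paste0("lag", (0:(E - 1)) * tau)
  rownames(block) <- rows
  block
}

x <- c(1, 2, 3, 4, 5, 6)
emb <- make_embedding(x, E = 3, tau = 2)
print(emb)   # two rows (times 5 and 6), columns lag0, lag2, lag4
```

Each row is one point on the reconstructed attractor. The first (E - 1) * tau time points have no complete lag vector, so they drop out.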

Usage

simplex(time_series, lib = c(1, NROW(time_series)), pred = lib, norm_type = c("L2 norm", "L1 norm", "P norm"), P = 0.5, E = 1:10, tau = 1, tp = 1, num_neighbors = "e+1", stats_only = TRUE, exclusion_radius = NULL, epsilon = NULL, silent = FALSE)
s_map(time_series, lib = c(1, NROW(time_series)), pred = lib, norm_type = c("L2 norm", "L1 norm", "P norm"), P = 0.5, E = 1, tau = 1, tp = 1, num_neighbors = 0, theta = c(0, 1e-04, 3e-04, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 0.5, 0.75, 1, 1.5, 2, 3, 4, 6, 8), stats_only = TRUE, exclusion_radius = NULL, epsilon = NULL, silent = FALSE, save_smap_coefficients = FALSE)

Arguments

time_series
either a vector to be used as the time series, or a data.frame or matrix with at least 2 columns (in which case the first column will be used as the time index, and the second column as the time series)
lib
a 2-column matrix (or 2-element vector) where each row specifies the first and last *rows* of the time series to use for attractor reconstruction
pred
(same format as lib), but specifying the sections of the time series to forecast.
norm_type
the distance function to use; see 'Details'
P
the exponent for the P norm
E
the embedding dimensions to use for time delay embedding
tau
the lag to use for time delay embedding
tp
the prediction horizon (how far ahead to forecast)
num_neighbors
the number of nearest neighbors to use (any of "e+1", "E+1", "e + 1", "E + 1" will peg this parameter to E+1 for each run; any value < 1 will use all possible neighbors)
stats_only
specify whether to output just the forecast statistics or the raw predictions for each run
exclusion_radius
excludes vectors from the search space of nearest neighbors if their *time index* is within exclusion_radius (NULL turns this option off)
epsilon
excludes vectors from the search space of nearest neighbors if their *distance* is farther away than epsilon (NULL turns this option off)
silent
prevents warning messages from being printed to the R console
theta
the nonlinear tuning parameter (note that theta = 0 is equivalent to an autoregressive model of order E.)
save_smap_coefficients
specifies whether to include the s_map coefficients with the output (and forces the full output as if stats_only were set to FALSE)
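The effect of exclusion_radius and epsilon on the neighbor search can be pictured with a small base-R sketch. The helper `candidate_mask` is hypothetical, and the boundary handling is an assumption: here a vector is excluded when its time index is at most exclusion_radius away, and kept when its distance is at most epsilon. The package may treat the boundaries differently.

```r
# Hypothetical helper (not part of rEDM): which library vectors stay eligible
# as neighbors of the point at time t0.
candidate_mask <- function(times, dists, t0,
                           exclusion_radius = NULL, epsilon = NULL) {
  keep <- rep(TRUE, length(times))
  if (!is.null(exclusion_radius)) {
    # Assumed: "within" includes the boundary, so |t - t0| must exceed the radius.
    keep <- keep & abs(times - t0) > exclusion_radius
  }
  if (!is.null(epsilon)) {
    # Drop vectors farther away than epsilon.
    keep <- keep & dists <= epsilon
  }
  keep
}

times <- 1:6
dists <- c(0.9, 0.2, 0.0, 0.3, 0.8, 1.5)
mask <- candidate_mask(times, dists, t0 = 3, exclusion_radius = 1, epsilon = 1)
print(times[mask])   # times that remain eligible
```

With both arguments left at NULL the mask is all TRUE, which matches the documented "NULL turns this option off".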

Value

For simplex, if stats_only = TRUE, then a data.frame with components for the parameters and forecast statistics:
E
embedding dimension
tau
time lag
tp
prediction horizon
nn
number of neighbors
num_pred
number of predictions
rho
correlation coefficient between observations and predictions
mae
mean absolute error
rmse
root mean square error
perc
percent correct sign
p_val
p-value that rho is significantly greater than 0 using Fisher's z-transformation
const_rho
same as rho, but for the constant predictor
const_mae
same as mae, but for the constant predictor
const_rmse
same as rmse, but for the constant predictor
const_perc
same as perc, but for the constant predictor
Otherwise, a list where the number of elements is equal to the number of runs (unique parameter combinations). Each element is a list with the following components:
params
data.frame of parameters (E, tau, tp, nn)
model_output
data.frame with columns for the time index, observations, and predictions
stats
data.frame of forecast statistics (num_pred, rho, mae, rmse)
For s_map, the same as for simplex, but with an additional column for the value of theta. If stats_only = FALSE and save_smap_coefficients = TRUE, then a matrix of S-map coefficients will appear in the full output.
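As a reading aid for the statistics columns, the following base-R sketch computes num_pred, rho, mae, and rmse from paired observations and predictions. The function name `forecast_stats` and the handling of missing values (pairs with a non-finite entry are dropped) are assumptions. perc, p_val, and the constant-predictor columns are left out because their exact definitions are not given here.

```r
# Hypothetical helper (not part of rEDM): basic forecast skill statistics.
forecast_stats <- function(obs, pred) {
  ok <- is.finite(obs) & is.finite(pred)   # assumed: incomplete pairs are skipped
  obs <- obs[ok]
  pred <- pred[ok]
  err <- pred - obs
  data.frame(
    num_pred = length(obs),
    rho = cor(obs, pred),                  # Pearson correlation
    mae = mean(abs(err)),                  # mean absolute error
    rmse = sqrt(mean(err^2))               # root mean square error
  )
}

obs <- c(1, 2, 3, 4, NA)
pred <- c(1.5, 1.5, 3.5, 3.5, 5)
stats <- forecast_stats(obs, pred)
print(stats)
```

Applied to the model_output data.frame returned when stats_only = FALSE, a function like this should give values comparable to the stats component, up to any differences in how the package treats missing values.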

Details

simplex is typically run over a range of embedding dimensions to find the optimal one for the data. The defaults are therefore set so that passing a time series as the only argument runs over E = 1:10 (embedding dimension), uses leave-one-out cross-validation over the whole time series, and returns just the forecast statistics.
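The idea of varying E and comparing forecast skill can be sketched in base R without the package. The function `simplex_sketch` below is a hypothetical, simplified version of simplex projection: it uses Euclidean distance, E + 1 nearest neighbors, and weights of exp(-d / d_min) as described in the simplex projection literature. It is an illustration under those assumptions, not the rEDM implementation, and the logistic-map series is synthetic test data.

```r
# Hypothetical sketch of simplex projection (not the rEDM implementation).
simplex_sketch <- function(x, E, lib, pred, tau = 1, tp = 1) {
  n <- length(x)
  first <- (E - 1) * tau + 1
  valid <- first:(n - tp)                       # full lag vector and a known target
  vec <- function(t) x[t - (0:(E - 1)) * tau]   # lag vector at time t
  lib_t <- intersect(valid, lib[1]:lib[2])
  pred_t <- intersect(valid, pred[1]:pred[2])
  lib_block <- matrix(unlist(lapply(lib_t, vec)), ncol = E, byrow = TRUE)
  forecasts <- sapply(pred_t, function(t0) {
    cand <- which(lib_t != t0)                  # leave-one-out if lib and pred overlap
    target <- matrix(vec(t0), nrow = length(cand), ncol = E, byrow = TRUE)
    d <- sqrt(rowSums((lib_block[cand, , drop = FALSE] - target)^2))
    nn <- order(d)[seq_len(E + 1)]              # E + 1 nearest neighbors
    w <- exp(-d[nn] / max(d[nn[1]], 1e-12))     # weights relative to nearest distance
    sum(w * x[lib_t[cand[nn]] + tp]) / sum(w)   # weighted mean of neighbors' futures
  })
  data.frame(time = pred_t + tp, obs = x[pred_t + tp], pred = forecasts)
}

# Synthetic chaotic series (logistic map) used only for demonstration.
x <- numeric(200)
x[1] <- 0.4
for (t in 1:199) x[t + 1] <- 3.8 * x[t] * (1 - x[t])

rho_by_E <- sapply(1:4, function(E) {
  out <- simplex_sketch(x, E = E, lib = c(1, 100), pred = c(101, 200))
  cor(out$obs, out$pred)
})
print(round(rho_by_E, 3))   # forecast skill for E = 1, 2, 3, 4
```

The embedding dimension with the highest rho (or lowest error) would be the candidate to carry into later analyses.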

s_map is typically run with a fixed embedding dimension and varying theta to test for nonlinear dynamics in the data. The defaults are therefore set so that passing a time series as the only argument runs over a default list of theta values (0, 0.0001, 0.0003, 0.001, 0.003, 0.01, 0.03, 0.1, 0.3, 0.5, 0.75, 1.0, 1.5, 2, 3, 4, 6, and 8), uses E = 1 and leave-one-out cross-validation over the whole time series, and returns just the forecast statistics.
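The role of theta can be illustrated with the exponential weighting commonly given for S-maps in the literature, w = exp(-theta * d / mean(d)), where d holds the distances from the target point to the library vectors. Whether the package uses exactly this form is an assumption, and `smap_weights` is a hypothetical helper.

```r
# Hypothetical helper (not part of rEDM): S-map style locality weights.
smap_weights <- function(d, theta) {
  exp(-theta * d / mean(d))   # distances scaled by their mean
}

d <- c(0.1, 0.5, 1.0, 2.0)    # example distances to four library vectors
w0 <- smap_weights(d, theta = 0)
w2 <- smap_weights(d, theta = 2)
print(w0)                     # every vector weighted equally
print(round(w2, 3))           # nearer vectors weighted more heavily
```

With theta = 0 all library vectors get the same weight, so the fitted map is a single global linear model. Larger theta values localize the fit, so forecast skill that improves as theta increases suggests nonlinear dynamics.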

norm_type "L2 norm" (default) uses the typical Euclidean distance: $$distance(a,b) := \sqrt{\sum_i{(a_i - b_i)^2}}$$

norm_type "L1 norm" uses the Manhattan distance: $$distance(a,b) := \sum_i{|a_i - b_i|}$$

norm_type "P norm" uses the LP norm, generalizing the L1 and L2 norms to use $p$ as the exponent: $$distance(a,b) := \left(\sum_i{|a_i - b_i|^p}\right)^{1/p}$$
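The three distance functions can be written directly in base R. The function names below are hypothetical and serve only to check the formulas: the P norm reduces to the L1 norm at p = 1 and to the L2 norm at p = 2.

```r
# Hypothetical helpers (not part of rEDM) mirroring the three norm_type options.
dist_l2 <- function(a, b) sqrt(sum((a - b)^2))                # Euclidean
dist_l1 <- function(a, b) sum(abs(a - b))                     # Manhattan
dist_p <- function(a, b, p = 0.5) sum(abs(a - b)^p)^(1 / p)   # general exponent p

a <- c(0, 0)
b <- c(3, 4)
print(dist_l2(a, b))          # 5
print(dist_l1(a, b))          # 7
print(dist_p(a, b, p = 2))    # matches dist_l2
print(dist_p(a, b, p = 1))    # matches dist_l1
```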

Examples

library(rEDM)
data("two_species_model")
ts <- two_species_model$x[1:200]

# simplex: reconstruct from rows 1 to 100, forecast rows 101 to 200 (default E = 1:10)
simplex(ts, lib = c(1, 100), pred = c(101, 200))

# simplex: leave-one-out over the whole series, returning the full output
simplex(ts, stats_only = FALSE)

# s_map: fixed E = 2 over the default theta values
s_map(ts, E = 2)

# s_map: a single theta, with the S-map coefficients included in the output
s_map(ts, E = 2, theta = 1, save_smap_coefficients = TRUE)
