proteus_random_search fine-tunes proteus by running a random search over its hyper-parameter space (predefined or custom).
proteus_random_search(
n_samp,
data,
target,
future,
past = NULL,
ci = 0.8,
smoother = FALSE,
t_embed = NULL,
activ = NULL,
nodes = NULL,
distr = NULL,
optim = NULL,
loss_metric = "crps",
epochs = 30,
lr = NULL,
patience = 10,
latent_sample = 100,
verbose = TRUE,
stride = NULL,
dates = NULL,
rolling_blocks = FALSE,
n_blocks = 4,
block_minset = 10,
error_scale = "naive",
error_benchmark = "naive",
batch_size = 30,
min_default = 1,
seed = 42,
future_plan = "future::multisession",
omit = FALSE,
keep = FALSE
)
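A minimal call might look like the sketch below. The data frame, its column names (x, y) and the reduced settings (n_samp = 3, epochs = 5, a sequential plan) are illustrative assumptions, not recommended values; every other argument is left at the default shown in the signature above. The call is guarded so that it only runs when the proteus package and its torch backend are installed.

```r
# Synthetic data: two illustrative time features (made-up values).
set.seed(42)
n <- 200
df <- data.frame(
  x = cumsum(rnorm(n)),
  y = sin(seq_len(n) / 10) + rnorm(n, sd = 0.1)
)

# Run the search only when proteus and the torch backend are available.
if (requireNamespace("proteus", quietly = TRUE) && torch::torch_is_installed()) {
  search <- proteus::proteus_random_search(
    n_samp = 3,                          # number of sampled configurations
    data = df,
    target = c("x", "y"),                # features to be analyzed jointly
    future = 5,                          # time-steps to predict
    epochs = 5,                          # kept small for a quick trial
    future_plan = "future::sequential",  # no parallel workers
    seed = 42
  )
  print(search$random_search)  # sampled hyper-parameters and average errors
  print(search$time_log)       # computation time
}
```

Running time depends on the sampled architectures, so a small n_samp is a reasonable starting point before widening the search.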
This function returns a list including:
random_search: summary of the sampled hyper-parameters and average error metrics.
best: best model according to overall ranking on all average error metrics (for negative metrics, absolute value is considered).
all_models: list with all generated models (only if keep is set to TRUE).
time_log: computation time.
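The selection of the best model can be pictured with the base R sketch below. It is not the package's internal code: the metric names (me, mae, mase), the numbers, and the choice of averaging the per-metric ranks are assumptions made for illustration. The only rule taken from the description above is that negative metrics are compared by absolute value.

```r
# Hypothetical summary table: one row per sampled model, illustrative numbers.
metrics <- data.frame(
  model = c("m1", "m2", "m3"),
  me    = c(-0.40, 0.10, -0.05),
  mae   = c(0.90, 0.70, 0.80),
  mase  = c(1.10, 0.95, 0.90)
)

# Negative metrics are compared by absolute value.
abs_metrics <- abs(metrics[, c("me", "mae", "mase")])

# Rank every metric (1 = smallest error), then average the ranks per model.
ranks <- sapply(abs_metrics, rank, ties.method = "average")
overall <- rowMeans(ranks)

# The model with the lowest average rank wins.
best <- metrics$model[which.min(overall)]
best
```

With these made-up numbers, m3 has the lowest average rank and is selected.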
n_samp: Positive integer. Number of models to be randomly generated by sampling the hyper-parameter space.
data: A data frame with time features in columns and, optionally, a date column.
target: Vector of strings. Names of the time features to be jointly analyzed.
future: Positive integer. Number of time-steps to be predicted.
past: Positive integer. Length of past sequences. Default: NULL (search range future:(2*future)).
ci: Positive numeric. Confidence interval. Default: 0.8.
smoother: Logical. Perform optimal smoothing using standard loess for each time feature. Default: FALSE.
t_embed: Positive integer. Number of embeddings for the temporal dimension. Minimum value is 2. Default: NULL (search range 2:30).
activ: String. Activation function to be used by the forward network. Implemented functions are: "linear", "mish", "swish", "leaky_relu", "celu", "elu", "gelu", "selu", "bent", "softmax", "softmin", "softsign", "softplus", "sigmoid", "tanh". Default: NULL (full-option search).
nodes: Positive integer. Nodes for the forward neural net. Default: NULL (search range 2:1024).
distr: String. Distribution to be used by the variational model. Implemented distributions are: "normal", "genbeta", "gev", "gpd", "genray", "cauchy", "exp", "logis", "chisq", "gumbel", "laplace", "lognorm", "skewed". Default: NULL (full-option search).
optim: String. Optimization method. Implemented methods are: "adadelta", "adagrad", "rmsprop", "rprop", "sgd", "asgd", "adam". Default: NULL (full-option search).
loss_metric: String. Loss function for the variational model. Three options: "elbo", "crps", "score". Default: "crps".
epochs: Positive integer. Number of training epochs. Default: 30.
lr: Positive numeric. Learning rate. Default: NULL (search range 0.001 to 0.1).
patience: Positive integer. Waiting time (in epochs) before evaluating the overfit performance. Default: 10.
latent_sample: Positive integer. Number of samples to draw from the latent variables. Default: 100.
verbose: Logical. Default: TRUE.
stride: Positive integer. Number of shifting positions for sequence generation. Default: NULL (search range 1:3).
dates: String. Label of the feature where dates are located. Default: NULL (progressive numbering).
rolling_blocks: Logical. Option for incremental or rolling window. Default: FALSE.
n_blocks: Positive integer. Number of distinct blocks for back-testing. Default: 4.
block_minset: Positive integer. Minimum number of sequences needed to create a block. Default: 10.
error_scale: String. Scale for the scaled error metrics (for continuous variables). Two options: "naive" (average of naive one-step absolute error for the historical series) or "deviation" (standard error of the historical series). Default: "naive".
error_benchmark: String. Benchmark for the relative error metrics (for continuous variables). Two options: "naive" (sequential extension of last value) or "average" (mean value of true sequence). Default: "naive".
batch_size: Positive integer. Batch size used during training. Default: 30.
min_default: Positive numeric. Minimum differentiation iteration. Default: 1.
seed: Random seed. Default: 42.
future_plan: String. How to resolve the future parallelization. Options are: "future::sequential", "future::multisession", "future::multicore". For more information, see the documentation of the future package. Default: "future::multisession".
omit: Logical. Set to TRUE to remove missing values; otherwise all gaps, both in dates and values, are filled with a Kalman filter. Default: FALSE.
keep: Logical. Set to TRUE to keep all the explored models. Default: FALSE.
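To make the default search ranges concrete, the base R sketch below draws a few random configurations from them. It is only an illustration of the idea of random search: drawing each hyper-parameter independently and uniformly is an assumption, and the package may sample differently. The values future = 10 and n_samp = 5 are arbitrary.

```r
set.seed(42)

future <- 10  # forecasting horizon, chosen for illustration
n_samp <- 5   # number of configurations to draw

# Option sets as listed in the argument descriptions.
activ_opts <- c("linear", "mish", "swish", "leaky_relu", "celu", "elu", "gelu",
                "selu", "bent", "softmax", "softmin", "softsign", "softplus",
                "sigmoid", "tanh")
distr_opts <- c("normal", "genbeta", "gev", "gpd", "genray", "cauchy", "exp",
                "logis", "chisq", "gumbel", "laplace", "lognorm", "skewed")
optim_opts <- c("adadelta", "adagrad", "rmsprop", "rprop", "sgd", "asgd", "adam")

# One row per sampled configuration; each column uses its default search range.
grid <- data.frame(
  past    = sample(future:(2 * future), n_samp, replace = TRUE),
  t_embed = sample(2:30, n_samp, replace = TRUE),
  activ   = sample(activ_opts, n_samp, replace = TRUE),
  nodes   = sample(2:1024, n_samp, replace = TRUE),
  distr   = sample(distr_opts, n_samp, replace = TRUE),
  optim   = sample(optim_opts, n_samp, replace = TRUE),
  lr      = runif(n_samp, min = 0.001, max = 0.1),
  stride  = sample(1:3, n_samp, replace = TRUE)
)
grid
```

Passing a non-NULL value for any of these arguments replaces the default range for that hyper-parameter, which is how a custom search space is defined.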
Giancarlo Vercellino giancarlo.vercellino@gmail.com
https://rpubs.com/giancarlo_vercellino/proteus