grid_max_entropy: Space-filling parameter grids

Description

Experimental designs for computer experiments are used to construct parameter grids that cover the parameter space so that every portion of the space has an observed combination that is not too far from it.

Usage

grid_max_entropy(
  x,
  ...,
  size = 3,
  original = TRUE,
  variogram_range = 0.5,
  iter = 1000
)
# S3 method for parameters
grid_max_entropy(
  x,
  ...,
  size = 3,
  original = TRUE,
  variogram_range = 0.5,
  iter = 1000
)
# S3 method for list
grid_max_entropy(
  x,
  ...,
  size = 3,
  original = TRUE,
  variogram_range = 0.5,
  iter = 1000
)
# S3 method for param
grid_max_entropy(
  x,
  ...,
  size = 3,
  original = TRUE,
  variogram_range = 0.5,
  iter = 1000
)
# S3 method for workflow
grid_max_entropy(
  x,
  ...,
  size = 3,
  original = TRUE,
  variogram_range = 0.5,
  iter = 1000
)
grid_latin_hypercube(x, ..., size = 3, original = TRUE)
# S3 method for parameters
grid_latin_hypercube(x, ..., size = 3, original = TRUE)
# S3 method for list
grid_latin_hypercube(x, ..., size = 3, original = TRUE)
# S3 method for param
grid_latin_hypercube(x, ..., size = 3, original = TRUE)
# S3 method for workflow
grid_latin_hypercube(x, ..., size = 3, original = TRUE)

Arguments

x

A param object, list, or parameters.

...

One or more param objects (such as mtry() or penalty()). None of the objects can have unknown() values in the parameter ranges or values.

size

A single integer for the total number of parameter value combinations returned. If duplicate combinations are generated from this size, the smaller, unique set is returned.

original

A logical: should the parameters be in the original units or in the transformed space (if any)?

variogram_range

A numeric value greater than zero. Larger values reduce the likelihood of empty regions in the parameter space.

iter

An integer for the maximum number of iterations used to find a good design.
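The effect of size and original can be seen in a short sketch. It assumes the dials package is attached and that penalty() keeps its usual log10 scale with range [-10, 0]; the object names grid_trans, grid_orig, and n_act are only illustrative.

```r
library(dials)

set.seed(1)
# Same design request, returned in transformed (log10) units ...
grid_trans <- grid_latin_hypercube(penalty(), size = 5, original = FALSE)
# ... and in the original units
grid_orig <- grid_latin_hypercube(penalty(), size = 5, original = TRUE)

range(grid_trans$penalty)  # log10 units
range(grid_orig$penalty)   # original units

# A qualitative parameter has only a handful of levels, so a request for
# 20 combinations may come back with fewer rows (see the size argument)
n_act <- nrow(grid_latin_hypercube(activation(), size = 20))
n_act
```

With original = FALSE the values stay on the scale used to define the range, which is usually the more convenient scale for plotting a design.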

Details

The types of designs supported here are Latin hypercube designs and maximum entropy designs, which attempt to maximize the determinant of the spatial correlation matrix between coordinates. Both designs use random sampling of points in the parameter space.
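The two ideas can be illustrated in base R on the unit cube. This is a rough sketch and not the implementation used by these functions: the helper names lhs_unit, log_det_corr, and best_of are hypothetical, the exponential correlation function is an assumption made for the illustration, and the search is a plain random search.

```r
# Latin hypercube on the unit cube: each column places exactly one point
# in each of n equal-width strata, in a random order
lhs_unit <- function(n, p) {
  sapply(seq_len(p), function(j) (sample(n) - runif(n)) / n)
}

# Log-determinant of a spatial correlation matrix for a design.
# The exponential correlation exp(-d / vrange) is an assumed choice.
log_det_corr <- function(design, vrange = 0.5) {
  d <- as.matrix(dist(design))
  corr <- exp(-d / vrange)
  as.numeric(determinant(corr, logarithm = TRUE)$modulus)
}

# Crude random search: keep the candidate with the largest log-determinant
best_of <- function(n, p, iter = 200, vrange = 0.5) {
  best <- NULL
  best_score <- -Inf
  for (i in seq_len(iter)) {
    cand <- matrix(runif(n * p), ncol = p)
    score <- log_det_corr(cand, vrange)
    if (score > best_score) {
      best <- cand
      best_score <- score
    }
  }
  list(design = best, score = best_score)
}

set.seed(283)
lhs <- lhs_unit(n = 10, p = 2)
# Stratum index of every point; each column is a permutation of 1:10
strata <- ceiling(lhs * 10)

searched <- best_of(n = 10, p = 2)
searched$score
```

Points that sit close together make rows of the correlation matrix nearly identical and push the determinant toward zero, so favoring a large determinant favors designs whose points are spread out.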

Note that there may be a difference in grids depending on how the function is called. If the call uses the parameter objects directly, the possible ranges come from the objects in dials. For example:

cost()

## Cost  (quantitative)
## Transformer:  log-2 
## Range (transformed scale): [-10, -1]

set.seed(283)
cost_grid_1 <- grid_latin_hypercube(cost(), size = 1000)
range(log2(cost_grid_1$cost))

## [1] -9.998623 -1.000423

However, in some cases, the tune package overrides the default ranges for specific models. If the grid function uses a parameters object created from a model or recipe, the ranges may have different defaults (specific to those models). Continuing the example above, the default range for cost is different for SVM models:

library(parsnip)
library(tune)
# When used in tune, the log2 range is [-10, 5]
svm_mod <-
  svm_rbf(cost = tune()) %>%
  set_engine("kernlab")
set.seed(283)
cost_grid_2 <- grid_latin_hypercube(parameters(svm_mod), size = 1000)
range(log2(cost_grid_2$cost))

## [1] -9.997704  4.999296
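One way to avoid depending on which defaults apply is to state the range explicitly on the parameter object. The sketch below assumes the dials package is attached and uses the range argument of cost(), which is given on the transformed (log2) scale; the name cost_grid_3 is only illustrative.

```r
library(dials)

set.seed(283)
# Ask for the wider log2 range directly instead of relying on a default
cost_grid_3 <- grid_latin_hypercube(cost(range = c(-10, 5)), size = 1000)
range(log2(cost_grid_3$cost))
```

The resulting grid spans the same log2 interval whether or not a model-specific default would have applied.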

References

Sacks, J., Welch, W. J., Mitchell, T. J., and Wynn, H. P. (1989). Design and analysis of computer experiments. With comments and a rejoinder by the authors. Statistical Science, 4. doi:10.1214/ss/1177012413.

Santner, T., Williams, B., and Notz, W. (2003). The Design and Analysis of Computer Experiments. Springer.

Dupuy, D., Helbert, C., and Franco, J. (2015). DiceDesign and DiceEval: Two R packages for design and analysis of computer experiments. Journal of Statistical Software, 65(11).

Examples

# NOT RUN {
grid_max_entropy(
  hidden_units(),
  penalty(),
  epochs(),
  activation(),
  learn_rate(c(0, 1), trans = scales::log_trans()),
  size = 10,
  original = FALSE)

grid_latin_hypercube(penalty(), mixture(), original = TRUE)
# }
