generateDesign: Generates a statistical design for a parameter set.

Description

The following types of columns are created:

numeric(vector)	`numeric`
integer(vector)	`integer`
discrete(vector)	`factor` (names of values = levels)

If you want to convert these, look at convertDataFrameCols. Dependent parameters whose constraints are unsatisfied generate NA entries in their respective columns. For discrete vectors the levels and their order will be preserved, even if not all levels are present.
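As a minimal sketch of the column types described above (the parameter names `u`, `k` and `method` are made up for illustration, and it is assumed here that `convertDataFrameCols` is the function of that name from package BBmisc):

```r
library(ParamHelpers)
library(BBmisc)  # assumption: convertDataFrameCols comes from BBmisc

# hypothetical parameter set with one parameter of each basic type
ps = makeParamSet(
  makeNumericParam("u", lower = 0, upper = 1),
  makeIntegerParam("k", lower = 1, upper = 5),
  makeDiscreteParam("method", values = c("a", "b", "c"))
)
des = generateDesign(6, ps)
sapply(des, class)   # one class per design column

# convert the factor column to character
des.chr = convertDataFrameCols(des, factors.as.char = TRUE)
sapply(des.chr, class)
```

The factor column should keep all three levels in their original order even if one of them happens not to be sampled.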

Currently only lhs designs are supported.

The algorithm currently iterates the following steps:

We create a space filling design for all parameters, disregarding requires, a trafo or the forbidden region.
Forbidden points are removed.
Parameters are transformed if argument trafo is TRUE; dependent parameters whose constraints are unsatisfied are set to NA.
Duplicated design points are removed. A reasonable space-filling design does not generate duplicated points by itself, but the way discrete parameters and parameter dependencies are handled makes them possible.
If we removed some points, we now try to augment the design in a space-filling way and iterate.

Note that augmenting is currently somewhat experimental: we simply generate the missing points via new calls to randomLHS and do not place them so that they are maximally far away from the points already present. The latter is quite hard to achieve with complicated dependencies and forbidden regions if one wants to ensure that points actually get added. We are working on it.
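The iteration described above can be illustrated with a simplified, stand-alone sketch. This is not the code used inside generateDesign; the two integer parameters, the forbidden region `is.forbidden` and the variable names are assumptions chosen only to show the remove-and-refill loop.

```r
library(lhs)

n = 10L          # requested design size
max.tries = 20L  # plays the role of argument augment

# hypothetical forbidden region on a 5 x 5 integer grid
is.forbidden = function(d) d$x1 + d$x2 > 6

des = data.frame(x1 = integer(0), x2 = integer(0))
for (i in seq_len(max.tries)) {
  n.missing = n - nrow(des)
  if (n.missing == 0L) break
  # draw only the missing points, ignoring the forbidden region for now
  u = randomLHS(n.missing, 2)
  cand = data.frame(
    x1 = as.integer(floor(u[, 1] * 5)),  # map (0, 1) to integers 0..4
    x2 = as.integer(floor(u[, 2] * 5))
  )
  # drop forbidden points, then drop duplicates caused by the discretization
  cand = cand[!is.forbidden(cand), , drop = FALSE]
  des = unique(rbind(des, cand))
}
nrow(des)  # at most n; smaller if the tries were not sufficient
```

As in the note above, the refill points come from fresh randomLHS calls and are not placed relative to the points already in the design.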

Note that if you have trafos attached to your params, the complete creation of the design (except for the detection of invalid parameters w.r.t. their requires setting) takes place on the UNTRANSFORMED scale. So this function creates, e.g., a maximin LHS design on the UNTRANSFORMED scale, but not necessarily on the transformed scale.
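A small sketch of what this means in practice, assuming a hypothetical parameter `lambda` that is sampled on a log10 scale:

```r
library(ParamHelpers)

# hypothetical parameter: design is built on [-3, 3], trafo maps to [1e-3, 1e3]
ps = makeParamSet(
  makeNumericParam("lambda", lower = -3, upper = 3, trafo = function(x) 10^x)
)

des.raw = generateDesign(8, ps, trafo = FALSE)
des.trafo = generateDesign(8, ps, trafo = TRUE)

range(des.raw$lambda)    # lies within [-3, 3]
range(des.trafo$lambda)  # lies within [1e-3, 1e3]
```

The points are spread evenly over the exponent range, so on the transformed scale they are expected to cluster towards small values of `lambda`.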

generateDesign will NOT work if there are dependencies over multiple levels of parameters and the dependency is only given with respect to the “previous” parameter. A current workaround is to state all dependencies on all parameters involved. (We are working on it.)
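The workaround might look like the following sketch. The parameters `a`, `b` and `c` are hypothetical: `c` only makes sense if `b == "u"`, and `b` only makes sense if `a == "on"`, so the requirement on `c` repeats the condition on `a` instead of relying on `b` alone.

```r
library(ParamHelpers)

ps = makeParamSet(
  makeDiscreteParam("a", values = c("on", "off")),
  makeDiscreteParam("b", values = c("u", "v"),
    requires = quote(a == "on")),
  # state the full chain of dependencies, not only the one on b
  makeNumericParam("c", lower = 0, upper = 1,
    requires = quote(a == "on" & b == "u"))
)
des = generateDesign(10, ps)
des
```

Rows with `a == "off"` should show NA for both `b` and `c`, and rows with `b == "v"` should show NA for `c`.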

Usage

generateDesign(n = 10L, par.set, fun, fun.args = list(), trafo = FALSE,
  augment = 20L)

Arguments

n

[integer(1)] Number of samples in design. Default is 10.

par.set

[ParamSet] Parameter set.

fun

[function] Function from package lhs. Possible are: maximinLHS, randomLHS, geneticLHS, improvedLHS, optAugmentLHS, optimumLHS. Default is randomLHS.

fun.args

[list] List of further arguments passed to fun.

trafo

[logical(1)] Transform all parameters by using their respective transformation functions. Default is FALSE.

augment

[integer(1)] Duplicated values and forbidden regions in the parameter space can lead to the design becoming smaller than n. With this option it is possible to augment the design again to size n. It is not guaranteed that this always works (to full size); augment specifies the number of tries to augment. If the design is of size less than n after all tries, a warning is issued and the smaller design is returned. Default is 20.
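As a sketch of how these arguments combine, the call below swaps in maximinLHS and passes one extra argument through fun.args. The value `dup = 2` is only an illustrative choice for the `dup` argument of maximinLHS, and the parameter set is made up.

```r
library(ParamHelpers)
library(lhs)

ps = makeParamSet(
  makeNumericParam("x1", lower = -2, upper = 1),
  makeIntegerParam("x2", lower = 10, upper = 20)
)

# maximin design, with at most 5 tries to refill removed points
des = generateDesign(n = 10L, par.set = ps, fun = maximinLHS,
  fun.args = list(dup = 2), trafo = FALSE, augment = 5L)
nrow(des)
```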

Value

[data.frame]. Columns are named by the ids of the parameters. If the par.set argument contains a vector parameter, its corresponding column names in the design are the parameter id concatenated with 1 to dimension of the vector. The result will have a logical(1) attribute “trafo”, which is set to the value of argument trafo.
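A short sketch of the naming rule and the attribute, using a made-up parameter set with one scalar and one vector parameter:

```r
library(ParamHelpers)

ps = makeParamSet(
  makeNumericParam("x", lower = -2, upper = 1),
  makeNumericVectorParam("y", len = 2, lower = 0, upper = 1)
)
des = generateDesign(5, ps)

colnames(des)       # the vector parameter y is spread over columns y1 and y2
attr(des, "trafo")  # mirrors the trafo argument, here the default FALSE
```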

Examples


# NOT RUN {
library(ParamHelpers)  # provides makeParamSet and generateDesign
ps = makeParamSet(
  makeNumericParam("x1", lower = -2, upper = 1),
  makeIntegerParam("x2", lower = 10, upper = 20)
)
# random latin hypercube design with 5 samples:
generateDesign(5, ps)

# with trafo
ps = makeParamSet(
  makeNumericParam("x", lower = -2, upper = 1),
  makeNumericVectorParam("y", len = 2, lower = 0, upper = 1, trafo = function(x) x/sum(x))
)
generateDesign(10, ps, trafo = TRUE)
# }
