Executes an independent stochastic hill-climbing search (random descent) per objective to approximate the optimal solution for each objective, from which a suitable normalization range is inferred based on the Pareto minima/maxima. These normalization searches are executed in parallel.
getNormalizationRanges(
data,
obj,
size = 0.2,
always.selected = integer(0),
never.selected = integer(0),
mode = c("default", "fast"),
time = NA,
impr.time = NA,
steps = NA,
impr.steps = NA
)
Numeric matrix with one row per objective and two columns:
lower
Lower bound of normalization range.
upper
Upper bound of normalization range.
Core Hunter data (chdata) containing genotypes, phenotypes and/or a precomputed distance matrix. Can also be an object of class chdist, chgeno or chpheno if only one type of data is provided.
List of objectives (chobj). If no objectives are specified, Core Hunter maximizes a weighted index including the default entry-to-nearest-entry distance (EN) for each available data type. For genotypes, the Modified Roger's distance (MR) is used. For phenotypes, Gower's distance (GD) is applied.
Desired core subset size (numeric). If larger than one, the value is used as the absolute core size after rounding. Otherwise, it is used as the sampling rate and multiplied by the dataset size to determine the size of the core. The default sampling rate is 0.2.
Vector with indices (integer) or ids (character) of items that should always be selected in the core collection.
Vector with indices (integer) or ids (character) of items that should never be selected in the core collection.
Execution mode (default or fast). In default mode, the normalization searches terminate when no improvement is found for ten seconds. In fast mode, searches terminate as soon as no improvement is made for two seconds. These stop conditions can be overridden using arguments time, impr.time, steps and/or impr.steps. In default mode, the value of the latter two, step-based conditions is multiplied by 500, in line with the behaviour of sampleCore when executed in default mode.
Absolute runtime limit in seconds. Not used by default (NA). If used, it should be a strictly positive value, which is rounded to the nearest integer.
Maximum time without improvement in seconds. If no explicit stop conditions are specified, the maximum time without improvement defaults to ten or two seconds when executing Core Hunter in default or fast mode, respectively. If a custom improvement time is specified, it should be strictly positive and is rounded to the nearest integer.
Maximum number of search steps. Not used by default (NA). If used, it should be a strictly positive value, which is rounded to the nearest integer. In default mode, the value is multiplied by 500, in line with the behaviour of sampleCore when executed in default mode.
Maximum number of steps without improvement. Not used by default (NA). If used, it should be a strictly positive value, which is rounded to the nearest integer. In default mode, the value is multiplied by 500, in line with the behaviour of sampleCore when executed in default mode.
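The interaction between mode and the four stop conditions described above can be sketched in plain R. This is only an illustration of the documented rules, not Core Hunter's actual code: the helper name resolve_stop_conditions is hypothetical, and rounding before applying the factor of 500 is an assumption, since the order is not stated.

```r
# Hypothetical helper (not part of corehunter) that mirrors the documented rules.
resolve_stop_conditions <- function(mode = c("default", "fast"),
                                    time = NA, impr.time = NA,
                                    steps = NA, impr.steps = NA) {
  mode <- match.arg(mode)
  limits <- c(time = time, impr.time = impr.time,
              steps = steps, impr.steps = impr.steps)
  storage.mode(limits) <- "double"
  given <- !is.na(limits)
  # Explicit limits must be strictly positive and are rounded to integers.
  if (any(limits[given] <= 0)) {
    stop("stop conditions must be strictly positive")
  }
  limits[given] <- round(limits[given])
  # With no explicit stop condition, fall back to the mode-specific
  # maximum time without improvement (ten or two seconds).
  if (!any(given)) {
    limits["impr.time"] <- if (mode == "default") 10 else 2
  }
  # In default mode, step-based limits are multiplied by 500
  # (assumption: the factor is applied after rounding).
  if (mode == "default") {
    step_based <- c("steps", "impr.steps")
    limits[step_based] <- limits[step_based] * 500
  }
  limits
}

resolve_stop_conditions("default")              # only impr.time is set
resolve_stop_conditions("default", steps = 10)  # steps scaled by 500
resolve_stop_conditions("fast", steps = 10)     # steps left unscaled
```

Unused conditions stay NA in the returned vector, matching the "Not used by default (NA)" convention of the arguments.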
For an objective that is being maximized, the upper bound is set to the value of the best solution for that objective, while the lower bound is set to the Pareto minimum, i.e. the minimum value obtained when evaluating all optimal solutions (for each single objective) with the considered objective. For an objective that is being minimized, the roles of upper and lower bound are interchanged, and the Pareto maximum is used instead.
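The rule above can be illustrated with a small base R sketch. It assumes a hypothetical square matrix evals in which evals[i, j] holds the value of objective j evaluated on the best solution found for objective i; both the matrix and the helper normalization_ranges are made up for illustration and are not part of the corehunter API.

```r
# Hypothetical helper: derive normalization ranges from cross-evaluations.
# evals[i, j] = value of objective j on the best solution found for objective i.
# maximizing[j] = TRUE if objective j is maximized, FALSE if minimized.
normalization_ranges <- function(evals, maximizing) {
  stopifnot(is.matrix(evals), nrow(evals) == ncol(evals),
            length(maximizing) == ncol(evals))
  bounds <- vapply(seq_len(ncol(evals)), function(j) {
    best <- evals[j, j]  # value of the solution optimized for objective j
    if (maximizing[j]) {
      c(min(evals[, j]), best)   # lower = Pareto minimum, upper = best
    } else {
      c(best, max(evals[, j]))   # lower = best, upper = Pareto maximum
    }
  }, FUN.VALUE = c(lower = 0, upper = 0))
  t(bounds)  # one row per objective, columns "lower" and "upper"
}

# Made-up values for two maximized objectives.
evals <- matrix(c(0.8, 0.5,
                  0.6, 0.9), nrow = 2, byrow = TRUE)
ranges <- normalization_ranges(evals, maximizing = c(TRUE, TRUE))
ranges
```

The result has the same shape as the matrix returned by getNormalizationRanges: one row per objective and the columns lower and upper.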
Because Core Hunter uses stochastic algorithms, repeated runs may produce different results. To eliminate randomness, you may set a random number generation seed using set.seed prior to executing Core Hunter. In addition, when reproducible results are desired, it is advised to use step-based stop conditions instead of the (default) time-based criteria, because runtimes may be affected by external factors, and, therefore, a different number of steps may have been performed in repeated runs when using time-based stop conditions.
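A reproducible call following this advice might look like the sketch below. It only uses functions that appear on this page (exampleData, objective, getNormalizationRanges) together with base R's set.seed. The seed value and the step limit of 500 are arbitrary choices, and the requireNamespace guard is added only so that the snippet does nothing when corehunter is not installed.

```r
# Sketch: fixed seed plus a step-based stop condition for repeatable ranges.
if (requireNamespace("corehunter", quietly = TRUE)) {
  library(corehunter)
  data <- exampleData()
  objectives <- list(objective("EN", "MR"), objective("EN", "GD"))

  # Same seed and same step limit for both runs (arbitrary values).
  set.seed(42)
  run1 <- getNormalizationRanges(data, obj = objectives,
                                 mode = "fast", steps = 500)
  set.seed(42)
  run2 <- getNormalizationRanges(data, obj = objectives,
                                 mode = "fast", steps = 500)

  # Compare the two range matrices.
  print(identical(run1, run2))
}
```

In fast mode the step limit is used as given; in default mode it would be multiplied by 500.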
coreHunterData, objective
# \donttest{
data <- exampleData()
# maximize entry-to-nearest-entry distance between genotypes and phenotypes (equal weight)
objectives <- list(objective("EN", "MR"), objective("EN", "GD"))
# get normalization ranges for default size (20%)
ranges <- getNormalizationRanges(data, obj = objectives, mode = "fast")
# set normalization ranges and sample core
objectives <- lapply(1:2, function(o){setRange(objectives[[o]], ranges[o,])})
core <- sampleCore(data, obj = objectives)
# }