The following optimization objectives are supported by Core Hunter:
EN
Average entry-to-nearest-entry distance (default). Maximizes the average distance between each selected individual and the closest other selected item in the core. Favors diverse cores in which each individual is sufficiently different from the most similar other selected item (low redundancy). Multiple distance measures are provided to be used with this objective (see below).
AN
Average accession-to-nearest-entry distance. Minimizes the average distance between each individual (from the full dataset) and the closest selected item in the core (which can be the individual itself). Favors representative cores in which all items from the original dataset are represented by similar individuals in the selected subset. Multiple distance measures are provided to be used with this objective (see below).
EE
Average entry-to-entry distance. Maximizes the average distance between each pair of selected individuals in the core. This objective is related to the entry-to-nearest-entry (EN) distance but less effectively avoids redundant, similar individuals in the core. In general, use of EN is preferred. Multiple distance measures are provided to be used with this objective (see below).
SH
Shannon's allelic diversity index. Maximizes the entropy, as used in information theory, of the selected core. Independently takes into account all allele frequencies, regardless of the locus (marker) to which the allele belongs. Requires genotypes.
HE
Expected proportion of heterozygous loci. Maximizes the expected proportion of heterozygous loci in offspring produced from random crossings within the selected core. In contrast to Shannon's index (SH), this objective treats each marker (locus) with equal importance, regardless of the number of possible alleles for that marker. Requires genotypes.
CV
Allele coverage. Maximizes the proportion of alleles observed in the full dataset that are retained in the selected core. Requires genotypes.
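The three genotype-based objectives above can be illustrated with a small base R sketch. It is not Core Hunter's implementation: the data layout (rows are individuals, columns are alleles, entries are allele frequencies), the helper names, and the choice to rescale frequencies by the number of markers before computing the entropy are all assumptions made for illustration.

```r
# Toy data (assumed layout): 4 individuals, marker M1 with 2 alleles and
# marker M2 with 3 alleles. Within each marker a row's frequencies sum to 1.
geno <- matrix(
  c(1.0, 0.0,  1.0, 0.0, 0.0,
    0.5, 0.5,  0.0, 1.0, 0.0,
    0.0, 1.0,  0.0, 0.5, 0.5,
    1.0, 0.0,  1.0, 0.0, 0.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4),
                  c("M1.a", "M1.b", "M2.a", "M2.b", "M2.c"))
)
marker <- c("M1", "M1", "M2", "M2", "M2")  # marker of each allele column

# Average allele frequencies over the selected individuals (hypothetical helper).
core_allele_freqs <- function(geno, core) {
  colMeans(geno[core, , drop = FALSE])
}

# SH: entropy over all alleles pooled across markers. Frequencies are divided
# by the number of markers so that they sum to 1 (an assumption of this sketch).
shannon <- function(geno, marker, core) {
  p <- core_allele_freqs(geno, core) / length(unique(marker))
  p <- p[p > 0]  # alleles absent from the core contribute nothing
  -sum(p * log(p))
}

# HE: per-marker expected heterozygosity, averaged with equal weight per marker.
expected_het <- function(geno, marker, core) {
  p <- core_allele_freqs(geno, core)
  mean(1 - tapply(p^2, marker, sum))
}

# CV: share of the alleles observed in the full dataset that occur in the core.
coverage <- function(geno, core) {
  in_full <- colSums(geno) > 0
  in_core <- colSums(geno[core, , drop = FALSE]) > 0
  sum(in_core & in_full) / sum(in_full)
}

core <- c(1, 3)
shannon(geno, marker, core)
expected_het(geno, marker, core)  # 0.5625
coverage(geno, core)              # 1
```

Selecting the two identical individuals instead (`core <- c(1, 4)`) drops the expected heterozygosity to 0 and the coverage to 0.4, which shows how these objectives penalize redundant selections.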
The first three objective types (EN, AN and EE) aggregate pairwise distances between individuals. These distances can be computed using various measures:
MR
Modified Rogers distance (default). Requires genotypes.
CE
Cavalli-Sforza and Edwards distance. Requires genotypes.
GD
Gower distance. Requires phenotypes.
PD
Precomputed distances. Uses the precomputed distance matrix of the dataset.
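To make the relation between distance measures and the distance-based objectives concrete, here is a minimal base R sketch. It assumes commonly used forms of the Modified Rogers and Cavalli-Sforza and Edwards distances (Euclidean distance on allele frequencies, respectively on their square roots, scaled by the square root of twice the number of markers); Core Hunter's own scaling may differ. All function names and the toy data are hypothetical.

```r
# Toy genotypes (assumed layout): rows are individuals, columns are alleles,
# entries are allele frequencies that sum to 1 within each of the 2 markers.
geno <- matrix(
  c(1.0, 0.0,  1.0, 0.0,
    1.0, 0.0,  0.5, 0.5,
    0.0, 1.0,  0.0, 1.0,
    0.5, 0.5,  0.0, 1.0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ind", 1:4), c("M1.a", "M1.b", "M2.a", "M2.b"))
)
n_markers <- 2

# MR (assumed form): Euclidean distance divided by sqrt(2 * number of markers).
modified_rogers <- function(geno, n_markers) {
  as.matrix(dist(geno)) / sqrt(2 * n_markers)
}

# CE (assumed form): the same expression applied to square-root frequencies.
cavalli_sforza_edwards <- function(geno, n_markers) {
  as.matrix(dist(sqrt(geno))) / sqrt(2 * n_markers)
}

# EN: mean distance from each selected individual to its closest other
# selected individual (needs at least two selected individuals).
entry_to_nearest_entry <- function(d, core) {
  sub <- d[core, core, drop = FALSE]
  diag(sub) <- Inf  # an individual is not its own nearest neighbour
  mean(apply(sub, 1, min))
}

# AN: mean distance from every individual in the dataset to its closest
# selected individual (zero for the selected individuals themselves).
accession_to_nearest_entry <- function(d, core) {
  mean(apply(d[, core, drop = FALSE], 1, min))
}

# EE: mean distance over all pairs of selected individuals.
entry_to_entry <- function(d, core) {
  sub <- d[core, core, drop = FALSE]
  mean(sub[upper.tri(sub)])
}

d <- modified_rogers(geno, n_markers)
core <- c(1, 3)
entry_to_nearest_entry(d, core)      # 1
accession_to_nearest_entry(d, core)
entry_to_entry(d, core)              # 1
```

Any symmetric distance matrix can be passed as `d`, which mirrors the idea behind the PD measure: the aggregation step does not depend on how the distances were obtained.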
objective(
  type = c("EN", "AN", "EE", "SH", "HE", "CV"),
  measure = c("MR", "CE", "GD", "PD"),
  weight = 1,
  range = NULL
)
Core Hunter objective of class chobj with elements
type
Objective type.
meas
Distance measure (if applicable).
weight
Assigned weight.
range
Normalization range (if specified).
Objective type, one of EN (default), AN, EE, SH, HE or CV (see description). The first three objectives are distance based and require a distance measure to be chosen. By default, Modified Rogers distance is used, computed from the genotypes.
Distance measure used to compute the distance between two individuals, one of MR (default), CE, GD or PD (see description). Ignored when type is SH, HE or CV.
Weight assigned to the objective when maximizing a weighted index. Defaults to 1.0.
Normalization range [l,u] of the objective when maximizing a weighted index. By default the range is not set (NULL) and will be determined automatically prior to execution, if normalization is enabled (default). Values are rescaled to [0,1] with the linear formula \( v' = (v - l)/(u - l) \).
When an explicit normalization range is set, it overrides the automatically inferred range. Also, setting the range for all included objectives reduces the computation time when sampling a multi-objective core collection. In case of repeated sampling from the same dataset with the same objectives and size, it is therefore advised to determine the normalization ranges only once using getNormalizationRanges so that they can be reused for all executions.
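The rescaling formula and the role of the weights can be sketched in base R as follows. This assumes that the weighted index is a plain weighted sum of normalized objective values and only considers objectives that are maximized; the raw values, ranges, and helper name are hypothetical and do not come from Core Hunter.

```r
# Rescale a raw objective value v to [0, 1] for a normalization range c(l, u):
# v' = (v - l) / (u - l).
normalize_value <- function(v, range) {
  (v - range[1]) / (range[2] - range[1])
}

# Hypothetical raw values of two maximized objectives for one candidate core,
# with explicit normalization ranges and weights per objective.
raw     <- c(EN = 0.225, SH = 4.50)
ranges  <- list(EN = c(0.150, 0.300), SH = c(4.00, 5.00))
weights <- c(EN = 1.0, SH = 0.5)

# Normalize each objective with its own range, then combine with the weights
# (assumption: the weighted index is a simple weighted sum).
normalized     <- mapply(normalize_value, raw, ranges)
weighted_index <- sum(weights * normalized)

normalized      # EN and SH both rescale to 0.5
weighted_index  # 0.75
```

Without normalization the SH value (around 4.5) would dominate the EN value (around 0.2) regardless of the weights, which is why a range per objective matters when several objectives are combined.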
getNormalizationRanges, setRange
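# Default objective: entry-to-nearest-entry (EN) with Modified Rogers distance
objective()
# EN objective computed from the precomputed distance matrix
objective(measure = "PD")
# Entry-to-entry (EE) objective with Gower distance (requires phenotypes)
objective("EE", "GD")
# Expected heterozygosity; no distance measure is needed
objective("HE")
# EN with Modified Rogers distance and an explicit normalization range
objective("EN", "MR", range = c(0.150, 0.300))
# AN with Modified Rogers distance, a weight of 0.5 and an explicit range
objective("AN", "MR", weight = 0.5, range = c(0.150, 0.300))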