These functions are callbacks. They can be passed to the function
GenAlg, which creates genetic algorithm objects. They are then called
repeatedly (once for each individual in the population) each time the
genetic algorithm is updated to the next generation.
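
The calling pattern described above can be sketched in base R. This is
only an illustration, not the package's update code: the function
stepGeneration, its pm argument, and the toy callbacks countOnes and
flipBit are hypothetical names, and selection and crossover are left
out entirely.

```r
# Hypothetical driver: one simplified "generation" that only shows when
# the two callbacks are invoked (no selection, no crossover).
stepGeneration <- function(population, fitfun, mutfun, context, pm = 0.1) {
  # the fitness callback is called once per individual (one row each)
  fitness <- apply(population, 1, fitfun, context = context)
  # each allele is chosen for mutation with probability pm (assumed scheme)
  hit <- matrix(runif(length(population)) < pm, nrow = nrow(population))
  mutated <- population
  for (k in which(hit)) {
    # the mutation callback is called once per chosen allele
    mutated[k] <- mutfun(population[k], context)
  }
  list(fitness = fitness, population = mutated)
}

set.seed(1)
pop <- matrix(rbinom(20, 1, 0.5), nrow = 5)      # 5 individuals, 4 alleles each
countOnes <- function(arow, context) sum(arow)   # toy fitness callback
flipBit <- function(allele, context) 1 - allele  # toy mutation callback
out <- stepGeneration(pop, countOnes, flipBit, context = list())
```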

The simpleMutate function assumes that chromosomes are binary vectors,
so alleles simply take on the value 0 or 1. A mutation of an allele,
therefore, flips its state between those two possibilities.
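
A minimal sketch of such a bit-flip mutation is shown here. The name
flipAllele and its (allele, context) argument list are assumptions
made for illustration; this is not the source of simpleMutate.

```r
# Hypothetical stand-in for a binary mutation callback. The 'context'
# argument is unused and is assumed only to match a two-argument
# callback shape.
flipAllele <- function(allele, context = NULL) {
  stopifnot(all(allele %in% c(0, 1)))  # alleles must be binary
  1 - allele                           # 0 becomes 1, 1 becomes 0
}

flipped <- flipAllele(c(0, 1, 1))
```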

The selectionMutate and selectionFitness functions, by contrast, are
specialized to perform feature selection assuming a fixed number K of
features, with a goal of learning how to distinguish between two
different groups of samples. We assume that the underlying data
consist of a data frame (or matrix), with the rows representing
features (such as genes) and the columns representing samples. In
addition, there must be a grouping vector (or factor) that assigns
each of the sample columns to one of two possible groups. These data
are collected into a list, context, containing a dataset matrix and a
gps factor. An individual member of the population of potential
solutions is encoded as a length-K vector of indices into the rows of
the dataset. An individual allele, therefore, is a single index
identifying a row of the dataset. When mutating it, we assume that it
can be changed into any other possible allele; i.e., any other row
number. To compute the fitness, we use the Mahalanobis distance
between the centers of the two groups defined by the gps factor.
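
The encoding, mutation, and fitness just described can be sketched in
base R as follows. The names mutateIndex and groupDistance are
hypothetical, the data are simulated, and the use of a pooled
within-group covariance (and of the distance rather than its square)
is an assumption; the package functions may differ in these details.
The sketch also assumes K is smaller than the number of samples minus
two, so that the pooled covariance matrix can be inverted.

```r
# Assumed layout, following the description: rows are features,
# columns are samples, and gps assigns each column to one of two groups.
set.seed(42)
context <- list(
  dataset = matrix(rnorm(50 * 20), nrow = 50, ncol = 20),
  gps = factor(rep(c("A", "B"), each = 10))
)

# Hypothetical mutation: replace a row index by any other row index.
mutateIndex <- function(allele, context) {
  candidates <- setdiff(seq_len(nrow(context$dataset)), allele)
  candidates[sample.int(length(candidates), 1)]
}

# Hypothetical fitness: Mahalanobis distance between the two group
# centers, computed on the K selected features only.
groupDistance <- function(arow, context) {
  x <- t(context$dataset[arow, , drop = FALSE])  # samples by K features
  isA <- context$gps == levels(context$gps)[1]
  xa <- x[isA, , drop = FALSE]
  xb <- x[!isA, , drop = FALSE]
  # pooled within-group covariance (an assumption of this sketch)
  pooled <- ((nrow(xa) - 1) * cov(xa) + (nrow(xb) - 1) * cov(xb)) /
    (nrow(x) - 2)
  d <- colMeans(xa) - colMeans(xb)
  sqrt(sum(d * solve(pooled, d)))
}

individual <- c(3, 17, 29, 41)            # K = 4 row indices
fit <- groupDistance(individual, context)
newAllele <- mutateIndex(individual[1], context)
```

Larger values of fit indicate that the selected rows separate the two
groups more clearly, so a genetic algorithm using this fitness would
favor individuals whose K features discriminate well.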