Genoud is a function that combines evolutionary search
algorithms with derivative-based (Newton or quasi-Newton) methods to
solve difficult optimization problems. Genoud may also be
used for optimization problems for which derivatives do not exist.
Genoud, via the cluster option, supports the use of
multiple computers, CPUs or cores to perform parallel computations.genoud(fn, nvars, max=FALSE, pop.size=1000, max.generations=100, wait.generations=10,
hard.generation.limit=TRUE, starting.values=NULL, MemoryMatrix=TRUE,
Domains=NULL, default.domains=10, solution.tolerance=0.001,
gr=NULL, boundary.enforcement=0, lexical=FALSE, gradient.check=TRUE, BFGS=TRUE,
data.type.int=FALSE, hessian=FALSE, unif.seed=812821, int.seed=53058,
print.level=2, share.type=0, instance.number=0,
output.path="stdout", output.append=FALSE, project.path=NULL,
P1=50, P2=50, P3=50, P4=50, P5=50, P6=50, P7=50, P8=50, P9=0,
P9mix=NULL, BFGSburnin=0, BFGSfn=NULL, BFGShelp=NULL,
control=list(), optim.method=ifelse(boundary.enforcement < 2, "BFGS", "L-BFGS-B"),
transform=FALSE, debug=FALSE, cluster=FALSE, balance=FALSE, ...)TRUE). The first argument of the function must be the
vector of parameters over which minimizing is to
occur. The function must return a scalar result (unless
lexical=TTRUE) or Minimizing (FALSE). Determines
if genoud minimizes or maximizes the objective function.genoud uses to
solve the optimization problem. There are several restrictions on
what the value of this number can be. No matter what population
size the user requests, the numgenoud will run when attempting to optimize a function. This is a
soft limit. The maximum generation limit will be binding for
genoud onlgenoud will think that it has
found the optimum. If the
gradient.check trigger has been
turned on, genoud will onlymax.generations
variable is a binding constraint for genoud. If
hard.generation.limit is FALSE, then genoud may exceed
the max.genegenoud will use at startup. Using this option, the user
may insert one or more individuals into the starting population. If a
matrix is provided, the columns should be the vargenoud sets up a memory matrix. Such a
matrix ensures that genoud will request the fitness evaluation of a
given set of parameters only once. The variable may
be TRUE or Fnvars $\times 2$
matrix. For each variable, in the first column is the lower bound and
in the second column the upper bound. None of genoud's
starting population will be
generated outside of the boundsDomains matrix,
domains may nevertheless be set by the user with this easy to use
scalar option. Genoud will create a
Domains matrix by setting the lower bound for all of the pagenoud. Numbers within
solution.tolerance are considered to be equal. This is
particularly
important when it comes to evaluating wait.generations and
conducting tBFGS
optimizer. If it is NULL, numerical gradients will be used
instead.genoud obeys the
boundary constraints. Notwithstanding the value of the variable,
none of genoud's starting population values will be outside
of the bounds.
boundagenoud returns a list
with 7 objects. 8 objects are returned if the user has requested
the hessian to be calculated at the solution. Please see the
hessian option. The returned objects are:lexical optimization was requested, it is a vector.NA.genoud ran for.genoud found
the solution.genoud actually used.
See pop.size for why this value may differ from the
population size the user requested.genoud used. Please see the Operators Section for details.hessian flag), the hessian
at the solution will be returned. The user may use this matrix to calculate standard
errors.grgenoud never uses derivative
information. This implies that the BFGS quasi-Newton optimizer is
never used---i.e., the BFGS flag is set to FALSE. It
also implies
that Operator 9 (Local-Minimum Crossover) is set to zero and that
gradient checking (as a convergence criterion) is turned off. No
matter what other options have been set to,
data.type.int takes precedence---i.e., if genoud is told that
it is searching over an integer parameter space, gradient
information is never considered.
There is no option to mix integer and floating point parameters. If
one wants to mix the two, it is suggested that the user pick integer type
and in the objective function map a particular integer range into a
floating point number range. For example, tell genoud to search
from 0 to 100 and divide by 100 to obtain a search grid of 0 to 1.0
(by .1).
Alternatively, the user could use floating point numbers and round
the appropriate parameters to the nearest integer inside fn
before the criterion (or criteria if lexical = TRUE) is
evaluated. In that case, the transform option can be used to
create the next generation from the current generation when the
appropriate parameters are in the rounded state.genoud run, genoud will randomly create the necessary individuals. If
the project file contains a larger population than the current genoud
run, genoud will kill the necessary individuals using exponential
selection.
If the number of variables (see nvars)
reported in the project file is different from the current genoud run,
genoud does not use the project file (regardless of the value of
share.type) and genoud generates the necessary starting
population at random.genoud this variable is of limited
use. It is basically there in case a genoud run is being
used to optimize the result of another genoud run (i.e., a
recursive implementation).par <- myTransformation(par)
criter <- myObjective(par)
return( c(criter, par) )
This option is useful when parameter transformations are necessary
because the next generation of the population will be created from
the current generation in the transformed state, rather than the
original state. This option can be used by users to implement their
own operators.
There are some issues that should be kept in mind. This option cannot
be used when data.type.int = TRUE. Also, this option coerces
MemoryMatrix to be FALSE, implying that the cluster
option cannot be used. And, unless BFGSfn is specified, this option coerces
gradient.check to FALSE, BFGS to FALSE,
and P9 to 0. If BFGSfn is specified, that function should
perform the transformation but should only return a scalar fit criterion,
for example:
par <- myTransformation(par)
criter <- myCriterion(par)
return(criter)
Finally, if boundary.enforcement > 0, care must be taken to
assure that the transformed parameters are within the Domains,
otherwise unpredictable results could occur. In this case, the transformations are
checked for consistency with Domains but only in the initial generation
(to avoid an unacceptable loss in computational speed).c("localhost","musil","musil","deckard").
This
vector would create a cluster with four nodes: one on the
localhost another on "deckard" and two on the machine named
"musil". Two nodes on a given machine make sense if the machine
has two or more chips/cores. genoud will setup a SOCK
cluster by a call to makeSOCKcluster. This
will require the user to type in her password for each node as the
cluster is by default created via ssh. One can add on
usernames to the machine name if it differs from the current
shell: "username@musil". Other cluster types, such as PVM and
MPI, which do not require passwords can be created by directly
calling makeCluster, and then passing the
returned cluster object to genoud. For an example of how to
manually setup up a cluster with a direct call to
makeCluster see
Genoud has nine operators that it uses. The integer values which are
assigned to each of these operators (P1$\cdots$P9) are
weights.
Genoud calculates the sum of $s = P1+P2+\cdots+P9$. Each operator is
assigned a weight equal to $W_{n} = \frac{s}{P_{n}}$. The number of
times an operator is called usually equals $c_{n} = W_{n} \times
pop.size$.
Operators 6 (Simple Crossover) and 8 (Heuristic
Crossover) require an even number of individuals to work on---i.e.,
they require two parents. Therefore, the pop.size variable and
the operators sets must be such that these three operators have an
even number of individuals to work with. If this does not occur,
genoud automatically upwardly adjusts the population size to make this
constraint hold.
Strong uniqueness checks have been built into the operators to help
ensure that the operators produce offspring different from their
parents, but this does not always happen.
Note that genoud always keeps the best individual each generation.
genoud's 9 operators are:
Genoud solves problems that are nonlinear or
perhaps even discontinuous in the parameters of the function to be
optimized. When a statistical model's estimating function (for
example, a log-likelihood) is nonlinear in the model's parameters,
the function to be optimized will generally not be globally
concave and may have irregularities such as saddlepoints or
discontinuities. Optimization methods that rely on derivatives of
the objective function may be unable to find any optimum at all.
Multiple local optima may exist, so that there is no guarantee
that a derivative-based method will converge to the global
optimum. On the other hand, algorithms that do not use derivative
information (such as pure genetic algorithms) are for many
problems needlessly poor at local hill climbing. Most statistical
problems are regular in a neighborhood of the solution.
Therefore, for some portion of the search space, derivative
information is useful for such problems. Genoud also works
well for problems that no derivative information exists. For
additional documentation and examples please see
optim.#maximize the sin function
sin1 <- genoud(sin, nvars=1, max=TRUE)
#minimize the sin function
sin2 <- genoud(sin, nvars=1, max=FALSE)
#maximize a univariate normal mixture which looks like a claw
claw <- function(xx) {
x <- xx[1]
y <- (0.46*(dnorm(x,-1.0,2.0/3.0) + dnorm(x,1.0,2.0/3.0)) +
(1.0/300.0)*(dnorm(x,-0.5,.01) + dnorm(x,-1.0,.01) + dnorm(x,-1.5,.01)) +
(7.0/300.0)*(dnorm(x,0.5,.07) + dnorm(x,1.0,.07) + dnorm(x,1.5,.07)))
return(y)
}
claw1 <- genoud(claw, nvars=1,pop.size=3000,max=TRUE)
#Plot the previous run
xx <- seq(-3,3,.05)
plot(xx,lapply(xx,claw),type="l",xlab="Parameter",ylab="Fit",main="GENOUD: Maximize the Claw Density")
points(claw1$par,claw1$value,col="red")
# Maximize a bivariate normal mixture which looks like a claw.
biclaw <- function(xx) {
mNd2 <- function(x1, x2, mu1, mu2, sigma1, sigma2, rho)
{
z1 <- (x1-mu1)/sigma1
z2 <- (x2-mu2)/sigma2
w <- (1.0/(2.0*pi*sigma1*sigma2*sqrt(1-rho*rho)))
w <- w*exp(-0.5*(z1*z1 - 2*rho*z1*z2 + z2*z2)/(1-rho*rho))
return(w)
}
x1 <- xx[1]+1
x2 <- xx[2]+1
y <- (0.5*mNd2(x1,x2,0.0,0.0,1.0,1.0,0.0) +
0.1*(mNd2(x1,x2,-1.0,-1.0,0.1,0.1,0.0) +
mNd2(x1,x2,-0.5,-0.5,0.1,0.1,0.0) +
mNd2(x1,x2,0.0,0.0,0.1,0.1,0.0) +
mNd2(x1,x2,0.5,0.5,0.1,0.1,0.0) +
mNd2(x1,x2,1.0,1.0,0.1,0.1,0.0)))
return(y)
}
biclaw1 <- genoud(biclaw, default.domains=20, nvars=2,pop.size=5000,max=TRUE)
# For more examples see: http://sekhon.berkeley.edu/rgenoud/R.Run the code above in your browser using DataLab