Genoud is a function that combines evolutionary algorithm
methods with a derivative-based (quasi-Newton) method to solve
difficult optimization problems. Genoud may also be used for
optimization problems for which derivatives do not exist.
Genoud solves problems that are nonlinear or perhaps even
discontinuous in the parameters of the function to be optimized.
When a statistical model's estimating function (for example, a
log-likelihood) is nonlinear in the model's parameters, the function
to be optimized will generally not be globally concave and may have
irregularities such as saddlepoints or discontinuities.
Optimization methods that rely on derivatives of the objective
function may be unable to find any optimum at all. Multiple local
optima may exist, so that there is no guarantee that a
derivative-based method will converge to the global optimum. On the
other hand, algorithms that do not use derivative information (such
as pure genetic algorithms) are for many problems needlessly poor at
local hill climbing. Most statistical problems are regular in a
neighborhood of the solution. Therefore, for some portion of the
search space, derivative information is useful. Genoud, via
the cluster option, supports the use of multiple computers,
CPUs or cores to perform parallel computations.

Usage

genoud(fn, nvars, max=FALSE, pop.size=1000, max.generations=100, wait.generations=10,
       hard.generation.limit=TRUE, starting.values=NULL, MemoryMatrix=TRUE,
       Domains=NULL, default.domains=10, solution.tolerance=0.001,
       gr=NULL, boundary.enforcement=0, lexical=FALSE, gradient.check=TRUE, BFGS=TRUE,
       data.type.int=FALSE, hessian=FALSE, unif.seed=812821, int.seed=53058,
       print.level=2, share.type=0, instance.number=0,
       output.path="stdout", output.append=FALSE, project.path=NULL,
       P1=50, P2=50, P3=50, P4=50, P5=50, P6=50, P7=50, P8=50, P9=0,
       P9mix=NULL, BFGSburnin=0, BFGSfn=NULL, BFGShelp=NULL,
       cluster=FALSE, balance=FALSE, debug=FALSE, control=list(), ...)

Arguments

fn: The function to be minimized (or maximized if max=TRUE). The
first argument of the function must be the vector of parameters over
which minimizing is to occur. The function must return a scalar
result (unless lexical=TRUE).

nvars: The number of parameters to be selected for the function to
be minimized (or maximized).

max: Maximization (TRUE) or minimization (FALSE). Determines
whether genoud minimizes or maximizes the objective function.

pop.size: Population size. This is the number of individuals genoud
uses to solve the optimization problem. There are several
restrictions on what the value of this number can be. No matter what
population size the user requests, the number is automatically
adjusted to satisfy the constraints imposed by the evolutionary
operators.

max.generations: Maximum generations. This is the maximum number of
generations that genoud will run when attempting to optimize a
function. This is a soft limit. The maximum generation limit will be
binding for genoud only if hard.generation.limit is TRUE.

wait.generations: If there is no improvement in the objective
function in this number of generations, genoud will think that it has
found the optimum. If the gradient.check trigger has been turned on,
genoud will only start counting wait.generations once the gradients
are within solution.tolerance of zero.

hard.generation.limit: This logical variable determines if the
max.generations variable is a binding constraint for genoud. If
hard.generation.limit is FALSE, then genoud may exceed the
max.generations count.

starting.values: A vector or matrix of parameter values that genoud
will use at startup. Using this option, the user may insert one or
more individuals into the starting population. If a matrix is
provided, the columns should be the variables and the rows the
individuals.

MemoryMatrix: This variable controls whether genoud sets up a memory
matrix. Such a matrix ensures that genoud will request the fitness
evaluation of a given set of parameters only once. The variable may
be TRUE or FALSE.

Domains: This is a nvars $\times$ 2 matrix. For each variable, the
first column holds the lower bound and the second column the upper
bound. None of genoud's starting population will be generated
outside of the bounds.

default.domains: If the user does not provide a Domains matrix,
domains may nevertheless be set with this easy-to-use scalar option.
Genoud will create a Domains matrix by setting the lower bound for
all of the parameters equal to -1 $\times$ default.domains and the
upper bound equal to default.domains.

solution.tolerance: This is the tolerance level used by genoud.
Numbers within solution.tolerance are considered to be equal. This
is particularly important when it comes to evaluating
wait.generations and conducting the gradient.check.

gr: A function to provide the gradient for the BFGS optimizer. If
it is NULL, numerical gradients will be used instead.

boundary.enforcement: This variable determines the degree to which
genoud obeys the boundary constraints. Notwithstanding the value of
the variable, none of genoud's starting population values will be
outside of the bounds.
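As an illustrative sketch of how the Domains matrix and the
boundary.enforcement option described above fit together (the
objective function here is invented for illustration, and this
assumes the rgenoud package is installed and loaded):

```r
library(rgenoud)

# Hypothetical objective: a simple quadratic minimized at (0.5, -0.25)
quad <- function(p) sum((p - c(0.5, -0.25))^2)

# nvars x 2 matrix: first column lower bounds, second column upper bounds
dom <- matrix(c(-1, -1,   # lower bounds for both parameters
                 1,  1),  # upper bounds for both parameters
              nrow = 2)

# A positive boundary.enforcement value keeps trial points inside the bounds
fit <- genoud(quad, nvars = 2, Domains = dom, boundary.enforcement = 2,
              pop.size = 200, print.level = 0)
fit$par   # the solution, which lies within the bounds
```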
boundary.enforcement has three possible values: 0 (anything goes), 1
(partial enforcement), and 2 (no trespassing).

Value

genoud returns a list with 7 objects. 8 objects are returned if the
user has requested the hessian to be calculated at the
solution. Please see the hessian option. The returned
objects are:

value: The fitness value at the solution. If lexical optimization
was requested, it is a vector.

par: The parameter values found at the solution.

gradients: The gradients at the solution. If no gradients were
calculated, they are reported to be NA.

generations: The number of generations genoud ran for.

peakgeneration: The generation in which genoud found the solution.

pop.size: The population size that genoud actually used.
See pop.size for why this value may differ from the
population size the user requested.

operators: The actual number of operators of each type that genoud
used. Please see the Operators Section for details.

hessian: If the hessian was requested (see the hessian flag), the
hessian at the solution will be returned. The user may use this
matrix to calculate standard errors.

Details

If the data.type.int flag is set to TRUE, genoud searches over an
integer parameter space and never uses derivative
information. This implies that the BFGS quasi-Newton optimizer is
never used---i.e., the BFGS flag is set to FALSE. It
also implies
that Operator 9 (Local-Minimum Crossover) is set to zero and that
gradient checking (as a convergence criterion) is turned off. No
matter what other options have been set to,
data.type.int takes precedence---i.e., if genoud is told that
it is searching over an integer parameter space, gradient
information is never considered.
There is no option to mix integer and floating point parameters. If
one wants to mix the two, it is suggested that the user pick integer type
and in the objective function map a particular integer range into a
floating point number range. For example, tell genoud to search
from 0 to 100 and divide by 100 to obtain a search grid of 0 to 1.0
(by .01).
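A minimal sketch of this integer-to-floating-point mapping (the
objective function and its minimizer are invented for illustration):

```r
library(rgenoud)

# genoud searches over the integers 0..100; the objective maps them
# onto the grid 0, 0.01, ..., 1.0 before evaluating.
obj <- function(p) {
  x <- p[1] / 100              # map integer 0..100 to floating point 0..1
  (x - 0.62)^2                 # illustrative criterion, minimized near x = 0.62
}

dom <- matrix(c(0, 100), nrow = 1)  # lower and upper bound for the one parameter
fit <- genoud(obj, nvars = 1, data.type.int = TRUE, Domains = dom,
              pop.size = 100, print.level = 0)
```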
Alternatively, one could use lexical optimization with floating point
parameters and specify, for example, that the first criterion is a binary
variable that equals 1.0 if and only if the parameters that should be
``integers'' are all sufficiently close to an integer so that rounding them to
the nearest integer is tolerable for the purposes of calculating
subsequent criteria. See also BFGSfn.

If the project file contains a smaller population than the current
genoud run, genoud will randomly create the necessary individuals. If
the project file contains a larger population than the current genoud
run, genoud will kill the necessary individuals using exponential
selection.
If the number of variables (see nvars)
reported in the project file is different from the current genoud run,
genoud does not use the project file (regardless of the value of
share.type) and genoud generates the necessary starting
population at random.

For an ordinary genoud run, the instance.number variable is of
limited use. It is basically there in case a genoud run is being used
to optimize the result of another genoud run (i.e., a recursive
implementation).

The cluster option may be either a cluster object or a vector of
machine names, for example c("localhost","musil","musil","deckard").
This vector would create a cluster with four nodes: one on the
localhost, another on "deckard", and two on the machine named
"musil". Two nodes on a given machine make sense if the machine has two or more
chips/cores. genoud will set up a SOCK cluster by a call to
makeSOCKcluster. This will require the user
to type in her password for each node as the cluster is by default
created via ssh. One can add on usernames to the machine
name if it differs from the current shell: "username@musil". Other
cluster types, such as PVM and MPI,
which do not require passwords can be created by directly calling
makeCluster, and then passing the returned
cluster object to genoud. For an example of how to manually set up
a cluster with a direct call to makeCluster, see the makeCluster
help page.
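A sketch of the manual setup just described, assuming the parallel
package (which provides makeCluster) and two local cores, so no
password prompts are needed:

```r
library(rgenoud)
library(parallel)

# Create a socket cluster on two local cores, then hand the cluster
# object to genoud via the cluster argument.
cl <- makeCluster(2)
fit <- genoud(sin, nvars = 1, max = TRUE, cluster = cl, print.level = 0)
stopCluster(cl)
```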
Operators

Genoud uses nine operators. The integer values which are
assigned to each of these operators (P1$\cdots$P9) are
weights.
Genoud calculates the sum $s = P1+P2+\cdots+P9$. Each operator is
assigned a weight equal to $W_{n} = \frac{P_{n}}{s}$. The number of
times an operator is called usually equals $c_{n} = W_{n} \times
pop.size$.
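The weight arithmetic above can be checked directly, assuming each
weight is that operator's share of the total; with the default
operator values (P1 through P8 equal to 50, P9 equal to 0) and
pop.size = 1000:

```r
# Default operator values: P1..P8 = 50, P9 = 0
P <- c(50, 50, 50, 50, 50, 50, 50, 50, 0)

s <- sum(P)         # s = 400
W <- P / s          # each operator's share of the total weight
calls <- W * 1000   # approximate number of calls per operator at pop.size = 1000

# For operators 1..8 this gives 0.125 * 1000 = 125 calls each
```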
Operators 6 (Simple Crossover) and 8 (Heuristic
Crossover) require an even number of individuals to work on---i.e.,
they require two parents. Therefore, the pop.size variable and
the operator set must be such that these two operators have an
even number of individuals to work with. If this does not occur,
genoud automatically upwardly adjusts the population size to make this
constraint hold.
Strong uniqueness checks have been built into the operators to help
ensure that the operators produce offspring different from their
parents, but this does not always happen.
Note that genoud always keeps the best individual each generation.
genoud's 9 operators are: (1) Cloning, (2) Uniform Mutation, (3)
Boundary Mutation, (4) Non-Uniform Mutation, (5) Polytope Crossover,
(6) Simple Crossover, (7) Whole Non-Uniform Mutation, (8) Heuristic
Crossover, and (9) Local-Minimum Crossover.

See also optim.

Examples

library(rgenoud)

# maximize the sin function
sin1 <- genoud(sin, nvars=1, max=TRUE)
#minimize the sin function
sin2 <- genoud(sin, nvars=1, max=FALSE)
#maximize a univariate normal mixture which looks like a claw
claw <- function(xx) {
x <- xx[1]
y <- (0.46*(dnorm(x,-1.0,2.0/3.0) + dnorm(x,1.0,2.0/3.0)) +
(1.0/300.0)*(dnorm(x,-0.5,.01) + dnorm(x,-1.0,.01) + dnorm(x,-1.5,.01)) +
(7.0/300.0)*(dnorm(x,0.5,.07) + dnorm(x,1.0,.07) + dnorm(x,1.5,.07)))
return(y)
}
claw1 <- genoud(claw, nvars=1,pop.size=3000,max=TRUE)
#Plot the previous run
xx <- seq(-3,3,.05)
plot(xx, sapply(xx, claw), type="l", xlab="Parameter", ylab="Fit",
     main="GENOUD: Maximize the Claw Density")
points(claw1$par,claw1$value,col="red")
# Maximize a bivariate normal mixture which looks like a claw.
biclaw <- function(xx) {
mNd2 <- function(x1, x2, mu1, mu2, sigma1, sigma2, rho)
{
z1 <- (x1-mu1)/sigma1
z2 <- (x2-mu2)/sigma2
w <- (1.0/(2.0*pi*sigma1*sigma2*sqrt(1-rho*rho)))
w <- w*exp(-0.5*(z1*z1 - 2*rho*z1*z2 + z2*z2)/(1-rho*rho))
return(w)
}
x1 <- xx[1]+1
x2 <- xx[2]+1
y <- (0.5*mNd2(x1,x2,0.0,0.0,1.0,1.0,0.0) +
0.1*(mNd2(x1,x2,-1.0,-1.0,0.1,0.1,0.0) +
mNd2(x1,x2,-0.5,-0.5,0.1,0.1,0.0) +
mNd2(x1,x2,0.0,0.0,0.1,0.1,0.0) +
mNd2(x1,x2,0.5,0.5,0.1,0.1,0.0) +
mNd2(x1,x2,1.0,1.0,0.1,0.1,0.0)))
return(y)
}
biclaw1 <- genoud(biclaw, default.domains=20, nvars=2,pop.size=5000,max=TRUE)
# For more examples see: http://sekhon.berkeley.edu/rgenoud/R