Genoud
is a function that combines evolutionary algorithm
methods with a derivative-based (quasi-Newton) method to solve
difficult optimization problems. Genoud
may also be used for
optimization problems for which derivatives do not exist.
Genoud
solves problems that are nonlinear or perhaps even
discontinuous in the parameters of the function to be optimized.
When a statistical model's estimating function (for example, a
log-likelihood) is nonlinear in the model's parameters, the function
to be optimized will generally not be globally concave and may have
irregularities such as saddlepoints or discontinuities.
Optimization methods that rely on derivatives of the objective
function may be unable to find any optimum at all. Multiple local
optima may exist, so that there is no guarantee that a
derivative-based method will converge to the global optimum. On the
other hand, algorithms that do not use derivative information (such
as pure genetic algorithms) are for many problems needlessly poor at
local hill climbing. Most statistical problems are regular in a
neighborhood of the solution. Therefore, for some portion of the
search space, derivative information is useful. Genoud, via the cluster option, supports the use of multiple computers, CPUs or cores to perform parallel computations.

Usage

genoud(fn, nvars, max=FALSE, pop.size=1000, max.generations=100, wait.generations=10,
       hard.generation.limit=TRUE, starting.values=NULL, MemoryMatrix=TRUE,
       Domains=NULL, default.domains=10, solution.tolerance=0.001,
       gr=NULL, boundary.enforcement=0, lexical=FALSE, gradient.check=TRUE, BFGS=TRUE,
       data.type.int=FALSE, hessian=FALSE, unif.seed=812821, int.seed=53058,
       print.level=2, share.type=0, instance.number=0,
       output.path="stdout", output.append=FALSE, project.path=NULL,
       P1=50, P2=50, P3=50, P4=50, P5=50, P6=50, P7=50, P8=50, P9=0,
       cluster=FALSE, balance=FALSE, debug=FALSE, ...)
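As a minimal sketch of calling genoud with the defaults above (the sphere objective and the rgenoud:: namespace guard are illustrative additions, not part of this page):

```r
# Minimal illustrative call: minimize a simple quadratic with default settings.
# The objective's first argument is the parameter vector; it returns a scalar.
sphere <- function(p) sum((p - 3)^2)   # global minimum at p = c(3, 3)

# Guarded so the sketch also runs where rgenoud is not installed.
if (requireNamespace("rgenoud", quietly = TRUE)) {
  out <- rgenoud::genoud(sphere, nvars = 2, print.level = 0)
  out$par   # should be close to c(3, 3)
}
```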
Arguments

fn: The function to be minimized (or maximized if max=TRUE). The first argument of the function must be the vector of parameters over which minimizing is to occur. The function must return a scalar result. For example, to maximize the sin() function one may simply call genoud(sin, nvars=1, max=TRUE).

max: Maximizing (TRUE) or minimizing (FALSE). Determines if genoud minimizes or maximizes the objective function.

pop.size: Population size. This is the number of individuals genoud uses to solve the optimization problem. There are several restrictions on what the value of this number can be. No matter what population size the user requests, the number actually used is automatically adjusted to satisfy the constraints imposed by the operators (see the Operators Section).

max.generations: Maximum generations. This is the maximum number of generations that genoud will run when attempting to optimize a function. This is a soft limit. The maximum generation limit will be binding for genoud only if hard.generation.limit is TRUE.

wait.generations: If there is no improvement in the objective function in this number of generations, genoud will think that it has found the optimum. If the gradient.check trigger has been turned on, genoud will only start counting wait.generations once the gradients are within solution.tolerance of zero.

hard.generation.limit: This logical variable determines if the max.generations variable is a binding constraint for genoud. If hard.generation.limit is FALSE, then genoud may exceed the max.generations count if the objective function is still improving or the gradients are not within solution.tolerance of zero.
starting.values: A vector of parameter values that genoud will use at startup. The starting.values vector is a way for the user to insert one individual into the starting population; genoud will randomly create the other individuals.

MemoryMatrix: This variable controls whether genoud sets up a memory matrix. Such a matrix ensures that genoud will request the fitness evaluation of a given set of parameters only once. The variable may be TRUE or FALSE.
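The memory matrix is, in effect, memoization of fitness evaluations. A rough pure-R sketch of the idea (genoud's own bookkeeping is internal; memo_fn and evals are hypothetical names used only here):

```r
# Memoization sketch: evaluate each parameter vector at most once.
evals <- new.env()  # cache keyed by the stringified parameter vector

memo_fn <- function(p) {
  key <- paste(p, collapse = ",")
  if (!exists(key, envir = evals, inherits = FALSE)) {
    # Only compute the fitness (here, a toy quadratic) on a cache miss.
    assign(key, sum(p^2), envir = evals)
  }
  get(key, envir = evals)
}

memo_fn(c(1, 2))   # computed
memo_fn(c(1, 2))   # served from the cache; no second evaluation
length(ls(evals))  # one distinct vector seen, so one cache entry
```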
Domains: This is a nvars $\times 2$ matrix. For each variable, in the first column is the lower bound and in the second column the upper bound. None of genoud's starting population will be generated outside of the bounds.

default.domains: If the user does not want to provide a Domains matrix, domains may nevertheless be set by the user with this easy to use scalar option. Genoud will create a Domains matrix by setting the lower bound for all of the parameters equal to $-1 \times$ default.domains and the upper bound equal to default.domains.

solution.tolerance: This is the tolerance level used by genoud. Numbers within solution.tolerance are considered to be equal. This is particularly important when it comes to evaluating wait.generations and conducting the gradient.check.

gr: A function to provide the gradient for the BFGS optimizer. If it is NULL, numerical gradients will be used instead.

boundary.enforcement: This variable determines the degree to which genoud obeys the boundary constraints. Notwithstanding the value of the variable, none of genoud's starting population values will be outside of the bounds. But some of the operators may generate individuals outside of the bounds unless stricter enforcement is requested.

Value

genoud returns a list with 7 objects. 8 objects are returned if the user has requested the hessian to be calculated at the solution. Please see the hessian option. The returned objects are:

value: The value of the objective function at the solution. If lexical optimization was requested, it is a vector.

par: The parameter vector at the solution.

gradients: The gradients at the solution. If gradients were not calculated, this is NA.

generations: The number of generations genoud ran for.

peakgeneration: The generation in which genoud found the solution.

pop.size: The population size genoud actually used. See pop.size for why this value may differ from the population size the user requested.

operators: The vector of operators genoud used. Please see the Operators Section for details.

hessian: If the user has requested it (via the hessian flag), the hessian at the solution will be returned. The user may use this matrix to calculate standard errors.
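Putting the Domains argument and the returned list together, a hedged sketch (the quadratic objective and the [-2, 2] bounds are made up for illustration):

```r
# A nvars x 2 Domains matrix: column 1 = lower bounds, column 2 = upper bounds.
nvars <- 3
D <- cbind(rep(-2, nvars), rep(2, nvars))

if (requireNamespace("rgenoud", quietly = TRUE)) {
  fit <- rgenoud::genoud(function(p) sum(p^2), nvars = nvars,
                         Domains = D, print.level = 0)
  fit$par          # parameter vector at the solution
  fit$value        # objective value at the solution
  fit$generations  # how many generations the search ran
}
```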
data.type.int: This option sets the data type of the parameters of the function to be optimized. If TRUE, genoud will search over integers when it optimizes the parameters. With integer parameters, genoud never uses derivative information. This implies that the BFGS quasi-Newton optimizer is never used---i.e., the BFGS flag is set to FALSE. It also implies that Operator 9 (Local-Minimum Crossover) is set to zero and that gradient checking (as a convergence criterion) is turned off. No matter what other options have been set to, data.type.int takes precedence---i.e., if genoud is told that it is searching over an integer parameter space, gradient information is never considered.
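A sketch of an integer search, including the divide-to-get-a-grid trick discussed in this section (the objective obj and its target value are hypothetical):

```r
# Map the integer 0..100 onto the grid 0, 0.01, ..., 1.0 inside the objective,
# so a floating-point quantity can be searched with data.type.int=TRUE.
obj <- function(p) {
  x <- p[1] / 100   # integer parameter -> floating-point grid value
  (x - 0.25)^2      # minimized at x = 0.25, i.e. p[1] = 25
}

if (requireNamespace("rgenoud", quietly = TRUE)) {
  D <- matrix(c(0, 100), nrow = 1)  # integer bounds for the single parameter
  rgenoud::genoud(obj, nvars = 1, data.type.int = TRUE, Domains = D,
                  print.level = 0)$par
}
```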
There is no option to mix integer and floating point parameters. If
one wants
to mix the two, it is suggested that the user pick integer type and in
the objective function map a particular integer range into a
floating point number range. For example, tell genoud
to search
from 0 to 100 and divide by 100 to obtain a search grid of 0 to 1.0 (by .01).

share.type: If share.type is equal to 1, then genoud, at startup, checks to see if a project file exists (see project.path); if it does, genoud initializes its starting population from the stored population. If the project file contains a smaller population than the current genoud run, genoud will randomly create the necessary individuals. If
the project file contains a larger population than the current genoud
run, genoud
will kill the necessary individuals using exponential
selection.
If the number of variables (see nvars
)
reported in the project file is different from the current genoud
run,
genoud
does not use the project file (regardless of the value of
share.type
) and genoud
generates the necessary starting
population at random.

instance.number: For a typical genoud run this variable is of limited use. It is basically there in case a genoud run is being used to optimize the result of another genoud run (i.e., a recursive implementation).

cluster: This can either be an object of the 'cluster' class returned by one of the makeCluster commands in the snow package, or a vector of machine names so that genoud can set up the cluster automatically, for example c("localhost","musil","musil","deckard").
This vector would create a cluster with four nodes: one on the localhost, another on "deckard", and two on the machine named "musil". Two nodes on a given machine make sense if the machine has two or more chips/cores. genoud will set up a SOCK cluster by a call to makeSOCKcluster. This will require the user to type in her password for each node, as the cluster is by default created via ssh. One can add a username to the machine name if it differs from the current shell: "username@musil". Other cluster types, such as PVM and MPI, which do not require passwords, can be created by directly calling makeCluster and then passing the returned cluster object to genoud. For an example of how to manually set up a cluster with a direct call to makeCluster, see the makeCluster documentation.
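A hedged sketch of the manual-cluster route, using the parallel package's makeCluster (the modern equivalent of the snow call named above; two local worker processes stand in for remote machines):

```r
# Create a cluster by hand and pass it to genoud via the cluster argument.
# Local workers are used here; real machine names would need ssh access.
if (requireNamespace("parallel", quietly = TRUE) &&
    requireNamespace("rgenoud", quietly = TRUE)) {
  cl <- parallel::makeCluster(2)   # two local worker processes
  fit <- rgenoud::genoud(function(p) sum(p^2), nvars = 2,
                         cluster = cl, print.level = 0)
  parallel::stopCluster(cl)        # always release the workers
  fit$par
}
```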
Operators

Genoud has nine operators that it uses. The integer values which are assigned to each of these operators (P1 $\cdots$ P9) are weights. Genoud calculates the sum $s = P1+P2+\cdots+P9$. Each operator is assigned a weight equal to $W_{n} = \frac{P_{n}}{s}$. The number of times an operator is called usually equals $c_{n} = W_{n} \times pop.size$.
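The weighting arithmetic can be made concrete with the default rates (this is bookkeeping genoud performs internally; the R below only illustrates the formula):

```r
# Default operator rates: P1..P8 = 50, P9 = 0 (see the usage signature above).
P <- c(50, 50, 50, 50, 50, 50, 50, 50, 0)
s <- sum(P)            # s = 400
W <- P / s             # operator weights W_n = P_n / s; they sum to 1
pop.size <- 1000
calls <- W * pop.size  # approximate calls per operator: c_n = W_n * pop.size
calls[1]               # 50/400 * 1000 = 125 calls for Operator 1
```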
Operators 6 (Simple Crossover) and 8 (Heuristic
Crossover) require an even number of individuals to work on---i.e.,
they require two parents. Therefore, the pop.size variable and the operator weights must be such that these two operators have an even number of individuals to work with. If this does not occur,
genoud
automatically upwardly adjusts the population size to make this
constraint hold.
Strong uniqueness checks have been built into the operators to help
ensure that the operators produce offspring different from their
parents, but this does not always happen.
Note that genoud
always keeps the best individual each generation.
genoud's 9 operators are:

1. Cloning
2. Uniform Mutation
3. Boundary Mutation
4. Non-Uniform Mutation
5. Polytope Crossover
6. Simple Crossover
7. Whole Non-Uniform Mutation
8. Heuristic Crossover
9. Local-Minimum Crossover

See Also

optim

Examples

#maximize the sin function
sin1 <- genoud(sin, nvars=1, max=TRUE)
#minimize the sin function
sin2 <- genoud(sin, nvars=1, max=FALSE)
#maximize a univariate normal mixture which looks like a claw
claw <- function(xx) {
x <- xx[1]
y <- (0.46*(dnorm(x,-1.0,2.0/3.0) + dnorm(x,1.0,2.0/3.0)) +
(1.0/300.0)*(dnorm(x,-0.5,.01) + dnorm(x,-1.0,.01) + dnorm(x,-1.5,.01)) +
(7.0/300.0)*(dnorm(x,0.5,.07) + dnorm(x,1.0,.07) + dnorm(x,1.5,.07)))
return(y)
}
claw1 <- genoud(claw, nvars=1,pop.size=3000,max=TRUE)
#Plot the previous run
xx <- seq(-3,3,.05)
plot(xx,sapply(xx,claw),type="l",xlab="Parameter",ylab="Fit",main="GENOUD: Maximize the Claw Density")
points(claw1$par,claw1$value,col="red")
# Maximize a bivariate normal mixture which looks like a claw.
biclaw <- function(xx) {
mNd2 <- function(x1, x2, mu1, mu2, sigma1, sigma2, rho)
{
z1 <- (x1-mu1)/sigma1
z2 <- (x2-mu2)/sigma2
w <- (1.0/(2.0*pi*sigma1*sigma2*sqrt(1-rho*rho)))
w <- w*exp(-0.5*(z1*z1 - 2*rho*z1*z2 + z2*z2)/(1-rho*rho))
return(w)
}
x1 <- xx[1]+1
x2 <- xx[2]+1
y <- (0.5*mNd2(x1,x2,0.0,0.0,1.0,1.0,0.0) +
0.1*(mNd2(x1,x2,-1.0,-1.0,0.1,0.1,0.0) +
mNd2(x1,x2,-0.5,-0.5,0.1,0.1,0.0) +
mNd2(x1,x2,0.0,0.0,0.1,0.1,0.0) +
mNd2(x1,x2,0.5,0.5,0.1,0.1,0.0) +
mNd2(x1,x2,1.0,1.0,0.1,0.1,0.0)))
return(y)
}
biclaw1 <- genoud(biclaw, default.domains=20, nvars=2,pop.size=5000,max=TRUE)
# For more examples see: http://sekhon.berkeley.edu/rgenoud/R.