Genoud is a function that combines evolutionary algorithm
methods with a derivative-based (quasi-Newton) method to solve
difficult optimization problems. Genoud may also be used for
optimization problems for which derivatives do not exist.
Genoud solves problems that are nonlinear or perhaps even
discontinuous in the parameters of the function to be optimized.
When a statistical model's estimating function (for example, a
log-likelihood) is nonlinear in the model's parameters, the function
to be optimized will generally not be globally concave and may have
irregularities such as saddlepoints or discontinuities.
Optimization methods that rely on derivatives of the objective
function may be unable to find any optimum at all. Multiple local
optima may exist, so that there is no guarantee that a
derivative-based method will converge to the global optimum. On the
other hand, algorithms that do not use derivative information (such
as pure genetic algorithms) are for many problems needlessly poor at
local hill climbing. Most statistical problems are regular in a
neighborhood of the solution. Therefore, for some portion of the
search space, derivative information is useful. Genoud, via
the cluster option, supports the use of multiple computers,
CPUs or cores to perform parallel computations.

Usage

genoud(fn, nvars, max=FALSE, pop.size=1000, max.generations=100, wait.generations=10,
       hard.generation.limit=TRUE, starting.values=NULL, MemoryMatrix=TRUE,
       Domains=NULL, default.domains=10, solution.tolerance=0.001,
       gr=NULL, boundary.enforcement=0, lexical=FALSE, gradient.check=TRUE, BFGS=TRUE,
       data.type.int=FALSE, hessian=FALSE, unif.seed=812821, int.seed=53058,
       print.level=2, share.type=0, instance.number=0,
       output.path="stdout", output.append=FALSE, project.path=NULL,
       P1=50, P2=50, P3=50, P4=50, P5=50, P6=50, P7=50, P8=50, P9=0,
       P9mix=NULL, BFGSburnin=0, BFGSfn=NULL, BFGShelp=NULL,
       cluster=FALSE, balance=FALSE, debug=FALSE, control=list(), ...)

Arguments

fn: The function to be minimized (or maximized if max=TRUE). The
first argument of the function must be the vector of parameters over
which minimizing is to occur. The function must return a scalar
result (unless lexical=TRUE).

nvars: The number of parameters to be selected for the function to
be minimized (or maximized).

max: Maximization (TRUE) or minimization (FALSE). Determines
whether genoud minimizes or maximizes the objective function.

pop.size: Population size. This is the number of individuals genoud
uses to solve the optimization problem. There are several
restrictions on what the value of this number can be. No matter what
population size the user requests, the number is automatically
adjusted to satisfy the constraints imposed by the evolutionary
operators.

max.generations: Maximum generations. This is the maximum number of
generations that genoud will run when attempting to optimize a
function. This is a soft limit. The maximum generation limit will be
binding for genoud only if hard.generation.limit is TRUE.

wait.generations: If there is no improvement in the objective
function in this number of generations, genoud will think that it has
found the optimum. If the gradient.check trigger has been turned on,
genoud will only start counting wait.generations once the gradients
are within solution.tolerance of zero.

hard.generation.limit: This logical variable determines if the
max.generations variable is a binding constraint for genoud. If
hard.generation.limit is FALSE, then genoud may exceed the
max.generations count.

starting.values: A vector or matrix of parameter values that genoud
will use at startup. Using this option, the user may insert one or
more individuals into the starting population. If a matrix is
provided, the columns should be the variables and the rows the
individuals.

MemoryMatrix: This variable controls whether genoud sets up a memory
matrix. Such a matrix ensures that genoud will request the fitness
evaluation of a given set of parameters only once. The variable may
be TRUE or FALSE.

Domains: This is a nvars $\times$ 2 matrix. For each variable, the
first column holds the lower bound and the second column the upper
bound. None of genoud's starting population will be generated
outside of the bounds.

default.domains: If the user does not provide a Domains matrix,
domains may nevertheless be set with this easy-to-use scalar option.
Genoud will create a Domains matrix by setting the lower bound for
all of the parameters equal to -1 $\times$ default.domains and the
upper bound equal to default.domains.

solution.tolerance: This is the tolerance level used by genoud.
Numbers within solution.tolerance are considered to be equal. This
is particularly important when it comes to evaluating
wait.generations and conducting the gradient.check.

gr: A function to provide the gradient for the BFGS optimizer. If
it is NULL, numerical gradients will be used instead.

boundary.enforcement: This variable determines the degree to which
genoud obeys the boundary constraints. Notwithstanding the value of
the variable, none of genoud's starting population values will be
outside of the bounds.
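As an illustrative sketch of how the Domains matrix and the
boundary.enforcement option described above fit together (the
objective function here is invented for illustration, and this
assumes the rgenoud package is installed and loaded):

```r
library(rgenoud)

# Hypothetical objective: a simple quadratic minimized at (0.5, -0.25)
quad <- function(p) sum((p - c(0.5, -0.25))^2)

# nvars x 2 matrix: first column lower bounds, second column upper bounds
dom <- matrix(c(-1, -1,   # lower bounds for both parameters
                 1,  1),  # upper bounds for both parameters
              nrow = 2)

# A positive boundary.enforcement value keeps trial points inside the bounds
fit <- genoud(quad, nvars = 2, Domains = dom, boundary.enforcement = 2,
              pop.size = 200, print.level = 0)
fit$par   # the solution, which lies within the bounds
```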
boundary.enforcement has three possible values: 0 (anything goes), 1
(partial enforcement), and 2 (no trespassing).

Value

genoud returns a list with 7 objects. 8 objects are returned if the
user has requested the hessian to be calculated at the
solution. Please see the hessian option. The returned
objects are:

value: The fitness value at the solution. If lexical optimization
was requested, it is a vector.

par: The parameter values found at the solution.

gradients: The gradients at the solution. If no gradients were
calculated, they are reported to be NA.

generations: The number of generations genoud ran for.

peakgeneration: The generation in which genoud found the solution.

pop.size: The population size that genoud actually used.
See pop.size for why this value may differ from the
population size the user requested.

operators: The actual number of operators of each type that genoud
used. Please see the Operators Section for details.

hessian: If the hessian was requested (see the hessian flag), the
hessian at the solution will be returned. The user may use this
matrix to calculate standard errors.

Details

If the data.type.int flag is set to TRUE, genoud searches over an
integer parameter space and never uses derivative
information. This implies that the BFGS quasi-Newton optimizer is
never used---i.e., the BFGS flag is set to FALSE. It
also implies
that Operator 9 (Local-Minimum Crossover) is set to zero and that
gradient checking (as a convergence criterion) is turned off. No
matter what other options have been set to,
data.type.int takes precedence---i.e., if genoud is told that
it is searching over an integer parameter space, gradient
information is never considered.
There is no option to mix integer and floating point parameters. If
one wants to mix the two, it is suggested that the user pick integer type
and in the objective function map a particular integer range into a
floating point number range. For example, tell genoud to search
from 0 to 100 and divide by 100 to obtain a search grid of 0 to 1.0
(by .01).
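A minimal sketch of this integer-to-floating-point mapping (the
objective function and its minimizer are invented for illustration):

```r
library(rgenoud)

# genoud searches over the integers 0..100; the objective maps them
# onto the grid 0, 0.01, ..., 1.0 before evaluating.
obj <- function(p) {
  x <- p[1] / 100              # map integer 0..100 to floating point 0..1
  (x - 0.62)^2                 # illustrative criterion, minimized near x = 0.62
}

dom <- matrix(c(0, 100), nrow = 1)  # lower and upper bound for the one parameter
fit <- genoud(obj, nvars = 1, data.type.int = TRUE, Domains = dom,
              pop.size = 100, print.level = 0)
```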
Alternatively, one could use lexical optimization with floating point
parameters and specify, for example, that the first criterion is a binary
variable that equals 1.0 if and only if the parameters that should be
``integers'' are all sufficiently close to an integer so that rounding them to
the nearest integer is tolerable for the purposes of calculating
subsequent criteria. See also BFGSfn.

If the project file contains a smaller population than the current
genoud run, genoud will randomly create the necessary individuals. If
the project file contains a larger population than the current genoud
run, genoud will kill the necessary individuals using exponential
selection.
If the number of variables (see nvars)
reported in the project file is different from the current genoud run,
genoud does not use the project file (regardless of the value of
share.type) and genoud generates the necessary starting
population at random.

For an ordinary genoud run, the instance.number variable is of
limited use. It is basically there in case a genoud run is being used
to optimize the result of another genoud run (i.e., a recursive
implementation).

The cluster option may be either a cluster object or a vector of
machine names, for example c("localhost","musil","musil","deckard").
This vector would create a cluster with four nodes: one on the
localhost, another on "deckard", and two on the machine named
"musil". Two nodes on a given machine make sense if the machine has two or more
chips/cores. genoud will set up a SOCK cluster by a call to
makeSOCKcluster. This will require the user
to type in her password for each node as the cluster is by default
created via ssh. One can add on usernames to the machine
name if it differs from the current shell: "username@musil". Other
cluster types, such as PVM and MPI,
which do not require passwords can be created by directly calling
makeCluster, and then passing the returned
cluster object to genoud. For an example of how to manually set up
a cluster with a direct call to makeCluster, see the makeCluster
help page.
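A sketch of the manual setup just described, assuming the parallel
package (which provides makeCluster) and two local cores, so no
password prompts are needed:

```r
library(rgenoud)
library(parallel)

# Create a socket cluster on two local cores, then hand the cluster
# object to genoud via the cluster argument.
cl <- makeCluster(2)
fit <- genoud(sin, nvars = 1, max = TRUE, cluster = cl, print.level = 0)
stopCluster(cl)
```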
Operators

Genoud uses nine operators. The integer values which are
assigned to each of these operators (P1$\cdots$P9) are
weights.
Genoud calculates the sum $s = P1+P2+\cdots+P9$. Each operator is
assigned a weight equal to $W_{n} = \frac{P_{n}}{s}$. The number of
times an operator is called usually equals $c_{n} = W_{n} \times
pop.size$.
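The weight arithmetic above can be checked directly, assuming each
weight is that operator's share of the total; with the default
operator values (P1 through P8 equal to 50, P9 equal to 0) and
pop.size = 1000:

```r
# Default operator values: P1..P8 = 50, P9 = 0
P <- c(50, 50, 50, 50, 50, 50, 50, 50, 0)

s <- sum(P)         # s = 400
W <- P / s          # each operator's share of the total weight
calls <- W * 1000   # approximate number of calls per operator at pop.size = 1000

# For operators 1..8 this gives 0.125 * 1000 = 125 calls each
```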
Operators 6 (Simple Crossover) and 8 (Heuristic
Crossover) require an even number of individuals to work on---i.e.,
they require two parents. Therefore, the pop.size variable and
the operator set must be such that these two operators have an
even number of individuals to work with. If this does not occur,
genoud automatically upwardly adjusts the population size to make this
constraint hold.
Strong uniqueness checks have been built into the operators to help
ensure that the operators produce offspring different from their
parents, but this does not always happen.
Note that genoud always keeps the best individual each generation.
genoud's 9 operators are: (1) Cloning, (2) Uniform Mutation, (3)
Boundary Mutation, (4) Non-Uniform Mutation, (5) Polytope Crossover,
(6) Simple Crossover, (7) Whole Non-Uniform Mutation, (8) Heuristic
Crossover, and (9) Local-Minimum Crossover.

See also optim.

Examples

library(rgenoud)

# maximize the sin function
sin1 <- genoud(sin, nvars=1, max=TRUE)
#minimize the sin function
sin2 <- genoud(sin, nvars=1, max=FALSE)
#maximize a univariate normal mixture which looks like a claw
claw <- function(xx) {
x <- xx[1]
y <- (0.46*(dnorm(x,-1.0,2.0/3.0) + dnorm(x,1.0,2.0/3.0)) +
(1.0/300.0)*(dnorm(x,-0.5,.01) + dnorm(x,-1.0,.01) + dnorm(x,-1.5,.01)) +
(7.0/300.0)*(dnorm(x,0.5,.07) + dnorm(x,1.0,.07) + dnorm(x,1.5,.07)))
return(y)
}
claw1 <- genoud(claw, nvars=1,pop.size=3000,max=TRUE)
#Plot the previous run
xx <- seq(-3,3,.05)
plot(xx, sapply(xx, claw), type="l", xlab="Parameter", ylab="Fit",
     main="GENOUD: Maximize the Claw Density")
points(claw1$par,claw1$value,col="red")
# Maximize a bivariate normal mixture which looks like a claw.
biclaw <- function(xx) {
mNd2 <- function(x1, x2, mu1, mu2, sigma1, sigma2, rho)
{
z1 <- (x1-mu1)/sigma1
z2 <- (x2-mu2)/sigma2
w <- (1.0/(2.0*pi*sigma1*sigma2*sqrt(1-rho*rho)))
w <- w*exp(-0.5*(z1*z1 - 2*rho*z1*z2 + z2*z2)/(1-rho*rho))
return(w)
}
x1 <- xx[1]+1
x2 <- xx[2]+1
y <- (0.5*mNd2(x1,x2,0.0,0.0,1.0,1.0,0.0) +
0.1*(mNd2(x1,x2,-1.0,-1.0,0.1,0.1,0.0) +
mNd2(x1,x2,-0.5,-0.5,0.1,0.1,0.0) +
mNd2(x1,x2,0.0,0.0,0.1,0.1,0.0) +
mNd2(x1,x2,0.5,0.5,0.1,0.1,0.0) +
mNd2(x1,x2,1.0,1.0,0.1,0.1,0.0)))
return(y)
}
biclaw1 <- genoud(biclaw, default.domains=20, nvars=2,pop.size=5000,max=TRUE)
# For more examples see: http://sekhon.berkeley.edu/rgenoud/R