lab.optimize
is the front-end to a series of heuristic optimization routines (see below), all of which seek to maximize/minimize some bivariate graph statistic (e.g., graph correlation) across a set of vertex relabelings.lab.optimize(d1, d2, FUN, exchange.list=0, seek="min",
opt.method=c("anneal", "exhaustive", "mc", "hillclimb",
"gumbel"), ...)
lab.optimize.anneal(d1, d2, FUN, exchange.list=0, seek="min",
prob.init=1, prob.decay=0.99, freeze.time=1000,
full.neighborhood=TRUE, ...)
lab.optimize.exhaustive(d1, d2, FUN, exchange.list=0, seek="min", ...)
lab.optimize.gumbel(d1, d2, FUN, exchange.list=0, seek="min",
draws=500, tol=1e-5, estimator="median", ...)
lab.optimize.hillclimb(d1, d2, FUN, exchange.list=0, seek="min", ...)
lab.optimize.mc(d1, d2, FUN, exchange.list=0, seek="min",
draws=1000, ...)
lab.optimize.anneal
only)lab.optimize.anneal
only)lab.optimize.anneal
only)lab.optimize.anneal
only)lab.optimize.gumbel
only)lab.optimize.gumbel
only)FUN
FUN
over the set of relabelings permitted by exchange.list
lab.optimize
is the front-end to a family of routines for optimizing a bivariate graph statistic over a set of permissible relabelings (or equivalently, permutations). The accessible permutation set is determined by the exchange.list
argument, which is dealt with in the following manner. First, exchange.list
is expanded to fill an nx2 matrix. If exchange.list
is a single number, this is trivially accomplished by replication; if exchange.list
is a vector of length n, the matrix is formed by cbinding two copies together. If exchange.list
is already an nx2 matrix, it is left as-is. Once the nx2 exchangeabiliy matrix has been formed, it is interpreted as follows: columns refer to graphs 1 and 2, respectively; rows refer to their corresponding vertices in the original adjacency matrices; and vertices are taken to be theoretically exchangeable iff their corresponding exchangeability matrix values are identical. To obtain an unlabeled graph statistic (the default), then, one could simply let exchange.list
equal any single number. To obtain the labeled statistic, one would use the vector 1:n
.Assuming a non-degenerate set of accessible permutations/relabelings, optimization proceeds via the algorithm specified in opt.method
. The optimization routines which are currently implemented use a variety of different techniques, each with certain advantages and disadvantages. A brief summary of each is as follows:
Approximate complexity: on the order of$\prod_{i \in L}|V_i|!$, where L is the set of exchangeability classes.
Approximate complexity: on the order of$|V(G)|^2$per iteration, total complexity dependent on the number of iterations.
full.neighborhood==TRUE
) or a random selection from this set is evaluated. If a superior option is identified, the best of these is chosen. If no superior options are found, then the algorithm chooses randomly from the set of alternatives with probability equal to the current temperature, otherwise retaining its prior solution. After each iteration, the current temperature is reduced by a factor equal toprob.decay
; the initial temperature is set byprob.init
. When a number of iterations equal tofreeze.time
have been completed, the algorithm ``freezes.'' Once ``frozen,'' the annealer hillclimbs from its present location until no improvement is found, and terminates. At termination, the best permutation identified so far is utilized; this need not be the most recent position (though it sometimes is). Simulated annealing is sometimes called ``noisy hill climbing'' because it uses the introduction of random variation to a hill climbing routine to avoid convergence to local optima; it works well on reasonably correlated search spaces with well-defined solution neighborhoods, and is far more robust than hill climbing algorithms. As a general rule, simulated annealing is recommended here for most graphs up to size approximately 50. At this point, computational complexity begins to become a serious barrier, and alternative methods may be more practical.
Approximate complexity: on the order of$|V(G)|^2$*freeze.time
iffull.neighborhood==TRUE
, otherwise complexity scales approximately linearly withfreeze.time
. This can be misleading, however, since failing to search the full neighborhood generally requires thatfreeze.time
be greatly increased.)
Approximate complexity: linear indraws
.
estimator
) is then used as an estimator of the global optimum.Obviously, this approach has certain drawbacks. First of all, our use of the gumbel model in particular assumes an unbounded, continuous underlying distribution, which may or may not be approximately true for any given problem. Secondly, the inherent non-robustness of extremal problems makes the fact that our prediction rests on a string of approximations rather worrisome: our idea of the shape of the underlying distribution could be distorted by a bad sample, our parameter estimation could be somewhat off, etc., any of which could have serious consequences for our extremal prediction. Finally, the prediction which is made by the extreme value model isnonconstructive, in the sense thatno permutation need have been found by the algorithm which induces the predicted value. On the bright side, thiscouldallow one to estimate the optimum without having to find it directly; on the dark side, this means that the reported optimum could be a numerical chimera.
At this time, extreme value estimation should be consideredexperimental, andis not recommended for use on substantive problems.lab.optimize.gumbel
is not guaranteed to work properly, or to produce intelligible results; this may eventually change in future revisions, or the routine may be scrapped altogether.
Approximate complexity: linear indraws
.
This list of algorithms is itself somewhat unstable: some additional techniques (canonical labeling and genetic algorithms, for instance) may be added, and some existing methods (e.g., extreme value estimation) may be modified or removed. Every attempt will be made to keep the command format as stable as possible for other routines (e.g., gscov
, structdist
) which depend on lab.optimize
to do their heavy-lifting. In general, it is not expected that the end-user will call lab.optimize
directly; instead, most end-user interaction with these routines will be via the structural distance/covariance functions which used them.
gscov
, gscor
, structdist
, sdmat
#Generate a random graph and copy it
g<-rgraph(10)
g2<-rmperm(g) #Permute the copy randomly
#Seek the maximum correlation
lab.optimize(g,g2,gcor,seek="max",opt.method="anneal",freeze.time=50,
prob.decay=0.9)
#These two don't do so well...
lab.optimize(g,g2,gcor,seek="max",opt.method="hillclimb")
lab.optimize(g,g2,gcor,seek="max",opt.method="mc",draws=1000)
Run the code above in your browser using DataLab