qrrvglm.control(Rank = 1, Bestof = if (length(Cinit)) 1 else 10,
    checkwz = TRUE, Cinit = NULL, Crow1positive = TRUE,
    epsilon = 1.0e-06, EqualTolerances = NULL, eq.tolerances = TRUE,
    Etamat.colmax = 10, FastAlgorithm = TRUE, GradientFunction = TRUE,
    Hstep = 0.001, isd.latvar = rep_len(c(2, 1, rep_len(0.5, Rank)), Rank),
    iKvector = 0.1, iShape = 0.1, ITolerances = NULL,
    I.tolerances = FALSE, maxitl = 40, imethod = 1, Maxit.optim = 250,
    MUXfactor = rep_len(7, Rank), noRRR = ~ 1, Norrr = NA,
    optim.maxit = 20, Parscale = if (I.tolerances) 0.001 else 1.0,
    sd.Cinit = 0.02, SmallNo = 5.0e-13, trace = TRUE,
    Use.Init.Poisson.QO = TRUE,
    wzepsilon = .Machine$double.eps^0.75, ...)

Rank: The numerical rank $R$ of the model. The variables making up the
$x_1$ vector are given by the terms in the noRRR argument, and the rest
of the terms comprise $x_2$.
Bestof: Integer. The best of Bestof models fitted is returned.
This argument helps guard against local solutions by (hopefully)
finding the global solution from many fits. The argument has value
1 if an initial value for $C$ is inputted using Cinit.
checkwz: Logical. Whether the diagonal elements of the working weight
matrices should be checked to see whether they are sufficiently
positive, i.e., greater than wzepsilon. If not, any values less than
wzepsilon are replaced with this value.
Cinit: Optional initial value for the matrix $C$. The default is to
apply .Init.Poisson.QO() to obtain
initial values.
Crow1positive: Logical vector of length Rank (recycled if necessary): are
the elements of the first row of $C$ positive? For example,
if Rank is 4, then specifying Crow1positive = c(FALSE,
TRUE) will force $C[1,1]$ and $C[1,3]$ to be negative,
and $C[1,2]$ and $C[1,4]$ to be positive. This argument
allows for a reflection in the ordination axes because the
coefficients of the latent variables are unique up to a sign.
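As a hedged illustration (the data frame mydata and the species and
environmental variable names are hypothetical placeholders, not from
this help page), a rank-2 fit with the first axis reflected might look
like:

library(VGAM)
# Sketch only: Crow1positive = c(FALSE, TRUE) forces C[1,1] < 0
# and C[1,2] > 0, i.e., it reflects the first ordination axis.
fit <- cqo(cbind(sp1, sp2, sp3) ~ env1 + env2 + env3, poissonff,
           data = mydata, Rank = 2, Crow1positive = c(FALSE, TRUE))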
eq.tolerances: Logical indicating whether each (quadratic) predictor
will have equal tolerances. Setting eq.tolerances = TRUE
can help avoid numerical problems, especially with binary data.
Note that the estimated (common) tolerance matrix may or may
not be positive-definite. If it is then it can be scaled to
the $R$ by $R$ identity matrix, i.e., made equivalent
to I.tolerances = TRUE. Setting I.tolerances = TRUE
will impose a common $R$ by $R$ identity matrix as
the tolerance matrix on the data, even if it is not appropriate.
In general, setting I.tolerances = TRUE is
preferred over eq.tolerances = TRUE because,
if it works, it is much faster and uses less memory.
However, I.tolerances = TRUE requires the
environmental variables to be scaled appropriately.
See Details for more details.
EqualTolerances: Defunct argument. Use eq.tolerances instead.
Etamat.colmax: Positive integer, no smaller than Rank. Controls the amount
of memory used by .Init.Poisson.QO(). It is the maximum
number of columns allowed for the pseudo-response and its weights.
In general, the larger the value, the better the initial value.
Used only if Use.Init.Poisson.QO = TRUE.
FastAlgorithm: Logical. Whether the new fast algorithm is to be used.
Setting FastAlgorithm = FALSE will give an error.
GradientFunction: Logical. Whether optim's argument gr
is used or not, i.e., to compute gradient values. Used only if
FastAlgorithm is TRUE. The default value is usually
faster on most problems.
Hstep: Positive value. Used as the step size in the finite-difference
approximation to the derivatives by optim.
isd.latvar: Positive numeric, of length Rank (recycled if necessary).
Used when I.tolerances = TRUE. Used by
.Init.Poisson.QO() to obtain initial values for the constrained
coefficients $C$ adjusted to a reasonable value. It adjusts the
spread of the site scores relative to a common species tolerance of 1
for each ordination axis. A value between 0.5 and 10 is recommended;
a value such as 10 means that the range of the environmental space is
very large relative to the niche width of the species. The successive
values should decrease because the first ordination axis should have
the most spread of site scores, followed by the second ordination
axis, etc.
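A minimal sketch, assuming a hypothetical data frame mydata with
species counts sp1-sp3 and environmental variables env1-env3, of
supplying decreasing spreads for a rank-2 model and checking them
afterwards:

library(VGAM)
fit <- cqo(cbind(sp1, sp2, sp3) ~ env1 + env2 + env3, poissonff,
           data = mydata, Rank = 2, I.tolerances = TRUE,
           isd.latvar = c(2, 1))  # axis 1 has more spread than axis 2
apply(latvar(fit), 2, sd)         # should be roughly 2 and 1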
iKvector, iShape: Numeric, recycled to length $S$ if necessary. Initial
values for the positive $k$ and shape parameters of
negbinomial and gamma2 respectively.
These arguments override the ik and ishape
arguments in negbinomial and gamma2.
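A hedged sketch (hypothetical overdispersed count data, as above) of
overriding the initial k values when fitting a negative binomial CQO:

library(VGAM)
fit <- cqo(cbind(sp1, sp2, sp3) ~ env1 + env2 + env3, negbinomial,
           data = mydata, Rank = 1,
           iKvector = 0.5)  # overrides the ik argument of negbinomial()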
I.tolerances: Logical. If TRUE then the (common) tolerance matrix is the
$R$ by $R$ identity matrix by definition. Note that having
I.tolerances = TRUE implies eq.tolerances = TRUE, but
not vice versa. Internally, the quadratic terms will be treated as
offsets (in GLM jargon) and so the models can potentially be fitted
very efficiently. However, it is a very good idea to center
and scale all numerical variables in the $x_2$ vector.
See Details for more details.
The success of I.tolerances = TRUE often
depends on suitable values for isd.latvar and/or
MUXfactor.
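Because I.tolerances = TRUE assumes suitably scaled predictors, a
minimal sketch (assuming a hypothetical mydata whose first three
columns are the numeric $x_2$ variables):

library(VGAM)
mydata[, 1:3] <- scale(mydata[, 1:3])  # centre and scale the x_2 variables
fit <- cqo(cbind(sp1, sp2, sp3) ~ env1 + env2 + env3, poissonff,
           data = mydata, I.tolerances = TRUE)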
ITolerances: Defunct argument. Use I.tolerances instead.
maxitl: Positive integer. Maximum number of iterations allowed for the
IRLS algorithm. Used by negbinomial and
gamma2 only, and used within the C code.
Maxit.optim: Positive integer. Number of iterations given to the
function optim at each of the optim.maxit
iterations.
MUXfactor: Multiplication factor for detecting large offset values.
Used only if I.tolerances = TRUE. Offsets are $-0.5$
multiplied by the sum of the squares of all $R$ latent variable
values. If the latent variable values are too large then this will
result in numerical problems. By too large, it is meant that the
standard deviation of the latent variable values is greater than
MUXfactor[r] * isd.latvar[r] for r=1:Rank (this is why
centering and scaling all the numerical predictor variables in
$x_2$ is recommended). A value about 3 or 4 is recommended.
If failure to converge occurs, try a slightly lower value.
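A hedged way to inspect this on a fitted model (fit is a hypothetical
fitted object; the defaults are MUXfactor = 7 and
isd.latvar = c(2, 1, 0.5, ...)):

library(VGAM)
# The sd of each latent variable should stay below
# MUXfactor[r] * isd.latvar[r] for each axis r.
apply(latvar(fit), 2, sd)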
Norrr: Defunct. Please use noRRR.
Use of Norrr will become an error soon.
Parscale: Numeric and positive-valued vector (recycled if necessary).
Passed into optim(..., control = list(parscale = Parscale));
the elements of $C$ become $C$ / Parscale.
Setting I.tolerances = TRUE results in line searches that
are very large, therefore $C$ has to be scaled accordingly
to avoid large step sizes.
See Details for more information.
It's probably best to leave this argument alone.
sd.Cinit: Standard deviation of the initial values for the elements of
$C$. These are normally distributed with mean zero.
Used only if Use.Init.Poisson.QO = FALSE
and $C$ is not inputted using Cinit.
trace: Logical indicating if output should be produced for each
iteration. The default is TRUE because the calculations are
numerically intensive, meaning they may take a long time, so that the
user might think the computer has locked up if trace = FALSE.

SmallNo: Positive numeric between .Machine$double.eps and 0.0001.
Used to avoid under- or over-flow in the IRLS algorithm.
Used only if FastAlgorithm is TRUE.
Use.Init.Poisson.QO: Logical. If TRUE then the function .Init.Poisson.QO() is
used to obtain initial values for the canonical coefficients $C$.
If FALSE then random numbers are used instead.
The default value of Bestof is a bare minimum for many datasets,
therefore it will be necessary to increase its value to increase the
chances of obtaining the global solution, as in the sketch below.
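A minimal sketch (hypothetical data and names, as before) of
requesting more random starts:

library(VGAM)
set.seed(123)  # make the random restarts reproducible
fit <- cqo(cbind(sp1, sp2, sp3) ~ env1 + env2 + env3, poissonff,
           data = mydata, Bestof = 25)  # 25 fits; the best is returned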
Having I.tolerances = TRUE means all the tolerance matrices
are the order-$R$ identity matrix, i.e., it forces
bell-shaped curves/surfaces on all species. This results in a
more difficult optimization problem (especially for 2-parameter
models such as the negative binomial and gamma) because of overflow
errors and it appears there are more local solutions. To help avoid
the overflow errors, scaling $C$ by the factor Parscale
can help enormously. Even better, scaling $C$ by specifying
isd.latvar is easier to interpret. If failure to
converge occurs, try adjusting Parscale, or better, setting
eq.tolerances = TRUE (and hope that the estimated tolerance
matrix is positive-definite). To fit an equal-tolerances model, it
is best to first try setting I.tolerances = TRUE and varying
isd.latvar and/or MUXfactor if it fails to converge.
If it still fails to converge after many attempts, try setting
eq.tolerances = TRUE, however this will usually be a lot slower
because it requires a lot more memory.
With a rank $R > 1$ model, the latent variables are always
uncorrelated, i.e., the variance-covariance matrix of the site scores
is diagonal; this can be checked directly, as sketched below.
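A quick check on a hypothetical fitted object fit:

library(VGAM)
round(cov(latvar(fit)), 4)  # off-diagonal entries should be ~0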
If eq.tolerances = TRUE is used and the common
estimated tolerance matrix is positive-definite then that model is
effectively the same as the I.tolerances = TRUE model (the two are
transformations of each other). In general, I.tolerances = TRUE
is numerically more unstable and presents a more difficult problem
to optimize; the arguments isd.latvar and/or MUXfactor often
must be assigned some good value(s) (possibly found by trial and error)
in order for convergence to occur. Setting I.tolerances = TRUE
forces a bell-shaped curve or surface onto all the species data,
therefore this option should be used with deliberation. If unsuitable,
the resulting fit may be very misleading. Usually it is a good idea
for the user to set eq.tolerances = FALSE to see which species
appear to have a bell-shaped curve or surface. Improvements to the
fit can often be achieved using transformations, e.g., nitrogen
concentration to log nitrogen concentration.
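For instance, a hedged sketch assuming a hypothetical predictor
nitrogen in mydata:

library(VGAM)
fit2 <- cqo(cbind(sp1, sp2, sp3) ~ log(nitrogen) + env2 + env3,
            poissonff, data = mydata, eq.tolerances = FALSE)
Tol(fit2)  # inspect the species' estimated tolerance matrices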
Fitting a CAO model (see cao) first is a good idea for
pre-examining the data and checking whether it is appropriate to fit
a CQO model.
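A hedged sketch of such a pre-check (hypothetical data, as above):

library(VGAM)
check <- cao(cbind(sp1, sp2, sp3) ~ env1 + env2 + env3, poissonff,
             data = mydata, Rank = 1)
plot(check)  # eyeball whether the fitted curves look bell-shaped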
Yee, T. W. (2006) Constrained additive ordination. Ecology, 87, 203--213.
cqo,
rcqo,
Coef.qrrvglm,
Coef.qrrvglm-class,
optim,
binomialff,
poissonff,
negbinomial,
gamma2,
gaussianff.

## Not run: 
# Poisson CQO with equal tolerances
set.seed(111)  # This leads to the global solution
hspider[, 1:6] <- scale(hspider[, 1:6])  # Good idea when I.tolerances = TRUE
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr, Arctlute, Arctperi, Auloalbi,
                Pardlugu, Pardmont, Pardnigr, Pardpull, Trocterr, Zoraspin) ~
          WaterCon + BareSand + FallTwig + CoveMoss + CoveHerb + ReflLux,
          quasipoissonff, data = hspider, eq.tolerances = TRUE)
sort(deviance(p1, history = TRUE))  # A history of all the iterations

(isd.latvar <- apply(latvar(p1), 2, sd))  # Should be approx isd.latvar

# Refit the model with better initial values
set.seed(111)  # This leads to the global solution
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr, Arctlute, Arctperi, Auloalbi,
                Pardlugu, Pardmont, Pardnigr, Pardpull, Trocterr, Zoraspin) ~
          WaterCon + BareSand + FallTwig + CoveMoss + CoveHerb + ReflLux,
          I.tolerances = TRUE, quasipoissonff, data = hspider,
          isd.latvar = isd.latvar)  # Note the use of isd.latvar here
sort(deviance(p1, history = TRUE))  # A history of all the iterations
## End(Not run)