Algorithmic constants and parameters for a constrained quadratic
  ordination (CQO), by fitting a quadratic reduced-rank vector
  generalized linear model (QRR-VGLM), are set using this function.
  It is the control function for cqo.
qrrvglm.control(Rank = 1, Bestof = if (length(Cinit)) 1 else 10,
    checkwz = TRUE, Cinit = NULL, Crow1positive = TRUE,
    epsilon = 1.0e-06, EqualTolerances = NULL, eq.tolerances = TRUE,
    Etamat.colmax = 10, FastAlgorithm = TRUE, GradientFunction = TRUE,
    Hstep = 0.001, isd.latvar = rep_len(c(2, 1, rep_len(0.5, Rank)), Rank),
    iKvector = 0.1, iShape = 0.1, ITolerances = NULL, I.tolerances = FALSE,
    maxitl = 40, imethod = 1, Maxit.optim = 250, MUXfactor = rep_len(7, Rank),
    noRRR = ~ 1, Norrr = NA, optim.maxit = 20,
    Parscale = if (I.tolerances) 0.001 else 1.0,
    sd.Cinit = 0.02, SmallNo = 5.0e-13, trace = TRUE,
    Use.Init.Poisson.QO = TRUE, wzepsilon = .Machine$double.eps^0.75, ...)The numerical rank \(R\) of the model, i.e., the
    number of ordination axes. Must be an element from the set
    {1,2,…,min(\(M\),\(p_2\))} where the vector of explanatory
    variables \(x\) is partitioned into (\(x_1\),\(x_2\)), which is
    of dimension \(p_1+p_2\). The variables making up \(x_1\)
    are given by the terms in the noRRR argument, and the rest
    of the terms comprise \(x_2\).
Integer. The best of Bestof models fitted is returned.
    This argument helps guard against local solutions by (hopefully)
    finding the global solution from many fits. The argument has value
    1 if an initial value for \(C\) is inputted using Cinit.
logical indicating whether the diagonal elements of
    the working weight matrices should be checked whether they are
    sufficiently positive, i.e., greater than wzepsilon. If not,
    any values less than wzepsilon are replaced with this value.
Optional initial \(C\) matrix, which must be a \(p_2\) by \(R\)
    matrix. The default is to apply .Init.Poisson.QO() to obtain
    initial values.
Logical vector of length Rank (recycled if necessary): are
      the elements of the first row of \(C\) positive? For example,
      if Rank is 4, then specifying Crow1positive = c(FALSE,
      TRUE) will force \(C[1,1]\) and \(C[1,3]\) to be negative,
      and \(C[1,2]\) and \(C[1,4]\) to be positive. This argument
      allows for a reflection in the ordination axes because the
      coefficients of the latent variables are unique up to a sign.
Positive numeric. Used to test for convergence for GLMs fitted in C. Larger values mean a loosening of the convergence criterion. If an error code of 3 is reported, try increasing this value.
Logical indicating whether each (quadratic) predictor will
    have equal tolerances. Having eq.tolerances = TRUE
    can help avoid numerical problems, especially with binary data.
    Note that the estimated (common) tolerance matrix may or may
    not be positive-definite. If it is  then it can be scaled to
    the \(R\) by \(R\) identity matrix, i.e., made equivalent
    to I.tolerances = TRUE. Setting I.tolerances = TRUE
    will force a common \(R\) by \(R\) identity matrix as
    the tolerance matrix to the data even if it is not appropriate.
    In general, setting I.tolerances = TRUE is
    preferred over eq.tolerances = TRUE because,
    if it works, it is much faster and uses less memory.
    However, I.tolerances = TRUE requires the
    environmental variables to be scaled appropriately.
    See Details for more details.
Defunct argument.
    Use eq.tolerances instead.
Positive integer, no smaller than Rank. Controls the amount
    of memory used by .Init.Poisson.QO(). It is the maximum
    number of columns allowed for the pseudo-response and its weights.
    In general, the larger the value, the better the initial value.
    Used only if Use.Init.Poisson.QO = TRUE.
Logical. Whether a new fast algorithm is to be used. The fast
   algorithm results in a large speed increases compared to Yee
   (2004).
   Some details of the fast algorithm are found in Appendix A of Yee (2006).
   Setting FastAlgorithm = FALSE will give an error.
Logical. Whether optim's argument gr
   is used or not, i.e., to compute gradient values. Used only if
   FastAlgorithm is TRUE. The default value is usually
   faster on most problems.
Positive value. Used as the step size in the finite difference
   approximation to the derivatives by optim.
Initial standard deviations for the latent variables (site scores).
   Numeric, positive and of length \(R\) (recycled if necessary).
   This argument is used only if I.tolerances = TRUE. Used by
   .Init.Poisson.QO() to obtain initial values for the constrained
   coefficients \(C\) adjusted to a reasonable value. It adjusts the
   spread of the site scores relative to a common species tolerance of 1
   for each ordination axis. A value between 0.5 and 10 is recommended;
   a value such as 10 means that the range of the environmental space is
   very large relative to the niche width of the species. The successive
   values should decrease because the first ordination axis should have
   the most spread of site scores, followed by the second ordination
   axis, etc.
Numeric, recycled to length \(S\) if necessary.
   Initial values used for estimating the positive \(k\) and
   \(\lambda\) parameters of the negative binomial and
   2-parameter gamma distributions respectively. For further information
   see negbinomial and gamma2.
   These arguments override the ik and ishape
   arguments in negbinomial and gamma2.
Logical. If TRUE then the (common) tolerance matrix is the
   \(R\) by \(R\) identity matrix by definition. Note that having
   I.tolerances = TRUE implies eq.tolerances = TRUE, but
   not vice versa. Internally, the quadratic terms will be treated as
   offsets (in GLM jargon) and so the models can potentially be fitted
   very efficiently. However, it is a very good idea to center
   and scale all numerical variables in the \(x_2\) vector.
   See Details for more details.
   The success of I.tolerances = TRUE often
   depends on suitable values for isd.latvar and/or
   MUXfactor.
Defunct argument.
    Use I.tolerances instead.
Maximum number of times the optimizer is called or restarted. Most users should ignore this argument.
Method of initialization. A positive integer 1 or 2 or 3 etc.
    depending on the VGAM family function.
    Currently it is used for negbinomial and
    gamma2 only, and used within the C.
Positive integer. Number of iterations given to the function
    optim at each of the optim.maxit
    iterations.
Multiplication factor for detecting large offset values. Numeric,
   positive and of length \(R\) (recycled if necessary). This argument
   is used only if I.tolerances = TRUE. Offsets are \(-0.5\)
   multiplied by the sum of the squares of all \(R\) latent variable
   values. If the latent variable values are too large then this will
   result in numerical problems. By too large, it is meant that the
   standard deviation of the latent variable values are greater than
   MUXfactor[r] * isd.latvar[r] for r=1:Rank (this is why
   centering and scaling all the numerical predictor variables in
   \(x_2\) is recommended). A value about 3 or 4 is recommended.
   If failure to converge occurs, try a slightly lower value.
Formula giving terms that are not to be included in the reduced-rank regression (or formation of the latent variables), i.e., those belong to \(x_1\). Those variables which do not make up the latent variable (reduced-rank regression) correspond to the \(B_1\) matrix. The default is to omit the intercept term from the latent variables.
Defunct. Please use noRRR.
  Use of Norrr will become an error soon.
Numerical and positive-valued vector of length \(C\)
   (recycled if necessary).
   Passed into optim(..., control = list(parscale = Parscale));
   the elements of \(C\) become \(C\) / Parscale.
   Setting I.tolerances = TRUE results in line searches that
   are very large, therefore \(C\) has to be scaled accordingly
   to avoid large step sizes.
   See Details for more information.
   It's probably best to leave this argument alone.
Standard deviation of the initial values for the elements
      of \(C\).
      These are normally distributed with mean zero.
      This argument is used only if Use.Init.Poisson.QO = FALSE
      and \(C\) is not inputted using Cinit.
Logical indicating if output should be produced for
    each iteration. The default is TRUE because the
    calculations are numerically intensive, meaning it may take
    a long time, so that the user might think the computer has
    locked up if trace = FALSE.
Positive numeric between .Machine$double.eps and 0.0001.
  Used to avoid under- or over-flow in the IRLS algorithm.
  Used only if FastAlgorithm is TRUE.
Logical. If TRUE then the function .Init.Poisson.QO() is
    used to obtain initial values for the canonical coefficients \(C\).
    If FALSE then random numbers are used instead.
Small positive number used to test whether the diagonals of the working weight matrices are sufficiently positive.
Ignored at present.
A list with components matching the input names.
The default value of Bestof is a bare minimum for many datasets,
  therefore it will be necessary to increase its value to increase the
  chances of obtaining the global solution.
Recall that the central formula for CQO is $$\eta = B_1^T x_1 + A \nu + \sum_{m=1}^M (\nu^T D_m \nu) e_m$$ where \(x_1\) is a vector (usually just a 1 for an intercept), \(x_2\) is a vector of environmental variables, \(\nu=C^T x_2\) is a \(R\)-vector of latent variables, \(e_m\) is a vector of 0s but with a 1 in the \(m\)th position. QRR-VGLMs are an extension of RR-VGLMs and allow for maximum likelihood solutions to constrained quadratic ordination (CQO) models.
Having I.tolerances = TRUE means all the tolerance matrices
   are the order-\(R\) identity matrix, i.e., it forces
   bell-shaped curves/surfaces on all species. This results in a
   more difficult optimization problem (especially for 2-parameter
   models such as the negative binomial and gamma) because of overflow
   errors and it appears there are more local solutions. To help avoid
   the overflow errors, scaling \(C\) by the factor Parscale
   can help enormously. Even better, scaling \(C\) by specifying
   isd.latvar is more understandable to humans. If failure to
   converge occurs, try adjusting Parscale, or better, setting
   eq.tolerances = TRUE (and hope that the estimated tolerance
   matrix is positive-definite). To fit an equal-tolerances model, it
   is firstly best to try setting I.tolerances = TRUE and varying
   isd.latvar and/or MUXfactor if it fails to converge.
   If it still fails to converge after many attempts, try setting
   eq.tolerances = TRUE, however this will usually be a lot slower
   because it requires a lot more memory.
With a \(R > 1\) model, the latent variables are always uncorrelated, i.e., the variance-covariance matrix of the site scores is a diagonal matrix.
If setting eq.tolerances = TRUE is used and the common
   estimated tolerance matrix is positive-definite then that model is
   effectively the same as the I.tolerances = TRUE model (the two are
   transformations of each other). In general, I.tolerances = TRUE
   is numerically more unstable and presents a more difficult problem
   to optimize; the arguments isd.latvar and/or MUXfactor often
   must be assigned some good value(s) (possibly found by trial and error)
   in order for convergence to occur. Setting I.tolerances = TRUE
   forces a bell-shaped curve or surface onto all the species data,
   therefore this option should be used with deliberation. If unsuitable,
   the resulting fit may be very misleading. Usually it is a good idea
   for the user to set eq.tolerances = FALSE to see which species
   appear to have a bell-shaped curve or surface. Improvements to the
   fit can often be achieved using transformations, e.g., nitrogen
   concentration to log nitrogen concentration.
Fitting a CAO model (see cao) first is a good idea for
   pre-examining the data and checking whether it is appropriate to fit
   a CQO model.
Yee, T. W. (2004). A new technique for maximum-likelihood canonical Gaussian ordination. Ecological Monographs, 74, 685--701.
Yee, T. W. (2006). Constrained additive ordination. Ecology, 87, 203--213.
cqo,
  rcqo,
  Coef.qrrvglm,
  Coef.qrrvglm-class,
  optim,
  binomialff,
  poissonff,
  negbinomial,
  gamma2.
# NOT RUN {
 # Poisson CQO with equal tolerances
set.seed(111)  # This leads to the global solution
hspider[,1:6] <- scale(hspider[,1:6])  # Good idea when I.tolerances = TRUE
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr, Arctlute, Arctperi, Auloalbi,
               Pardlugu, Pardmont, Pardnigr, Pardpull, Trocterr, Zoraspin) ~
          WaterCon + BareSand + FallTwig + CoveMoss + CoveHerb + ReflLux,
          poissonff, data = hspider, eq.tolerances = TRUE)
sort(deviance(p1, history = TRUE))  # A history of all the iterations
(isd.latvar <- apply(latvar(p1), 2, sd))  # Should be approx isd.latvar
# Refit the model with better initial values
set.seed(111)  # This leads to the global solution
p1 <- cqo(cbind(Alopacce, Alopcune, Alopfabr, Arctlute, Arctperi, Auloalbi,
               Pardlugu, Pardmont, Pardnigr, Pardpull, Trocterr, Zoraspin) ~
          WaterCon + BareSand + FallTwig + CoveMoss + CoveHerb + ReflLux,
          I.tolerances = TRUE, poissonff, data = hspider,
          isd.latvar = isd.latvar)  # Note the use of isd.latvar here
sort(deviance(p1, history = TRUE))  # A history of all the iterations
# }
Run the code above in your browser using DataLab