cao(formula, family, data = list(),
    weights = NULL, subset = NULL, na.action = na.fail,
    etastart = NULL, mustart = NULL, coefstart = NULL,
    control = cao.control(...), offset = NULL,
    method = "cao.fit", model = FALSE, x.arg = TRUE, y.arg = TRUE,
    contrasts = NULL, constraints = NULL,
    extra = NULL, qr.arg = FALSE, smart = TRUE, ...)"vglmff" (see vglmff-class)
  describing what statistical model is to be fitted. This is called a
  ``environment(formula),
    typically the environment from which cao is called.cao, this argument currently should
    not be used.NAs.  The default is set by the na.action setting of
    options, and is na.fail if that iscao,
    this argument currently should not be used.cao, this argument currently should not be used.cao, this
    argument currently should not be used.cao.control for details.cao, this argument currently should not be used.cao.fit uses iteratively
    reweighted least squares (IRLS) within FORTRAN code called from
    optimmodel slot.x and y slots.  Note the model matrix is the linear
    model (LM) matrix.contrasts.arg of
    model.matrix.default.cao, this
    argument currently should not be used.  The components of the list
    must be named with the term it corresponds to (and it must match
    in character format).  Each constraint macao, this argument currently should
    not be used.cao, this argument currently should not be used.smartpred) will be used.cao.control.
  Use set.seed just prior to calling
  cao() to make your results reproducible.
  The reason for this is finding the optimal
  CAO model presents a difficult optimization problem, partly because the
  log-likelihood function contains many local solutions. To obtain the
  (global) solution the user is advised to try many initial values.
  This can be done by setting Bestof some appropriate value
  (see cao.control). Trying many initial values becomes
  progressively more important as the nonlinear degrees of freedom of
  the smooths increase.
  Currently the dispersion parameter for a
  gaussianff CAO model is estimated slightly differently
  and may be slightly biassed downwards (usually a little too small).
cao are a mixture of those from
  vgam and cqo, but with some extras
  in cao.control. Currently, not all of the
  arguments work properly.
  CAO can be loosely be thought of as the result of fitting generalized
  additive models (GAMs) to several responses (e.g., species) against
  a very small number of latent variables.  Each latent variable is a
  linear combination of the explanatory variables; the coefficients
  C (called $C$ below) are called constrained
  coefficients or canonical coefficients, and are interpreted as
  weights or loadings. The C are estimated by maximum likelihood
  estimation.  It is often a good idea to apply scale
  to each explanatory variable first.
  For each response (e.g., species), each latent variable is smoothed
  by a cubic smoothing spline, thus CAO is data-driven. If each smooth
  were a quadratic then CAO would simplify to constrained quadratic
  ordination (CQO; formerly called canonical Gaussian ordination
  or CGO).
  If each smooth were linear then CAO would simplify to constrained
  linear ordination (CLO). CLO can theoretically be fitted with
  cao by specifying df1.nl=0, however it is more efficient
  to use rrvglm.
  Currently, only Rank=1 is implemented, and only
  noRRR = ~1 models are handled.
  With binomial data, the default formula is
  $$logit(P[Y_s=1]) = \eta_s = f_s(\nu), \ \ \ s=1,2,\ldots,S$$
  where $x_2$ is a vector of environmental variables, and
  $\nu=C^T x_2$ is a $R$-vector of latent variables.
  The $\eta_s$ is an additive predictor for species $s$,
  and it models the probabilities of presence as an additive model on
  the logit scale.  The matrix $C$ is estimated from the data, as
  well as the smooth functions $f_s$.  The argument noRRR = ~
  1 specifies that the vector $x_1$, defined for RR-VGLMs
  and QRR-VGLMs, is simply a 1 for an intercept.
  Here, the intercept in the model is absorbed into the functions.
  A cloglog link may be preferable over a
  logit link.
With Poisson count data, the formula is $$\log(E[Y_s]) = \eta_s = f_s(\nu)$$ which models the mean response as an additive models on the log scale.
  The fitted latent variables (site scores) are scaled to have
  unit variance.  The concept of a tolerance is undefined for
  CAO models, but the optima and maxima are defined. The generic
  functions Max and Opt should work for
  CAO objects, but note that if the maximum occurs at the boundary then
  Max will return a NA.  Inference for CAO models
  is currently undeveloped.
  Documentation accompanying the 
cao.control,
  Coef.cao,
  cqo,
  latvar,
  Opt,
  Max,
  persp.cao,
  poissonff,
  binomialff,
  negbinomial,
  gamma2,
  gaussianff,
  set.seed,
  gam,
  trapO.hspider[, 1:6] <- scale(hspider[, 1:6])  # Standardized environmental vars
set.seed(149)  # For reproducible results 
ap1 <- cao(cbind(Pardlugu, Pardmont, Pardnigr, Pardpull) ~
           WaterCon + BareSand + FallTwig + CoveMoss + CoveHerb + ReflLux,
           family = poissonff, data = hspider, Rank = 1,
           df1.nl = c(Pardpull= 2.7, 2.5),
           Bestof = 7, Crow1positive = FALSE)
sort(ap1@misc$deviance.Bestof)  # A history of all the iterations
Coef(ap1)
concoef(ap1)
par(mfrow = c(2, 2))
plot(ap1)  # All the curves are unimodal; some quite symmetric
par(mfrow = c(1, 1), las = 1)
index <- 1:ncol(depvar(ap1))
lvplot(ap1, lcol = index, pcol = index, y = TRUE)
trplot(ap1, label = TRUE, col = index)
abline(a = 0, b = 1, lty = 2)
trplot(ap1, label = TRUE, col = "blue", log = "xy", which.sp = c(1, 3))
abline(a = 0, b = 1, lty = 2)
persp(ap1, col = index, lwd = 2, label = TRUE)
abline(v = Opt(ap1), lty = 2, col = index)
abline(h = Max(ap1), lty = 2, col = index)Run the code above in your browser using DataLab