- formula
a symbolic description of the model to be fit
for the model using R's model syntax. Only
right-hand side of formula is specified. See example below. Random intercepts
and slopes are allowed using lme4 syntax (Bates et al. 2015).
- data
a list containing data necessary for model fitting.
Valid tags are y
, covs
, z
, coords
, and offset
. y
is a vector, matrix, or data frame of the observed count values. If a vector,
the values represent the observed counts at each site. If multiple replicate
observations are obtained at the sites (e.g., sub-samples, repeated sampling over
multiple seasons), y
can be specified as a matrix or data frame
with first dimension equal to the number of
sites (\(J\)) and second dimension equal to the maximum number of
replicates at a given site. covs
is either a data frame or list
containing the variables used in the model. When only fitting a model with site-level
data, covs
can be specified as a data frame, with each row corresponding to
site and each column corresponding to a variable. When multiple abundance values
are available at a site, covs
is specified as a list, where each list element is a different
covariate, which can be site-level or observation-level. Site-level covariates
are specified as a vector of length \(J\), while observation-level covariates
are specified as a matrix or data frame with the number of rows equal to \(J\)
and number of columns equal to the maximum number of replicate observations at a
given site. coords
is a \(J \times 2\) matrix of the observation coordinates.
Note that spAbundance
assumes coordinates are specified
in a projected coordinate system. For zero-inflated Gaussian models, the tag z
is
used to specify the binary component of the zero-inflated model and should have the same
length as y
. offset
is an offset to use in the abundance model (e.g., an area offset).
This can be either a single value, a vector with an offset for each site (e.g., if survey area differed in size), or a site x replicate matrix if more than one count is available at a given site.
- inits
a list with each tag corresponding to a parameter name.
Valid tags are beta
, sigma.sq
,
phi
, w
, nu
, kappa
, sigma.sq.mu
, tau.sq
.
nu
is only specified if cov.model = "matern"
, sigma.sq.mu
is only specified if there are random effects in formula
, and
kappa
is only specified when family = 'NB'
.
tau.sq
is only specified when family = 'Gaussian'
or family = 'zi-Gaussian'
.
The value portion of each tag is the parameter's initial value. See priors
description for definition of each parameter name.
Additionally, the tag fix
can be set to TRUE
to fix the starting values across all chains. If fix
is not specified
(the default), starting values are varied randomly across chains.
- priors
a list with each tag corresponding to a parameter name.
Valid tags are beta.normal
, phi.unif
,
sigma.sq.ig
, nu.unif
, kappa.unif
,
sigma.sq.mu.ig
, tau.sq.ig
. Abundance (beta
) regression coefficients
are assumed to follow a normal distribution. The hyperparameters of the
normal distribution are passed as a list of length two with the first
and second elements corresponding to the mean and variance of the normal
distribution, which are each specified as vectors of
length equal to the number of coefficients to be estimated or of length
one if priors are the same for all coefficients. If not
specified, prior means are set to 0 and prior variances are set to 100. The
spatial variance parameter, sigma.sq
, is assumed to follow an
inverse-Gamma distribution. The spatial decay phi
, spatial
smoothness nu
, and negative binomial dispersion kappa
parameters are assumed to follow Uniform
distributions. The hyperparameters of the inverse-Gamma for sigma.sq
are passed as a vector of length two, with the first and second
elements corresponding to the shape and scale, respectively.
The hyperparameters of the Uniform are also passed as a vector of
length two with the first and second elements corresponding to
the lower and upper support, respectively. sigma.sq.mu
are the random effect variances for any random effects, and are assumed to follow an
inverse-Gamma distribution. The hyperparameters of
the inverse-Gamma distribution are passed as a list of length two with the
first and second elements corresponding to the shape and scale parameters,
respectively, which are each specified as vectors of length equal to the
number of random effects or of length one if priors are the same for all
random effect variances. tau.sq
is the residual variance
for Gaussian (or zero-inflated Gaussian) models, and it is assigned
an inverse-Gamma prior. The hyperparameters of the inverse-Gamma are passed as a vector
of length two, with the first and second element corresponding to the shape and
scale parameters, respectively.
- cov.model
a quoted keyword that specifies the covariance
function used to model the spatial dependence structure among the
observations. Supported covariance model key words are:
"exponential"
, "matern"
, "spherical"
, and
"gaussian"
.
- tuning
a single numeric value representing the initial variance of the
adaptive sampler for beta
, alpha
, beta.star
(the abundance
random effect values), kappa
, phi
,
and nu
. See Roberts and Rosenthal (2009) for details. Note that only phi
and nu
are the only parameters that require tuning for a Gaussian or
zero-inflated Gaussian model.
- NNGP
if TRUE
, model is fit with an NNGP. See Datta et al. (2016) and
Finley et al. (2019) for more information. Currently only NNGP is supported,
functionality for a full GP may be addded in future package development.
- n.neighbors
number of neighbors used in the NNGP. Only used if
NNGP = TRUE
. Datta et al. (2016) showed that 15 neighbors is usually
sufficient, but that as few as 5 neighbors can be adequate for certain data
sets, which can lead to even greater decreases in run time. We recommend
starting with 15 neighbors (the default) and if additional gains in computation
time are desired, subsequently compare the results with a smaller number
of neighbors using WAIC.
- search.type
a quoted keyword that specifies the type of nearest
neighbor search algorithm. Supported method key words are: "cb"
and
"brute"
. The "cb"
should generally be much
faster. If locations do not have identical coordinate values on
the axis used for the nearest neighbor ordering then "cb"
and "brute"
should produce identical neighbor sets.
However, if there are identical coordinate values on the axis used
for nearest neighbor ordering, then "cb"
and "brute"
might produce different, but equally valid, neighbor sets,
e.g., if data are on a grid.
- n.batch
the number of MCMC batches in each chain to run for the adaptive MCMC
sampler. See Roberts and Rosenthal (2009) for details.
- batch.length
the length of each MCMC batch in each chain to run for the adaptive
MCMC sampler. See Roberts and Rosenthal (2009) for details.
- accept.rate
target acceptance rate for adaptive MCMC. Default is
0.43. See Roberts and Rosenthal (2009) for details.
- family
the distribution to use for the latent abundance process. Currently
supports 'NB'
(negative binomial), 'Poisson'
, 'Gaussian'
,
and 'zi-Gaussian'
.
- n.omp.threads
a positive integer indicating
the number of threads to use for SMP parallel processing. The package must
be compiled for OpenMP support. For most Intel-based machines, we
recommend setting n.omp.threads
up to the number of
hyperthreaded cores. Note, n.omp.threads
> 1 might not
work on some systems.
- verbose
if TRUE
, messages about data preparation,
model specification, and progress of the sampler are printed to the screen.
Otherwise, no messages are printed.
- n.report
the interval to report Metropolis sampler acceptance and
MCMC progress.
- n.burn
the number of samples out of the total n.batch * batch.length
samples in each chain to discard as burn-in. By default, the first
10% of samples is discarded.
- n.thin
the thinning interval for collection of MCMC samples. The
thinning occurs after the n.burn
samples are discarded. Default
value is set to 1.
- n.chains
the number of MCMC chains to run in sequence.
- save.fitted
logical value indicating whether or not fitted values and likelihood values
should be saved in the resulting model object. If save.fitted = FALSE
, the components
y.rep.samples
, mu.samples
, and like.samples
will not be included
in the model object, and subsequent functions for calculating WAIC, fitted values, and
posterior predictive checks will not work, although they all can be calculated manually if
desired. Setting save.fitted = FALSE
can be useful when working with very large
data sets to minimize the amount of RAM needed when fitting and storing the model object in
memory.
- ...
currently no additional arguments