- formula
A formula object, corresponding to the spatial factor analysis model. The response must be on the left of a ~ operator, and the terms on the right
must indicate the covariates to be included in the fixed effects. If no covariates are desired a zero should be used, ~ 0.
- data
A required data.frame containing the variables in the model. The data frame must contain M x O x Nu rows.
Here, M represents the number of spatial locations, O the number of different observation types
and Nu the number of temporal visits. The observations must be first be
ordered spatially, second by observation type and then temporally. This means that the first M x O observations come from the first time point and
the first M observations come the first spatial observation type.
- dist
A M x M dimensional distance matrix. For a discrete spatial process the matrix contains binary adjacencies that dictate the
spatial neighborhood structure and for continuous spatial processes the matrix should be a continuous distance matrix (e.g., Euclidean).
- time
A Nu dimensional vector containing the observed time points
in increasing order.
- K
A scalar that indicates the dimension (i.e., quantity) of latent factors.
- L
The number of latent clusters. If finite, a scalar indicating the number of clusters for each column of the factor loadings matrix. By default L is set at Inf
so that the Probit stick-breaking process becomes an infinite mixture model.
- trials
A variable in data that contains the number of trials for each of the binomial observations. If there is no count data, trials should be left missing.
- family
Character string indicating the distribution of the observed data. Options
include: "normal", "probit", "tobit", and "binomial". family must have either O or
1 dimension(s) (the one populates the rest). Any combination of likelihoods can be used.
- temporal.structure
Character string indicating the temporal kernel. Options include:
"exponential" and "ar1".
- spatial.structure
Character string indicating the type of spatial process. Options include:
"continuous" (i.e., Gaussian process with exponential kernel) and "discrete" (i.e., proper CAR).
- starting
Either NULL or a list containing starting values
to be specified for the MCMC sampler. If NULL is not chosen then none, some or all
of the starting values may be specified.
When NULL is chosen then default starting values are automatically generated.
Otherwise a list must be provided with names Beta, Delta, Sigma2, Kappa, Rho, Upsilon or
Psi containing appropriate objects. Beta (or Delta) must either be a P (or K) dimensional
vector or a scalar (the scalar populates the entire vector). Sigma2 must be either a M x (O - C) matrix or a scalar.
Kappa must be a O x O dimensional matrix, Rho a scalar, Upsilon a K x K matrix, and Psi a scalar.
- hypers
Either NULL or a list containing hyperparameter values
to be specified for the MCMC sampler. If NULL is not chosen then none, some or all
of the hyperparameter values may be specified.
When NULL is chosen then default hyperparameter values are automatically
generated. These default hyperparameters are described in detail in (Berchuck et al.).
Otherwise a list must be provided with names Beta, Delta, Sigma2, Kappa, Rho, Upsilon or
Psi containing further hyperparameter information. These objects are themselves
lists and may be constructed as follows.
Beta is a list with two objects, MuBeta and SigmaBeta. These values represent the prior mean and variance
parameters for the multivariate normal prior.
Delta is a list with two objects, A1 and A2. These values represent the prior shape
parameters for the multiplicative Gamma shrinkage prior.
Sigma2 is a list with two objects, A and B. These values represent the shape and scale for the variance parameters.
Kappa is a list with two objects,
SmallUpsilon and BigTheta. SmallUpsilon represents the degrees of freedom parameter for the
inverse-Wishart hyperprior and must be a real number scalar, while BigTheta represents
the scale matrix and must be a O x O dimensional positive definite matrix.
Rho is a list with two objects, ARho and BRho. ARho
represents the lower bound for the uniform hyperprior, while BRho represents
the upper bound. The bounds must be specified carefully. This is only specified for continuous spatial processes.
Upsilon is a list with two objects,
Zeta and Omega. Zeta represents the degrees of freedom parameter for the
inverse-Wishart hyperprior and must be a real number scalar, while Omega represents
the scale matrix and must be a K x K dimensional positive definite matrix.
Psi is a list with two objects, dependent on if the temporal kernel is exponential or ar1.
For exponential, the two objects are APsi and BPsi. APsi
represents the lower bound for the uniform hyperprior, while BPsi represents
the upper bound. The bounds must be specified carefully. For ar1, the two objects are Beta and Gamma, which are the
two shape parameters of a Beta distribution shifted to have domain in (-1, 1).
- tuning
Either NULL or a list containing tuning values
to be specified for the MCMC Metropolis steps. If NULL is not chosen then all
of the tuning values must be specified.
When NULL is chosen then default tuning values are automatically generated to
1. Otherwise a list must be provided with names Psi,
or Rho. Each of these entries must be scalars containing tuning variances for their corresponding Metropolis updates.
Rho is only specified for continuous spatial processes.
- mcmc
Either NULL or a list containing input values to be used
for implementing the MCMC sampler. If NULL is not chosen then all
of the MCMC input values must be specified.
NBurn: The number of sampler scans included in the burn-in phase. (default =
10,000)
NSims: The number of post-burn-in scans for which to perform the
sampler. (default = 10,000)
NThin: Value such that during the post-burn-in phase, only every
NThin-th scan is recorded for use in posterior inference (For return values
we define, NKeep = NSims / NThin (default = 1).
NPilot: The number of times during the burn-in phase that pilot adaptation
is performed (default = 20)
- seed
An integer value used to set the seed for the random number generator
(default = 54).
- gamma.shrinkage
A logical indicating whether a gamma shrinkage process prior is used for the variances of the factor loadings columns. If FALSE,
the hyperparameters (A1 and A2) indicate the shape and rate for a gamma prior on the precisions. Default is TRUE.
- include.space
A logical indicating whether a spatial process should be included. Default is TRUE, however if FALSE the spatial correlation matrix
is fixed as an identity matrix. This specification overrides the spatial.structure input.
- clustering
A logical indicating whether the Bayesian non-parametric process should be used, default is TRUE. If FALSE is specified
each column is instead modeled with an independent spatial process.