- y
A multivariate continuous response, specified as an N x T matrix, where N
denotes the number of functions and T
, the number of time points per function. Intermittent
missing-at-random values are allowed and will be estimated from the posterior
predictive distribution. Missing cells should be denoted with NA
. The sampling of
missing values requires co-sampling the functions, bb
, while these functions are
marginalized out for sampling under the case of no missing data. So the sampler will run
more slowly in the case of intermittent missingness.
- ipr
An optional input vector of inclusion probabilities for each observation unit in the case
the observed data were acquired through an informative sampling design, so that unbiased
inference about the population requires adjustments to the observed sample. Defaults to
ipr = rep(1,nrow(y))
indicating an iid sample.
- time_points
Inputs a vector of common time points at which the collections of functions were
observed (with the possibility of intermittent missingness). The length of time_points
should be equal to the number of columns in the data matrix, y
. Defaults to
time_points = 1:ncol(y)
.
- gp_cov
A vector of length L
to select the covariance function for each of
L
terms. Allowed inputs are c("rq","se","sn")
, where "rq" denotes the
rational quadratic covariance function, "se", the square exponential, and "sn", seasonality.
The seasonality covariance is a quasi-periodic multiplicative combination of a fixed length-
scale periodic covariance kernel (where the period length is specified in 'sn_order') and
a squared exponential kernel (of varying length-scale) to allow the periodicity in the seasonal
term to evolve. e.g. If 4 terms are wanted - 2 trend terms with "se" and 2 seasonality terms, the input would be
gp_cov = c("se","se","sn","sn")
. Defaults to gp_cov = "rq"
.
- sn_order
A vector of length L_s
, the number of terms in gp_cov
where "sn"
is selected, that denotes the seasonality order for each term; e.g. if the two "sn" terms above
are for 3 and 12 month seasonality, respectively, for monthly data, then
sn_order = c(3,12)
. Defaults to sn_order = NULL
.
- jitter
A scalar numerical value added to the diagonal elements of the T x T GP covariance
matrix to stabilize computation. Defaults to jitter = 0.01
.
- gp_shape
The shape parameter of the Gamma base distribution for the DP prior on
the P x N matrix of GP covariance parameters (where P
denotes the number of parameters for each of the N experimental units).
Defaults to gp_shape = 1
- gp_rate
The rate parameter of the Gamma base distribution on GP covariance parameters.
Defaults to gp_rate = 1
.
- noise_shape
The shape parameter of the Gamma base distribution on tau_e
, the
model noise precision parameter. Defaults to noise_shape = 3
.
- noise_rate
The rate parameter of the Gamma base distribution on tau_e
, the model
noise precision parameter. Defaults to noise_rate = 1
.
- dp_shape
The shape parameter for the Gamma prior on the DP concentration parameter,
conc
. Defaults to dp_shape = 1
.
- dp_rate
The rate parameter for the Gamma prior on the DP concentration parameter,
conc
. Defaults to dp_rate = 1
.
- M_init
Starting number of clusters of nrow(y)
units to initialize sampler.
Defaults to M_init = nrow(y)
.
- lower
The lower end of the range to be used in conditionally sampling the GP covariance
parameters (kappa,tau_e
) in the slice sampler. Defaults to lower = 0
.
- upper
The upper end of the range to be used in conditionally sampling the GP covariance
parameters (kappa,tau_e
) in the slice sampler. Defaults to upper = 1e10
.
- sub_size
Integer vector whose length, n
, equals the number of progressively coarser
GP covariance matrices to use for tempered sampling steps in an alternative space to sample the
GP covariance parameters. Each entry denotes the number of sub-sample time points in T
to draw under a latin hypercube design from the N x T data matrix, y
to employ
for that distribution; for example, suppose T = 300
. sub_size = c(100,50)
,
would randomly select first 100
and then 50
time points from y
to use
for each of the n = 2
distributions. Defaults to
sub_size = c(floor(0.25*T),floor(0.1*T))
.
- w_star
Integer value denoting the number of cluster locations to sample ahead of
observations in the auxiliary Gibbs sampler used to sample the number of clusters
and associated cluster assignments. A higher value reduces samplin auto-correlation, but
increases computational burden. Defaults to w_star = 2
.
- w
Numeric value denoting the step width used to construct the interval from
which to draw a sample for each GP covariance parameter in the slice sampler. This
value is adaptively updated in the sampler tuning stage for each parameter to be equal
to the difference in the 0.95 and 0.05 sample quantiles for each of 5 block updates.
Defaults to w = 1.5
.
- n.iter
Total number of MCMC iterations.
- n.burn
Number of MCMC iterations to discard.
gpdpgrow
will return (n.iter - n.burn)
posterior samples.
- n.thin
Gap between successive sampling iterations to save.
- n.tune
Number of iterations (before ergodic chain instantiated) to adapt w
, separately,
for each covariance term, p = 1,...,P
. Sets each w_p
to lie in the 90 percent credible
interval computed from the tuning sample (that is divided into 5 blocks so that w_p
is
successively updated in each block of runs).
- progress
A boolean value denoting whether to display a progress bar during model execution.
Defaults to progress = TRUE
- b_move
A boolean value denoting whether to sample the GP function, bb
, in T x 1
Gibbs steps b_move = TRUE
or through elliptical slice sampling.
Defaults to b_move = TRUE
. Only used in the case there is any intermittent missingness;
otherwise bb
is marginalized out of the sampler and post-sampled from the its predictive
distribution.
- cluster
A boolean value denoting whether to employ DP mix model over set of GP functions or
to just use GP model with no clustering of covariance function parameters.
Defaults to cluster = TRUE
- s
An N x 1 integer vector that inputs a fixed clustering, rather than sampling it.
Defaults to s = NULL