- nbAnimals
Number of observed individuals to simulate.
- nbStates
Number of behavioural states to simulate.
- dist
A named list indicating the probability distributions of the data streams. Currently
supported distributions are 'bern', 'beta', 'cat', 'exp', 'gamma', 'lnorm', 'logis', 'negbinom', 'norm', 'mvnorm2' (bivariate normal distribution), 'mvnorm3' (trivariate normal distribution),
'pois', 'rw_norm' (normal random walk), 'rw_mvnorm2' (bivariate normal random walk), 'rw_mvnorm3' (trivariate normal random walk), 'vm', 'vmConsensus', 'weibull', and 'wrpcauchy'. For example,
dist=list(step='gamma', angle='vm', dives='pois')
indicates 3 data streams ('step', 'angle', and 'dives')
and their respective probability distributions ('gamma', 'vm', and 'pois').
- Par
A named list containing vectors of initial state-dependent probability distribution parameters for
each data stream specified in dist
. The parameters should be in the order expected by the pdfs of dist
,
and any zero-mass and/or one-mass parameters should come last (if both are present, then zero-mass parameters must precede one-mass parameters).
If DM
is not specified for a given data stream, then Par
is on the natural (i.e., real) scale of the parameters. However, if DM
is specified for a given data stream, then
Par
must be on the working (i.e., beta) scale of the parameters, and the length of Par
must match the number
of columns in the design matrix. See details below.
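
As a minimal sketch of how dist and Par fit together, the following assumes a hypothetical 2-state model with the three data streams from the dist example above and zero-inflated step lengths. The numeric values are illustrative only, and the parameter labels in the comments (e.g., 'zeromass_1', 'lambda_1') are assumptions used for readability rather than required names.

```r
nbStates <- 2

# Data streams and their distributions (from the example under 'dist')
dist <- list(step = "gamma", angle = "vm", dives = "pois")

# Natural-scale parameters (no DM specified), one value per state for each
# parameter; any zero-mass values come last within a data stream.
Par <- list(
  step  = c(1, 10,       # mean_1, mean_2
            1, 5,        # sd_1, sd_2
            0.1, 0.01),  # zeromass_1, zeromass_2 (assumed labels)
  angle = c(0, 0,        # mean_1, mean_2
            0.5, 2),     # concentration_1, concentration_2
  dives = c(2, 20)       # lambda_1, lambda_2 (assumed labels)
)

# Zero-mass values in Par$step are only meaningful with zero inflation on 'step'
zeroInflation <- list(step = TRUE)

# Every data stream in 'dist' needs a matching element in 'Par'
stopifnot(identical(names(dist), names(Par)))
```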
- beta
Matrix of regression parameters for the transition probabilities (more information
in "Details").
- delta
Initial value for the initial distribution of the HMM. Default: rep(1/nbStates,nbStates)
. If formulaDelta
includes a formula, then delta
must be specified
as a k x (nbStates
-1) matrix, where k is the number of covariates and the columns correspond to states 2:nbStates
. See details below.
- formula
Regression formula for the transition probability covariates. Default: ~1
(no covariate effect). In addition to allowing standard functions in R formulas
(e.g., cos(cov)
, cov1*cov2
, I(cov^2)
), special functions include cosinor(cov,period)
for modeling cyclical patterns, spline functions
(bs
, ns
, bSpline
, cSpline
, iSpline
, and mSpline
),
and state- or parameter-specific formulas (see details).
Any formula terms that are not state- or parameter-specific are included on all of the transition probabilities.
- formulaDelta
Regression formula for the initial distribution. Default: NULL
(no covariate effects and delta
is specified on the real scale). Standard functions in R formulas are allowed (e.g., cos(cov)
, cov1*cov2
, I(cov^2)
). When any formula is provided, then delta
must be specified on the working scale.
- mixtures
Number of mixtures for the state transition probabilities (i.e. discrete random effects *sensu* DeRuiter et al. 2017). Default: mixtures=1
.
- formulaPi
Regression formula for the mixture distribution probabilities. Default: NULL
(no covariate effects; both beta$pi
and fixPar$pi
are specified on the real scale). Standard functions in R formulas are allowed (e.g., cos(cov)
, cov1*cov2
, I(cov^2)
). When any formula is provided, then both beta$pi
and fixPar$pi
are specified on the working scale.
Note that only the covariate values corresponding to the first time step for each individual ID are used (i.e., time-varying covariates cannot be used for the mixture probabilities).
- covs
Covariate values to include in the simulated data, as a dataframe. The names of any covariates specified by covs
can
be included in formula
and/or DM
. Covariates can also be simulated according to a standard normal distribution, by setting
covs
to NULL
(the default), and specifying nbCovs>0
.
- nbCovs
Number of covariates to simulate (0 by default). Does not need to be specified if
covs
is specified. Simulated covariates are provided generic names (e.g., 'cov1' and 'cov2' for nbCovs=2
) and can be included in formula
and/or DM
.
- spatialCovs
List of raster
objects for spatio-temporally referenced covariates. Covariates specified by spatialCovs
are extracted from the raster
layer(s) based on any simulated location data (and the z values for a raster stack
or brick
) for each time step. If an element of spatialCovs
is a raster stack
or brick
,
then z values must be set using raster::setZ
and covs
must include column(s) of the corresponding z value(s) for each observation (e.g., 'time').
The names of the raster layer(s) can be included in
formula
and/or DM
. Note that simData
usually takes longer to generate simulated data when spatialCovs
is specified.
- zeroInflation
A named list of logicals indicating whether the probability distributions of the data streams should be zero-inflated. If zeroInflation
is TRUE
for a given data stream, then values for the zero-mass parameters should be
included in the corresponding element of Par
.
- oneInflation
A named list of logicals indicating whether the probability distributions of the data streams should be one-inflated. If oneInflation
is TRUE
for a given data stream, then values for the one-mass parameters should be
included in the corresponding element of Par
.
- circularAngleMean
An optional named list indicating whether to use circular-linear (FALSE) or circular-circular (TRUE)
regression on the mean of circular distributions ('vm' and 'wrpcauchy') for turning angles. For example,
circularAngleMean=list(angle=TRUE)
indicates that the angle mean is to be estimated for 'angle' using circular-circular
regression. Whenever circular-circular regression is used for an angular data stream, a corresponding design matrix (DM
)
must be specified for the data stream, and the previous movement direction (i.e., a turning angle of zero) is automatically used
as the reference angle (i.e., the intercept). Default is NULL
, which assumes circular-linear regression is
used for any angular distributions. Any circularAngleMean
elements
corresponding to data streams that do not have angular distributions are ignored.
circularAngleMean
is also ignored for any 'vmConsensus' data streams (because the consensus model is a circular-circular regression model).
Alternatively, circularAngleMean
can be specified as a numeric scalar, where the value specifies the coefficient for the reference angle (i.e., directional persistence) term in the circular-circular regression model. For example, setting circularAngleMean
to 0
specifies a
circular-circular regression model with no directional persistence term (thus specifying a biased random walk instead of a biased correlated random walk). Setting circularAngleMean
to 1 is equivalent to setting it to TRUE, i.e., a circular-circular regression model with a coefficient of 1 for the directional persistence reference angle.
- centers
2-column matrix providing the x-coordinates (column 1) and y-coordinates (column 2) for any activity centers (e.g., potential
centers of attraction or repulsion) from which distance and angle covariates will be calculated based on the simulated location data. These distance and angle
covariates can be included in formula
and DM
using the row names of centers
. If no row names are provided, then generic names are generated
for the distance and angle covariates (e.g., 'center1.dist', 'center1.angle', 'center2.dist', 'center2.angle'); otherwise the covariate names are derived from the row names
of centers
as paste0(rep(rownames(centers),each=2),c(".dist",".angle"))
. Note that the angle covariates for each activity center are calculated relative to
the previous movement direction instead of standard directions relative to the x-axis; this is to allow turning angles to be simulated as a function of these covariates using circular-circular regression.
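
The naming rule for the centers covariates can be checked directly in base R. This is a small sketch with two hypothetical activity centers ('haulout' and 'forage'); the coordinates and the formula are illustrative assumptions.

```r
# Two hypothetical activity centers: column 1 = x, column 2 = y
centers <- matrix(c( 0, 0,
                    10, 5),
                  ncol = 2, byrow = TRUE,
                  dimnames = list(c("haulout", "forage"), NULL))

# Covariate names derived from the row names, as described above
centerCovNames <- paste0(rep(rownames(centers), each = 2), c(".dist", ".angle"))
print(centerCovNames)

# These names can then appear in formula and/or DM, e.g. (illustrative only):
formula <- ~ haulout.dist
```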
- centroids
List where each element is a data frame consisting of at least max(unlist(obsPerAnimal))
rows that provides the x-coordinates ('x') and y-coordinates ('y') for centroids (i.e., dynamic activity centers where the coordinates can change for each time step)
from which distance and angle covariates will be calculated based on the simulated location data. These distance and angle
covariates can be included in formula
and DM
using the names of centroids
. If no list names are provided, then generic names are generated
for the distance and angle covariates (e.g., 'centroid1.dist', 'centroid1.angle', 'centroid2.dist', 'centroid2.angle'); otherwise the covariate names are derived from the list names
of centroids
as paste0(rep(names(centroids),each=2),c(".dist",".angle"))
. Note that the angle covariates for each centroid are calculated relative to
the previous movement direction instead of standard directions relative to the x-axis; this is to allow turning angles to be simulated as a function of these covariates using circular-circular regression.
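
A centroids list can be assembled as follows. This is a sketch under assumptions: the centroid name 'herd' and its random-walk path are hypothetical, and the number of rows is taken from the default obsPerAnimal bounds.

```r
set.seed(1)

obsPerAnimal <- c(500, 1500)
nbRows <- max(unlist(obsPerAnimal))  # minimum number of rows required

# One hypothetical moving centroid whose coordinates change at each time step
centroids <- list(
  herd = data.frame(x = cumsum(rnorm(nbRows)),
                    y = cumsum(rnorm(nbRows)))
)

# Covariate names derived from the list names, as described above
centroidCovNames <- paste0(rep(names(centroids), each = 2), c(".dist", ".angle"))
print(centroidCovNames)
```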
- angleCovs
Character vector indicating the names of any circular-circular regression angular covariates in covs
or spatialCovs
that need conversion from standard direction (in radians relative to the x-axis) to turning angle (relative to previous movement direction)
using circAngles
.
- obsPerAnimal
Either the number of observations per animal (if single value) or the bounds of the number of observations per animal (if vector of two values). In the latter case,
the number of observations generated for each animal is drawn uniformly from this interval. Alternatively, obsPerAnimal
can be specified as
a list of length nbAnimals
with each element providing the number of observations (if single value) or the bounds (if vector of two values) for each individual.
Default: c(500,1500)
.
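
The three accepted forms of obsPerAnimal can be illustrated with a small base R sketch. The helper drawNbObs below is hypothetical and only mimics the behaviour described above; it is not the code simData uses internally.

```r
set.seed(1)
nbAnimals <- 3

# Hypothetical helper: a single value is used as-is, a pair of bounds is
# sampled uniformly over the integers in [min, max]
drawNbObs <- function(bounds) {
  if (length(bounds) == 1) return(bounds)
  lo <- min(bounds)
  hi <- max(bounds)
  lo + sample.int(hi - lo + 1, 1) - 1
}

# Form 1 and 2: the same value or bounds for every animal
obsPerAnimal <- c(500, 1500)
nbObs <- vapply(seq_len(nbAnimals), function(i) drawNbObs(obsPerAnimal), numeric(1))

# Form 3: a list of length nbAnimals with animal-specific values or bounds
obsPerAnimalList <- list(200, c(100, 300), c(500, 1500))
nbObsList <- vapply(obsPerAnimalList, drawNbObs, numeric(1))
```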
- initialPosition
2-vector providing the x- and y-coordinates of the initial position for all animals. Alternatively, initialPosition
can be specified as
a list of length nbAnimals
with each element a 2-vector providing the x- and y-coordinates of the initial position for each individual.
Default: c(0,0)
. If mvnCoords
corresponds to a data stream with 'mvnorm3' or 'rw_mvnorm3' probability distributions, then initialPosition
must be composed of 3-vector(s) for the x-, y-, and z-coordinates.
- DM
An optional named list indicating the design matrices to be used for the probability distribution parameters of each data
stream. Each element of DM
can either be a named list of regression formulas or a "pseudo" design matrix. For example, for a 2-state
model using the gamma distribution for a data stream named 'step', DM=list(step=list(mean=~cov1, sd=~1))
specifies the mean
parameters as a function of the covariate 'cov1' for each state. This model could equivalently be specified as a 4x6 "pseudo" design matrix using
character strings for the covariate:
DM=list(step=matrix(c(1,0,0,0,'cov1',0,0,0,0,1,0,0,0,'cov1',0,0,0,0,1,0,0,0,0,1),4,6))
where the 4 rows correspond to the state-dependent parameters (mean_1,mean_2,sd_1,sd_2) and the 6 columns correspond to the regression
coefficients.
Design matrices specified using formulas allow standard functions in R formulas
(e.g., cos(cov)
, cov1*cov2
, I(cov^2)
). Special formula functions include cosinor(cov,period)
for modeling cyclical patterns, spline functions
(bs
, ns
, bSpline
, cSpline
, iSpline
, and mSpline
),
angleFormula(cov,strength,by)
for the angle mean of circular-circular regression models, and state-specific formulas (see details). Any formula terms that are not state-specific are included on the parameters for all nbStates
states.
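
To make the 4x6 pseudo design matrix concrete, the sketch below rebuilds it, evaluates it at a single covariate value, and maps a working-scale Par vector to the natural scale. Several parts are assumptions for illustration: the row and column labels, the helper evalDM, the working-scale values, and the use of a log link for the gamma mean and sd parameters.

```r
# Pseudo design matrix from the text (filled column by column).
# Row/column labels are added for readability only.
DM <- list(step = matrix(
  c(1, 0, 0, 0,
    "cov1", 0, 0, 0,
    0, 1, 0, 0,
    0, "cov1", 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1),
  nrow = 4, ncol = 6,
  dimnames = list(c("mean_1", "mean_2", "sd_1", "sd_2"),
                  c("mean_1:(Intercept)", "mean_1:cov1",
                    "mean_2:(Intercept)", "mean_2:cov1",
                    "sd_1:(Intercept)", "sd_2:(Intercept)"))
))

# Hypothetical helper: replace covariate names by their values at one time step
evalDM <- function(dm, covs) {
  out <- matrix(0, nrow(dm), ncol(dm), dimnames = dimnames(dm))
  for (i in seq_along(dm)) {
    entry <- dm[i]
    out[i] <- if (entry %in% names(covs)) covs[[entry]] else as.numeric(entry)
  }
  out
}

# Working-scale Par: one value per column of the design matrix (illustrative)
workPar <- c(0, 0.2, log(10), -0.2, 0, log(5))
stopifnot(length(workPar) == ncol(DM$step))

# Natural-scale parameters at cov1 = 0.5, assuming a log link
X <- evalDM(DM$step, c(cov1 = 0.5))
natural <- exp(drop(X %*% workPar))
print(round(natural, 3))
```

With this layout, the second and fourth coefficients control how the state 1 and state 2 means change with 'cov1', while the sd parameters have intercepts only.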
- userBounds
An optional named list of 2-column matrices specifying bounds on the natural (i.e., real) scale of the probability
distribution parameters for each data stream. For example, for a 2-state model using the wrapped Cauchy ('wrpcauchy') distribution for
a data stream named 'angle' with estAngleMean$angle=TRUE
, userBounds=list(angle=matrix(c(-pi,-pi,-1,-1,pi,pi,1,1),4,2,dimnames=list(c("mean_1",
"mean_2","concentration_1","concentration_2"))))
specifies (-1,1) bounds for the concentration parameters instead of the default [0,1) bounds.
- workBounds
An optional named list of 2-column matrices specifying bounds on the working scale of the probability distribution, transition probability, and initial distribution parameters. For each matrix, the first column pertains to the lower bound and the second column the upper bound.
For data streams, each element of workBounds
should be a k x 2 matrix with the same name as the corresponding element of
Par
, where k is the number of parameters. For transition probability parameters, the corresponding element of workBounds
must be a k x 2 matrix named "beta", where k=length(beta)
. For initial distribution parameters, the corresponding element of workBounds
must be a k x 2 matrix named "delta", where k=length(delta)
.
workBounds
is ignored for any given data stream unless DM
is also specified.
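
A workBounds list might be assembled as in the sketch below. It assumes a hypothetical 'step' data stream whose design matrix has 6 columns (so Par$step has 6 working-scale values, with covariate coefficients in positions 2 and 4) and a 2-state model with an intercept-only formula, for which beta is assumed here to be a 1 x 2 matrix.

```r
# Working-scale bounds for a hypothetical 'step' stream with 6 working parameters:
# column 1 = lower bound, column 2 = upper bound
nbWorkPar <- 6
stepBounds <- matrix(c(rep(-Inf, nbWorkPar), rep(Inf, nbWorkPar)),
                     nrow = nbWorkPar, ncol = 2)

# Constrain the covariate coefficients (assumed positions 2 and 4) to be >= 0
stepBounds[c(2, 4), 1] <- 0

# Unconstrained bounds for the transition probability parameters, k = length(beta)
beta <- matrix(c(-2, -2), nrow = 1, ncol = 2)  # illustrative values
betaBounds <- matrix(c(-Inf, Inf), nrow = length(beta), ncol = 2, byrow = TRUE)

workBounds <- list(step = stepBounds, beta = betaBounds)
```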
- betaRef
Numeric vector of length nbStates
indicating the reference elements for the t.p.m. multinomial logit link. Default: NULL, in which case
the diagonal elements of the t.p.m. are the reference. See fitHMM
.
- mvnCoords
Character string indicating the name of location data that are to be simulated using a multivariate normal distribution. For example, if mu="rw_mvnorm2"
was included in dist
and (mu.x, mu.y) are intended to be location data, then mvnCoords="mu"
needs to be specified in order for these data to be treated as such.
- stateNames
Optional character vector of length nbStates indicating state names.
- model
A momentuHMM
, momentuHierHMM
, miHMM
, or miSum
object. This option can be used to simulate from a fitted model. Default: NULL.
Note that, if this argument is specified, most other arguments will be ignored -- except for nbAnimals
,
obsPerAnimal
, states
, initialPosition
, lambda
, errorEllipse
, and, if covariate values different from those in the data should be specified,
covs
, spatialCovs
, centers
, and centroids
. It is not appropriate to simulate movement data from a model
that was fitted to latitude/longitude data (because simData
assumes Cartesian coordinates).
- states
TRUE
if the simulated states should be returned, FALSE
otherwise (default).
- retrySims
Number of times to attempt to simulate data within the spatial extent of spatialCovs
. If retrySims=0
(the default), an
error is returned if the simulated track(s) move beyond the extent(s) of the raster layer(s). Instead of relying on retrySims
, in many cases
it might be better to simply expand the extent of the raster layer(s) and/or adjust the step length and turning angle probability distributions.
Ignored if spatialCovs=NULL
.
- lambda
Observation rate for location data. If NULL
(the default), location data are obtained at regular intervals. Otherwise
lambda
is the rate parameter of the exponential distribution for the waiting times between successive location observations, i.e.,
1/lambda
is the expected time between successive location observations. Only the 'step' and 'angle' data streams are subject to temporal irregularity;
any other data streams are observed at temporally-regular intervals. Ignored unless a valid distribution for the 'step' data stream is specified.
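
The role of lambda as an exponential rate can be illustrated in base R. This sketch only demonstrates the waiting-time distribution described above; the value of lambda and the number of draws are arbitrary, and the code is not what simData runs internally.

```r
set.seed(42)

lambda <- 2                              # observation rate (illustrative)
waits <- rexp(1000, rate = lambda)       # waiting times between observations
obsTimes <- cumsum(waits)                # irregular observation times

# The average waiting time should be close to 1/lambda
print(c(expected = 1 / lambda, simulated = mean(waits)))
```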
- errorEllipse
List providing the upper bound for the semi-major axis (M
; on scale of x- and y-coordinates), semi-minor axis (m
;
on scale of x- and y-coordinates), and orientation (r
; in degrees) of location error ellipses. If NULL
(the default), no location
measurement error is simulated. If errorEllipse
is specified, then each observed location is subject to bivariate normal errors as described
in McClintock et al. (2015), where the components of the error ellipse for each location are randomly drawn from runif(1,min(errorEllipse$M),max(errorEllipse$M))
,
runif(1,min(errorEllipse$m),max(errorEllipse$m))
, and runif(1,min(errorEllipse$r),max(errorEllipse$r))
. If only a single value is provided for any of the
error ellipse elements, then the corresponding component is fixed to this value for each location. Only the 'step' and 'angle' data streams are subject to location measurement error;
any other data streams are observed without error. Ignored unless a valid distribution for the 'step' data stream is specified.
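
The way error ellipse components are drawn can be sketched with the runif calls given above. The helper drawEllipse and the numeric bounds are hypothetical; the sketch covers only the draw of the ellipse components, not the bivariate normal location errors themselves.

```r
set.seed(7)

# Hypothetical helper applying the runif() draws described above
drawEllipse <- function(errorEllipse) {
  c(M = runif(1, min(errorEllipse$M), max(errorEllipse$M)),
    m = runif(1, min(errorEllipse$m), max(errorEllipse$m)),
    r = runif(1, min(errorEllipse$r), max(errorEllipse$r)))
}

# Ranges: each component is drawn uniformly between its min and max
ranged <- drawEllipse(list(M = c(0, 50), m = c(0, 25), r = c(0, 180)))

# Single values: each component is fixed to the supplied value
fixed <- drawEllipse(list(M = 50, m = 25, r = 180))

print(rbind(ranged, fixed))
```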
- ncores
Number of cores to use for parallel processing. Default: 1 (no parallel processing).
- hierStates
A hierarchical model structure Node
for the states ('state'). See details.
- hierDist
A hierarchical data structure Node
for the data streams ('dist'). Currently
supported distributions are 'bern', 'beta', 'exp', 'gamma', 'lnorm', 'norm', 'mvnorm2' (bivariate normal distribution), 'mvnorm3' (trivariate normal distribution),
'pois', 'rw_norm' (normal random walk), 'rw_mvnorm2' (bivariate normal random walk), 'rw_mvnorm3' (trivariate normal random walk), 'vm', 'vmConsensus', 'weibull', and 'wrpcauchy'. See details.
- hierBeta
A hierarchical data structure Node
for the matrix of initial values for the regression coefficients of the transition probabilities at each level of the hierarchy ('beta'). See fitHMM
.
- hierDelta
A hierarchical data structure Node
for the matrix of initial values for the regression coefficients of the initial distribution at each level of the hierarchy ('delta'). See fitHMM
.
- hierFormula
A hierarchical formula structure for the transition probability covariates for each level of the hierarchy ('formula'). Default: NULL
(only hierarchical-level effects, with no covariate effects).
Any formula terms that are not state- or parameter-specific are included on all of the transition probabilities within a given level of the hierarchy. See details.
- hierFormulaDelta
A hierarchical formula structure for the initial distribution covariates for each level of the hierarchy ('formulaDelta'). Default: NULL
(no covariate effects and fixPar$delta
is specified on the working scale).
- nbHierCovs
A hierarchical data structure Node
for the number of covariates ('nbCovs') to simulate for each level of the hierarchy (0 by default). Does not need to be specified if
covs
is specified. Simulated covariates are provided generic names (e.g., 'cov1.1' and 'cov1.2' for nbHierCovs$level1$nbCovs=2
) and can be included in hierFormula
and/or DM
.
- obsPerLevel
A hierarchical data structure Node
indicating the number of observations for each level of the hierarchy ('obs'). For each level, the 'obs' field can either be the number of observations per animal (if single value) or the bounds of the number of observations per animal (if vector of two values). In the latter case,
the number of observations generated per level for each animal is drawn uniformly from this interval. Alternatively, obsPerLevel
can be specified as
a list of length nbAnimals
with each element providing the hierarchical data structure for the number of observations for each level of the hierarchy for each animal, where the 'obs' field can either be the number of observations (if single value) or the bounds of the number of observations (if vector of two values) for each individual.