simData: Simulation tool

Description

Simulates data from a (multivariate) hidden Markov model. Movement data are assumed to be in Cartesian coordinates (not longitude/latitude) and can be generated with or without observation error attributable to temporal irregularity or location measurement error.

Usage

simData(
  nbAnimals = 1,
  nbStates = 2,
  dist,
  Par,
  beta = NULL,
  delta = NULL,
  formula = ~1,
  formulaDelta = NULL,
  mixtures = 1,
  formulaPi = NULL,
  covs = NULL,
  nbCovs = 0,
  spatialCovs = NULL,
  zeroInflation = NULL,
  oneInflation = NULL,
  circularAngleMean = NULL,
  centers = NULL,
  centroids = NULL,
  angleCovs = NULL,
  obsPerAnimal = c(500, 1500),
  initialPosition = c(0, 0),
  DM = NULL,
  userBounds = NULL,
  workBounds = NULL,
  betaRef = NULL,
  mvnCoords = NULL,
  stateNames = NULL,
  model = NULL,
  states = FALSE,
  retrySims = 0,
  lambda = NULL,
  errorEllipse = NULL,
  ncores = 1
)
simHierData(
  nbAnimals = 1,
  hierStates,
  hierDist,
  Par,
  hierBeta = NULL,
  hierDelta = NULL,
  hierFormula = NULL,
  hierFormulaDelta = NULL,
  mixtures = 1,
  formulaPi = NULL,
  covs = NULL,
  nbHierCovs = NULL,
  spatialCovs = NULL,
  zeroInflation = NULL,
  oneInflation = NULL,
  circularAngleMean = NULL,
  centers = NULL,
  centroids = NULL,
  angleCovs = NULL,
  obsPerLevel,
  initialPosition = c(0, 0),
  DM = NULL,
  userBounds = NULL,
  workBounds = NULL,
  mvnCoords = NULL,
  model = NULL,
  states = FALSE,
  retrySims = 0,
  lambda = NULL,
  errorEllipse = NULL,
  ncores = 1
)

Value

If the simulated data are temporally regular (i.e., lambda=NULL) with no measurement error (i.e., errorEllipse=NULL), an object momentuHMMData (or momentuHierHMMData), i.e., a dataframe of:

ID: The ID(s) of the observed animal(s)
...: Data streams as specified by dist (or hierDist)
x: Either easting or longitude (if data streams include valid non-negative distribution for 'step')
y: Either northing or latitude (if data streams include valid non-negative distribution for 'step')
...: Covariates (if any)

If simulated location data are temporally irregular (i.e., lambda>0) and/or include measurement error (i.e., errorEllipse!=NULL), a dataframe of:

time: Numeric time of each observed (and missing) observation
ID: The ID(s) of the observed animal(s)
x: Either easting or longitude observed location
y: Either northing or latitude observed location
...: Data streams that are not derived from location (if applicable)
...: Covariates at temporally-regular true (mux,muy) locations (if any)
mux: Either easting or longitude true location
muy: Either norting or latitude true location
error_semimajor_axis: error ellipse semi-major axis (if applicable)
error_semiminor_axis: error ellipse semi-minor axis (if applicable)
error_ellipse_orientation: error ellipse orientation (if applicable)
ln.sd.x: log of the square root of the x-variance of bivariate normal error (if applicable; required for error ellipse models in crawlWrap)
ln.sd.y: log of the square root of the y-variance of bivariate normal error (if applicable; required for error ellipse models in crawlWrap)
error.corr: correlation term of bivariate normal error (if applicable; required for error ellipse models in crawlWrap)

Arguments

nbAnimals

Number of observed individuals to simulate.

nbStates

Number of behavioural states to simulate.

dist

A named list indicating the probability distributions of the data streams. Currently supported distributions are 'bern', 'beta', 'cat', 'exp', 'gamma', 'lnorm', 'logis', 'negbinom', 'norm', 'mvnorm2' (bivariate normal distribution), 'mvnorm3' (trivariate normal distribution), 'pois', 'rw_norm' (normal random walk), 'rw_mvnorm2' (bivariate normal random walk), 'rw_mvnorm3' (trivariate normal random walk), 'vm', 'vmConsensus', 'weibull', and 'wrpcauchy'. For example, dist=list(step='gamma', angle='vm', dives='pois') indicates 3 data streams ('step', 'angle', and 'dives') and their respective probability distributions ('gamma', 'vm', and 'pois').

Par

A named list containing vectors of initial state-dependent probability distribution parameters for each data stream specified in dist. The parameters should be in the order expected by the pdfs of dist, and any zero-mass and/or one-mass parameters should be the last (if both are present, then zero-mass parameters must precede one-mass parameters).

If DM is not specified for a given data stream, then Par is on the natural (i.e., real) scale of the parameters. However, if DM is specified for a given data stream, then Par must be on the working (i.e., beta) scale of the parameters, and the length of Par must match the number of columns in the design matrix. See details below.

beta

Matrix of regression parameters for the transition probabilities (more information in "Details").
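
As a rough illustration of how an intercept-only beta relates to a transition probability matrix, the following base-R sketch assumes a multinomial logit link with the diagonal elements as the reference and off-diagonal elements filled row-wise (as described in "Details"). The helper beta_to_tpm is hypothetical and is not part of momentuHMM.

```r
# Hypothetical helper (not a momentuHMM function): convert an intercept-only
# beta row into a transition probability matrix, assuming a multinomial logit
# link with the diagonal elements as the reference category.
beta_to_tpm <- function(beta_row, nbStates) {
  stopifnot(length(beta_row) == nbStates * (nbStates - 1))
  eta <- matrix(0, nbStates, nbStates)   # diagonal stays at 0 (reference)
  off_diag <- !diag(nbStates)            # TRUE for off-diagonal cells
  # Fill off-diagonal cells row-wise: 1->2, 1->3, 2->1, 2->3, ...
  eta_t <- t(eta)
  eta_t[t(off_diag)] <- beta_row
  eta <- t(eta_t)
  exp(eta) / rowSums(exp(eta))
}

# Two states, one intercept per off-diagonal element (1->2 and 2->1)
tpm <- beta_to_tpm(c(-1.5, -1.5), nbStates = 2)
rowSums(tpm)  # each row of a transition probability matrix sums to 1
```

More negative intercepts give smaller off-diagonal probabilities, i.e., more persistent states.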

delta

Initial value for the initial distribution of the HMM. Default: rep(1/nbStates,nbStates). If formulaDelta includes a formula, then delta must be specified as a k x (nbStates-1) matrix, where k is the number of covariates and the columns correspond to states 2:nbStates. See details below.

formula

Regression formula for the transition probability covariates. Default: ~1 (no covariate effect). In addition to allowing standard functions in R formulas (e.g., cos(cov), cov1*cov2, I(cov^2)), special functions include cosinor(cov,period) for modeling cyclical patterns, spline functions (bs, ns, bSpline, cSpline, iSpline, and mSpline), and state- or parameter-specific formulas (see details). Any formula terms that are not state- or parameter-specific are included on all of the transition probabilities.

formulaDelta

Regression formula for the initial distribution. Default: NULL (no covariate effects and delta is specified on the real scale). Standard functions in R formulas are allowed (e.g., cos(cov), cov1*cov2, I(cov^2)). When any formula is provided, then delta must be specified on the working scale.

mixtures

Number of mixtures for the state transition probabilities (i.e. discrete random effects *sensu* DeRuiter et al. 2017). Default: mixtures=1.

formulaPi

Regression formula for the mixture distribution probabilities. Default: NULL (no covariate effects; both beta$pi and fixPar$pi are specified on the real scale). Standard functions in R formulas are allowed (e.g., cos(cov), cov1*cov2, I(cov^2)). When any formula is provided, then both beta$pi and fixPar$pi are specified on the working scale. Note that only the covariate values corresponding to the first time step for each individual ID are used (i.e. time-varying covariates cannot be used for the mixture probabilities).

covs

Covariate values to include in the simulated data, as a dataframe. The names of any covariates specified by covs can be included in formula and/or DM. Covariates can also be simulated according to a standard normal distribution, by setting covs to NULL (the default), and specifying nbCovs>0.
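
"Details" notes that a covariate series is shortened or started over from its first values when its length does not match the number of simulated observations. A minimal base-R sketch of that behaviour, using a hypothetical helper fit_covs that is not part of momentuHMM:

```r
# Hypothetical helper (not part of momentuHMM): shorten or recycle the rows
# of a covariate data frame so that it has exactly nObs rows.
fit_covs <- function(covs, nObs) {
  idx <- rep_len(seq_len(nrow(covs)), nObs)  # truncates, or starts over from row 1
  covs[idx, , drop = FALSE]
}

covs <- data.frame(hour24 = 0:23)
extended  <- fit_covs(covs, 50)  # too short: values start over after 23
shortened <- fit_covs(covs, 10)  # too long: last values are removed
```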

nbCovs

Number of covariates to simulate (0 by default). Does not need to be specified if covs is specified. Simulated covariates are provided generic names (e.g., 'cov1' and 'cov2' for nbCovs=2) and can be included in formula and/or DM.

spatialCovs

List of raster objects for spatio-temporally referenced covariates. Covariates specified by spatialCovs are extracted from the raster layer(s) based on any simulated location data (and the z values for a raster stack or brick) for each time step. If an element of spatialCovs is a raster stack or brick, then z values must be set using raster::setZ and covs must include column(s) of the corresponding z value(s) for each observation (e.g., 'time'). The names of the raster layer(s) can be included in formula and/or DM. Note that simData usually takes longer to generate simulated data when spatialCovs is specified.

zeroInflation

A named list of logicals indicating whether the probability distributions of the data streams should be zero-inflated. If zeroInflation is TRUE for a given data stream, then values for the zero-mass parameters should be included in the corresponding element of Par.
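
To make the role of the zero-mass parameters concrete, here is a base-R sketch of drawing zero-inflated gamma values for a single state. It assumes the gamma distribution is parameterized by its mean and standard deviation (as in the 'step' examples on this page) and that the zero-mass parameter is the probability of an exact zero; the helper name is hypothetical.

```r
# Sketch only (hypothetical helper): draw n zero-inflated gamma values for one state.
# mean and sd are converted to the shape/rate parameterization used by rgamma().
r_zero_infl_gamma <- function(n, mean, sd, zeromass) {
  shape <- mean^2 / sd^2
  rate  <- mean / sd^2
  is_zero <- runif(n) < zeromass            # TRUE with probability zeromass
  ifelse(is_zero, 0, rgamma(n, shape = shape, rate = rate))
}

set.seed(1)
steps <- r_zero_infl_gamma(10000, mean = 10, sd = 5, zeromass = 0.3)
mean(steps == 0)  # proportion of zeros, roughly the zero-mass probability
```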

oneInflation

A named list of logicals indicating whether the probability distributions of the data streams should be one-inflated. If oneInflation is TRUE for a given data stream, then values for the one-mass parameters should be included in the corresponding element of Par.

circularAngleMean

An optional named list indicating whether to use circular-linear (FALSE) or circular-circular (TRUE) regression on the mean of circular distributions ('vm' and 'wrpcauchy') for turning angles. For example, circularAngleMean=list(angle=TRUE) indicates the angle mean is to be estimated for 'angle' using circular-circular regression. Whenever circular-circular regression is used for an angular data stream, a corresponding design matrix (DM) must be specified for the data stream, and the previous movement direction (i.e., a turning angle of zero) is automatically used as the reference angle (i.e., the intercept). Default is NULL, which assumes circular-linear regression is used for any angular distributions. Any circularAngleMean elements corresponding to data streams that do not have angular distributions are ignored. circularAngleMean is also ignored for any 'vmConsensus' data streams (because the consensus model is a circular-circular regression model).

Alternatively, circularAngleMean can be specified as a numeric scalar, where the value specifies the coefficient for the reference angle (i.e., directional persistence) term in the circular-circular regression model. For example, setting circularAngleMean to 0 specifies a circular-circular regression model with no directional persistence term (thus specifying a biased random walk instead of a biased correlated random walk). Setting circularAngleMean to 1 is equivalent to setting it to TRUE, i.e., a circular-circular regression model with a coefficient of 1 for the directional persistence reference angle.

centers

2-column matrix providing the x-coordinates (column 1) and y-coordinates (column 2) for any activity centers (e.g., potential centers of attraction or repulsion) from which distance and angle covariates will be calculated based on the simulated location data. These distance and angle covariates can be included in formula and DM using the row names of centers. If no row names are provided, then generic names are generated for the distance and angle covariates (e.g., 'center1.dist', 'center1.angle', 'center2.dist', 'center2.angle'); otherwise the covariate names are derived from the row names of centers as paste0(rep(rownames(centers),each=2),c(".dist",".angle")). Note that the angle covariates for each activity center are calculated relative to the previous movement direction instead of standard directions relative to the x-axis; this is to allow turning angles to be simulated as a function of these covariates using circular-circular regression.
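
The distance and angle covariates described above can be sketched in base R as follows. This is an illustration under stated assumptions, not momentuHMM's internal code: the helper center_covs is hypothetical, and the angle is taken as the direction to the center minus the previous movement direction, wrapped to [-pi, pi).

```r
# Sketch only (hypothetical helper, not a momentuHMM function): distance to an
# activity center and its angle relative to the previous movement direction.
center_covs <- function(prev, curr, center) {
  prev <- as.numeric(prev); curr <- as.numeric(curr); center <- as.numeric(center)
  heading <- atan2(curr[2] - prev[2], curr[1] - prev[1])      # previous movement direction
  bearing <- atan2(center[2] - curr[2], center[1] - curr[1])  # direction to the center
  angle <- (bearing - heading + pi) %% (2 * pi) - pi          # wrap to [-pi, pi)
  c(dist = sqrt(sum((center - curr)^2)), angle = angle)
}

centers <- matrix(c(0, 10), nrow = 1, dimnames = list("home", c("x", "y")))
# Covariate names derived from the row names, as described above
covNames <- paste0(rep(rownames(centers), each = 2), c(".dist", ".angle"))
homeCovs <- center_covs(prev = c(0, 0), curr = c(1, 0), center = centers["home", ])
```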

centroids

List where each element is a data frame consisting of at least max(unlist(obsPerAnimal)) rows that provides the x-coordinates ('x') and y-coordinates ('y') for centroids (i.e., dynamic activity centers where the coordinates can change for each time step) from which distance and angle covariates will be calculated based on the simulated location data. These distance and angle covariates can be included in formula and DM using the names of centroids. If no list names are provided, then generic names are generated for the distance and angle covariates (e.g., 'centroid1.dist', 'centroid1.angle', 'centroid2.dist', 'centroid2.angle'); otherwise the covariate names are derived from the list names of centroids as paste0(rep(names(centroids),each=2),c(".dist",".angle")). Note that the angle covariates for each centroid are calculated relative to the previous movement direction instead of standard directions relative to the x-axis; this is to allow turning angles to be simulated as a function of these covariates using circular-circular regression.

angleCovs

Character vector indicating the names of any circular-circular regression angular covariates in covs or spatialCovs that need conversion from standard direction (in radians relative to the x-axis) to turning angle (relative to previous movement direction) using circAngles.

obsPerAnimal

Either the number of observations per animal (if single value) or the bounds of the number of observations per animal (if vector of two values). In the latter case, the number of observations generated for each animal is drawn uniformly from this interval. Alternatively, obsPerAnimal can be specified as a list of length nbAnimals with each element providing the number of observations (if single value) or the bounds (if vector of two values) for each individual. Default: c(500,1500).

initialPosition

2-vector providing the x- and y-coordinates of the initial position for all animals. Alternatively, initialPosition can be specified as a list of length nbAnimals with each element a 2-vector providing the x- and y-coordinates of the initial position for each individual. Default: c(0,0). If mvnCoords corresponds to a data stream with 'mvnorm3' or 'rw_mvnorm3' probability distributions, then initialPosition must be composed of 3-vector(s) for the x-, y-, and z-coordinates.
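
For intuition about how 'step' and 'angle' data streams relate to x- and y-coordinates, the base-R sketch below accumulates step lengths and turning angles from an initial position. The helper steps_to_track is hypothetical, and the initial heading of 0 is an assumption; simData may initialize the direction differently.

```r
# Sketch only (hypothetical helper): turn step lengths and turning angles into
# x/y coordinates, starting from initialPosition with an assumed initial heading of 0.
steps_to_track <- function(step, angle, initialPosition = c(0, 0)) {
  heading <- cumsum(angle)  # absolute direction of each step
  x <- initialPosition[1] + c(0, cumsum(step * cos(heading)))
  y <- initialPosition[2] + c(0, cumsum(step * sin(heading)))
  data.frame(x = x, y = y)
}

# One step east, then a left turn of pi/2 followed by two steps north
track <- steps_to_track(step = c(1, 1, 2), angle = c(0, pi / 2, 0))
```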

DM

An optional named list indicating the design matrices to be used for the probability distribution parameters of each data stream. Each element of DM can either be a named list of regression formulas or a "pseudo" design matrix. For example, for a 2-state model using the gamma distribution for a data stream named 'step', DM=list(step=list(mean=~cov1, sd=~1)) specifies the mean parameters as a function of the covariate 'cov1' for each state. This model could equivalently be specified as a 4x6 "pseudo" design matrix using character strings for the covariate: DM=list(step=matrix(c(1,0,0,0,'cov1',0,0,0,0,1,0,0,0,'cov1',0,0,0,0,1,0,0,0,0,1),4,6)) where the 4 rows correspond to the state-dependent parameters (mean_1,mean_2,sd_1,sd_2) and the 6 columns correspond to the regression coefficients.

Design matrices specified using formulas allow standard functions in R formulas (e.g., cos(cov), cov1*cov2, I(cov^2)). Special formula functions include cosinor(cov,period) for modeling cyclical patterns, spline functions (bs, ns, bSpline, cSpline, iSpline, and mSpline), angleFormula(cov,strength,by) for the angle mean of circular-circular regression models, and state-specific formulas (see details). Any formula terms that are not state-specific are included on the parameters for all nbStates states.
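
The following base-R sketch evaluates the 4x6 pseudo design matrix from the example above at a single covariate value and maps working-scale parameters to the natural scale. It assumes a log link for the positive gamma mean and sd parameters; eval_DM is a hypothetical helper, not part of momentuHMM, and the parameter values are illustrative.

```r
# Sketch only: the pseudo design matrix from the text, with row names added
# for readability. Mixing numbers and strings yields a character matrix.
pseudoDM <- matrix(c(1,0,0,0,'cov1',0,0,0,0,1,0,0,0,'cov1',0,0,0,0,1,0,0,0,0,1), 4, 6,
                   dimnames = list(c("mean_1", "mean_2", "sd_1", "sd_2"), NULL))

# Hypothetical helper: replace the 'cov1' entries with an observed covariate value
eval_DM <- function(DM, cov1) {
  out <- matrix(0, nrow(DM), ncol(DM), dimnames = dimnames(DM))
  out[DM == "1"] <- 1
  out[DM == "cov1"] <- cov1
  out
}

# Working-scale Par: one value per column of the design matrix
workPar <- c(log(10), 0.1, log(100), -0.1, log(5), log(25))
# Natural-scale parameters at cov1 = 2, assuming a log link
natural <- exp(eval_DM(pseudoDM, cov1 = 2) %*% workPar)
```

Here mean_1 increases with cov1 and mean_2 decreases, while both sd parameters are constant because their rows contain only an intercept.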

userBounds

An optional named list of 2-column matrices specifying bounds on the natural (i.e., real) scale of the probability distribution parameters for each data stream. For example, for a 2-state model using the wrapped Cauchy ('wrpcauchy') distribution for a data stream named 'angle' with estAngleMean$angle=TRUE, userBounds=list(angle=matrix(c(-pi,-pi,-1,-1,pi,pi,1,1),4,2,dimnames=list(c("mean_1", "mean_2","concentration_1","concentration_2")))) specifies (-1,1) bounds for the concentration parameters instead of the default [0,1) bounds.

workBounds

An optional named list of 2-column matrices specifying bounds on the working scale of the probability distribution, transition probability, and initial distribution parameters. For each matrix, the first column pertains to the lower bound and the second column the upper bound. For data streams, each element of workBounds should be a k x 2 matrix with the same name of the corresponding element of Par, where k is the number of parameters. For transition probability parameters, the corresponding element of workBounds must be a k x 2 matrix named 'beta', where k=length(beta). For initial distribution parameters, the corresponding element of workBounds must be a k x 2 matrix named 'delta', where k=length(delta). workBounds is ignored for any given data stream unless DM is also specified.

betaRef

Numeric vector of length nbStates indicating the reference elements for the t.p.m. multinomial logit link. Default: NULL, in which case the diagonal elements of the t.p.m. are the reference. See fitHMM.

mvnCoords

Character string indicating the name of location data that are to be simulated using a multivariate normal distribution. For example, if mu="rw_mvnorm2" was included in dist and (mu.x, mu.y) are intended to be location data, then mvnCoords="mu" needs to be specified in order for these data to be treated as such.

stateNames

Optional character vector of length nbStates indicating state names.

model

A momentuHMM, momentuHierHMM, miHMM, or miSum object. This option can be used to simulate from a fitted model. Default: NULL. Note that, if this argument is specified, most other arguments will be ignored -- except for nbAnimals, obsPerAnimal, states, initialPosition, lambda, errorEllipse, and, if covariate values different from those in the data should be specified, covs, spatialCovs, centers, and centroids. It is not appropriate to simulate movement data from a model that was fitted to latitude/longitude data (because simData assumes Cartesian coordinates).

states

TRUE if the simulated states should be returned, FALSE otherwise (default).

retrySims

Number of times to attempt to simulate data within the spatial extent of spatialCovs. If retrySims=0 (the default), an error is returned if the simulated track(s) move beyond the extent(s) of the raster layer(s). Instead of relying on retrySims, in many cases it might be better to simply expand the extent of the raster layer(s) and/or adjust the step length and turning angle probability distributions. Ignored if spatialCovs=NULL.

lambda

Observation rate for location data. If NULL (the default), location data are obtained at regular intervals. Otherwise lambda is the rate parameter of the exponential distribution for the waiting times between successive location observations, i.e., 1/lambda is the expected time between successive location observations. Only the 'step' and 'angle' data streams are subject to temporal irregularity; any other data streams are observed at temporally-regular intervals. Ignored unless a valid distribution for the 'step' data stream is specified.
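
A short base-R sketch of the waiting-time model described above: exponential waiting times with rate lambda give irregular observation times with an expected gap of 1/lambda. The sample size and seed are arbitrary choices for illustration.

```r
# Sketch only: irregular observation times from exponential waiting times.
set.seed(42)
lambda <- 2
waits <- rexp(5000, rate = lambda)  # expected waiting time is 1/lambda
obs_times <- cumsum(waits)          # strictly increasing observation times
mean(waits)                         # sample mean, roughly 1/lambda
```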

errorEllipse

List providing the upper bound for the semi-major axis (M; on scale of x- and y-coordinates), semi-minor axis (m; on scale of x- and y-coordinates), and orientation (r; in degrees) of location error ellipses. If NULL (the default), no location measurement error is simulated. If errorEllipse is specified, then each observed location is subject to bivariate normal errors as described in McClintock et al. (2015), where the components of the error ellipse for each location are randomly drawn from runif(1,min(errorEllipse$M),max(errorEllipse$M)), runif(1,min(errorEllipse$m),max(errorEllipse$m)), and runif(1,min(errorEllipse$r),max(errorEllipse$r)). If only a single value is provided for any of the error ellipse elements, then the corresponding component is fixed to this value for each location. Only the 'step' and 'angle' data streams are subject to location measurement error; any other data streams are observed without error. Ignored unless a valid distribution for the 'step' data stream is specified.
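
The uniform draws described above can be sketched in base R as follows. The helper draw_ellipses is hypothetical; the output column names follow the "Value" section, and the bounds are illustrative.

```r
# Sketch only (hypothetical helper): draw error ellipse components for n locations.
draw_ellipses <- function(n, errorEllipse) {
  # runif() with min == max returns that value, so a single value stays fixed
  draw <- function(v) runif(n, min(v), max(v))
  data.frame(error_semimajor_axis      = draw(errorEllipse$M),
             error_semiminor_axis      = draw(errorEllipse$m),
             error_ellipse_orientation = draw(errorEllipse$r))
}

set.seed(7)
ell <- draw_ellipses(100, list(M = c(30, 50), m = 25, r = c(0, 180)))
```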

ncores

Number of cores to use for parallel processing. Default: 1 (no parallel processing).

hierStates

A hierarchical model structure Node for the states ('state'). See details.

hierDist

A hierarchical data structure Node for the data streams ('dist'). Currently supported distributions are 'bern', 'beta', 'exp', 'gamma', 'lnorm', 'norm', 'mvnorm2' (bivariate normal distribution), 'mvnorm3' (trivariate normal distribution), 'pois', 'rw_norm' (normal random walk), 'rw_mvnorm2' (bivariate normal random walk), 'rw_mvnorm3' (trivariate normal random walk), 'vm', 'vmConsensus', 'weibull', and 'wrpcauchy'. See details.

hierBeta

A hierarchical data structure Node for the matrix of initial values for the regression coefficients of the transition probabilities at each level of the hierarchy ('beta'). See fitHMM.

hierDelta

A hierarchical data structure Node for the matrix of initial values for the regression coefficients of the initial distribution at each level of the hierarchy ('delta'). See fitHMM.

hierFormula

A hierarchical formula structure for the transition probability covariates for each level of the hierarchy ('formula'). Default: NULL (only hierarchical-level effects, with no covariate effects). Any formula terms that are not state- or parameter-specific are included on all of the transition probabilities within a given level of the hierarchy. See details.

hierFormulaDelta

A hierarchical formula structure for the initial distribution covariates for each level of the hierarchy ('formulaDelta'). Default: NULL (no covariate effects and fixPar$delta is specified on the working scale).

nbHierCovs

A hierarchical data structure Node for the number of covariates ('nbCovs') to simulate for each level of the hierarchy (0 by default). Does not need to be specified if covs is specified. Simulated covariates are provided generic names (e.g., 'cov1.1' and 'cov1.2' for nbHierCovs$level1$nbCovs=2) and can be included in hierFormula and/or DM.

obsPerLevel

A hierarchical data structure Node indicating the number of observations for each level of the hierarchy ('obs'). For each level, the 'obs' field can either be the number of observations per animal (if single value) or the bounds of the number of observations per animal (if vector of two values). In the latter case, the number of observations generated per level for each animal is drawn uniformly from this interval. Alternatively, obsPerLevel can be specified as a list of length nbAnimals with each element providing the hierarchical data structure for the number of observations for each level of the hierarchy for each animal, where the 'obs' field can either be the number of observations (if single value) or the bounds of the number of observations (if vector of two values) for each individual.

Details

simHierData is very similar to simData except that instead of simply specifying the number of states (nbStates), distributions (dist), observations (obsPerAnimal), covariates (nbCovs), and a single t.p.m. formula (formula), the hierStates argument specifies the hierarchical nature of the states, the hierDist argument specifies the hierarchical nature of the data streams, the obsPerLevel argument specifies the number of observations for each level of the hierarchy, the nbHierCovs argument specifies the number of covariates for each level of the hierarchy, and the hierFormula argument specifies a t.p.m. formula for each level of the hierarchy. All of the hierarchical arguments in simHierData are specified as Node objects from the data.tree package.
x- and y-coordinate location data are generated only if valid 'step' and 'angle' data streams are specified. Valid distributions for 'step' include 'gamma', 'weibull', 'exp', and 'lnorm'. Valid distributions for 'angle' include 'vm' and 'wrpcauchy'. If only a valid 'step' data stream is specified, then only x-coordinates are generated.
If DM is specified for a particular data stream, then the initial values are specified on the working (i.e., beta) scale of the parameters. The working scale of each parameter is determined by the link function used. The function getParDM is intended to help with obtaining initial values on the working scale when specifying a design matrix and other parameter constraints.
Simulated data that are temporally regular (i.e., lambda=NULL) and without location measurement error (i.e., errorEllipse=NULL) are returned as a momentuHMMData (or momentuHierHMMData) object suitable for analysis using fitHMM.
Simulated location data that are temporally-irregular (i.e., lambda>0) and/or with location measurement error (i.e., errorEllipse!=NULL) are returned as a data frame suitable for analysis using crawlWrap.
The matrix beta of regression coefficients for the transition probabilities has one row for the intercept, plus one row for each covariate, and one column for each non-diagonal element of the transition probability matrix. For example, in a 3-state HMM with 2 formula covariates, the matrix beta has three rows (intercept + two covariates) and six columns (six non-diagonal elements in the 3x3 transition probability matrix - filled in row-wise). In a covariate-free model (default), beta has one row, for the intercept.
State-specific formulas can be specified in DM using special formula functions. These special functions can take the names paste0("state",1:nbStates) (where the integer indicates the state-specific formula). For example, DM=list(step=list(mean=~cov1+state1(cov2),sd=~cov2+state2(cov1))) includes cov1 on the mean parameter for all states, cov2 on the mean parameter for state 1, cov2 on the sd parameter for all states, and cov1 on the sd parameter for state 2.
State- and parameter-specific formulas can be specified for transition probabilities in formula using special formula functions. These special functions can take the names paste0("state",1:nbStates) (where the integer indicates the current state from which transitions occur), paste0("toState",1:nbStates) (where the integer indicates the state to which transitions occur), or paste0("betaCol",1:(nbStates*(nbStates-1))) (where the integer indicates the column of the beta matrix). For example with nbStates=3, formula=~cov1+betaCol1(cov2)+state3(cov3)+toState1(cov4) includes cov1 on all transition probability parameters, cov2 on the beta column corresponding to the transition from state 1->2, cov3 on transition probabilities from state 3 (i.e., beta columns corresponding to state transitions 3->1 and 3->2), and cov4 on transition probabilities to state 1 (i.e., beta columns corresponding to state transitions 2->1 and 3->1).
Cyclical relationships (e.g., hourly, monthly) may be simulated using the cosinor(x,period) special formula function for covariate x and sine curve period of time length period. For example, if the data are hourly, a 24-hour cycle can be simulated using ~cosinor(cov1,24), where the covariate cov1 is a repeating series of integers 0,1,...,23,0,1,...,23,0,1,... (note that simData will not do this for you; the appropriate covariate must be specified using the covs argument; see example below). The cosinor(x,period) function converts x to 2 covariates cosinorCos(x)=cos(2*pi*x/period) and cosinorSin(x)=sin(2*pi*x/period) for inclusion in the model (i.e., 2 additional parameters per state). The amplitude of the sine wave is thus sqrt(B_cos^2 + B_sin^2), where B_cos and B_sin are the working parameters corresponding to cosinorCos(x) and cosinorSin(x), respectively (e.g., see Cornelissen 2014).
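
The cosinor transformation and the resulting amplitude can be checked with a few lines of base R. The coefficient values B_cos and B_sin below are illustrative only, and the covariate is built by hand because simData does not create it.

```r
# Sketch only: the two covariates that cosinor(x, period) is described as creating,
# and the implied amplitude for illustrative working parameters B_cos and B_sin.
hour24 <- rep(0:23, times = 3)  # repeating hourly covariate 0,1,...,23,0,1,...
period <- 24
cosinorCos <- cos(2 * pi * hour24 / period)
cosinorSin <- sin(2 * pi * hour24 / period)

B_cos <- 1; B_sin <- 1          # illustrative values only
amplitude <- sqrt(B_cos^2 + B_sin^2)
linear_effect <- B_cos * cosinorCos + B_sin * cosinorSin  # oscillates within +/- amplitude
```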

When the circular-circular regression model is used, the special function angleFormula(cov,strength,by) can be used in DM for the mean of angular distributions (i.e. 'vm', 'vmConsensus', and 'wrpcauchy'), where cov is an angle covariate (e.g. wind direction), strength is a positive real covariate (e.g. wind speed), and by is an optional factor variable for individual- or group-level effects (e.g. ID, sex). This allows angle covariates to be weighted based on their strength or importance at time step t as in Rivest et al. (2016).

If the length of covariate values passed (either through 'covs', or 'model') is not the same as the number of observations suggested by 'nbAnimals' and 'obsPerAnimal' (or 'obsPerLevel' for simHierData), then the series of covariates is either shortened (removing last values - if too long) or extended (starting over from the first values - if too short).
For simData, when covariates are not included in formulaDelta (i.e. formulaDelta=NULL), then delta is specified as a vector of length nbStates that sums to 1. When covariates are included in formulaDelta, then delta must be specified as a k x (nbStates-1) matrix of working parameters, where k is the number of regression coefficients and the columns correspond to states 2:nbStates. For example, in a 3-state HMM with formulaDelta=~cov1+cov2, the matrix delta has three rows (intercept + two covariates) and 2 columns (corresponding to states 2 and 3). The initial distribution working parameters are transformed to the real scale as exp(covsDelta*Delta)/rowSums(exp(covsDelta*Delta)), where covsDelta is the N x k design matrix, Delta=cbind(rep(0,k),delta) is a k x nbStates matrix of working parameters, and N=length(unique(data$ID)).
For simHierData, delta must be specified as a k x (nbStates-1) matrix of working parameters, where k is the number of regression coefficients and the columns correspond to states 2:nbStates.
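
The working-to-real transformation for delta can be sketched in base R using the covariate and delta values from the initial-distribution example on this page. The product in the formula above is written here as a matrix product (%*%), which is an interpretation of the notation rather than a quotation of the package code.

```r
# Sketch only: transform working-scale delta to initial state probabilities.
cov <- data.frame(sex = factor(c("F", "M")))      # one row per animal
covsDelta <- model.matrix(~ sex + 0, data = cov)  # N x k design matrix
delta <- matrix(c(-100, 100), 2, 1, dimnames = list(c("sexF", "sexM"), "state 2"))
Delta <- cbind(rep(0, nrow(delta)), delta)        # k x nbStates, state 1 is the reference
eta <- covsDelta %*% Delta
delta_real <- exp(eta) / rowSums(exp(eta))        # each row sums to 1
```

With these extreme working parameters, the female is effectively certain to begin in state 1 and the male in state 2.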

References

Cornelissen, G. 2014. Cosinor-based rhythmometry. Theoretical Biology and Medical Modelling 11:16.

McClintock BT, London JM, Cameron MF, Boveng PL. 2015. Modelling animal movement using the Argos satellite telemetry location error ellipse. Methods in Ecology and Evolution 6(3):266-277.

Rivest, LP, Duchesne, T, Nicosia, A, Fortin, D, 2016. A general angular regression model for the analysis of data on animal movement in ecology. Journal of the Royal Statistical Society: Series C (Applied Statistics), 65(3):445-463.

Leos-Barajas, V., Gangloff, E.J., Adam, T., Langrock, R., van Beest, F.M., Nabe-Nielsen, J. and Morales, J.M. 2017. Multi-scale modeling of animal movement and general behavior data using hidden Markov models with hierarchical structures. Journal of Agricultural, Biological and Environmental Statistics, 22 (3), 232-248.

Examples


# 1. Pass a fitted model to simulate from
# (m is a momentuHMM object - as returned by fitHMM - automatically loaded with the package)
# We keep the default nbAnimals=1.
m <- example$m
obsPerAnimal=c(50,100)
data <- simData(model=m,obsPerAnimal=obsPerAnimal)

if (FALSE) {
# 2. Pass the parameters of the model to simulate from
stepPar <- c(1,10,1,5,0.2,0.3) # mean_1, mean_2, sd_1, sd_2, zeromass_1, zeromass_2
anglePar <- c(pi,0,0.5,2) # mean_1, mean_2, concentration_1, concentration_2
omegaPar <- c(1,10,10,1) # shape1_1, shape1_2, shape2_1, shape2_2
stepDist <- "gamma"
angleDist <- "vm"
omegaDist <- "beta"
data <- simData(nbAnimals=4,nbStates=2,dist=list(step=stepDist,angle=angleDist,omega=omegaDist),
                Par=list(step=stepPar,angle=anglePar,omega=omegaPar),nbCovs=2,
                zeroInflation=list(step=TRUE),
                obsPerAnimal=obsPerAnimal)

# 3. Include covariates
# (note that it is useless to specify "nbCovs", which are overruled
# by the number of columns of "cov")
cov <- data.frame(temp=log(rnorm(500,20,5)))
stepPar <- c(log(10),0.1,log(100),-0.1,log(5),log(25)) # working scale parameters for step DM
anglePar <- c(pi,0,0.5,2) # mean_1, mean_2, concentration_1, concentration_2
stepDist <- "gamma"
angleDist <- "vm"
data <- simData(nbAnimals=2,nbStates=2,dist=list(step=stepDist,angle=angleDist),
                Par=list(step=stepPar,angle=anglePar),
                DM=list(step=list(mean=~temp,sd=~1)),
                covs=cov,
                obsPerAnimal=obsPerAnimal)
                
# 4. Include example 'forest' spatial covariate raster layer
# nbAnimals and obsPerAnimal kept small to reduce example run time
spatialCov<-list(forest=forest)
data <- simData(nbAnimals=1,nbStates=2,dist=list(step=stepDist,angle=angleDist),
                Par=list(step=c(100,1000,50,100),angle=c(0,0,0.1,5)),
                beta=matrix(c(5,-10,-25,50),nrow=2,ncol=2,byrow=TRUE),
                formula=~forest,spatialCovs=spatialCov,
                obsPerAnimal=250,states=TRUE,
                retrySims=100)
                
# 5. Specify design matrix for 'omega' data stream
# natural scale parameters for step and angle
stepPar <- c(1,10,1,5) # shape_1, shape_2, scale_1, scale_2
anglePar <- c(pi,0,0.5,0.7) # mean_1, mean_2, concentration_1, concentration_2

# working scale parameters for omega DM
omegaPar <- c(log(1),0.1,log(10),-0.1,log(10),-0.1,log(1),0.1)

stepDist <- "weibull"
angleDist <- "wrpcauchy"
omegaDist <- "beta"

data <- simData(nbStates=2,dist=list(step=stepDist,angle=angleDist,omega=omegaDist),
                Par=list(step=stepPar,angle=anglePar,omega=omegaPar),nbCovs=2,
                DM=list(omega=list(shape1=~cov1,shape2=~cov2)),
                obsPerAnimal=obsPerAnimal,states=TRUE)
                
# 6. Include temporal irregularity and location measurement error
lambda <- 2 # expect 2 observations per time step
errorEllipse <- list(M=50,m=25,r=180)
obsData <- simData(model=m,obsPerAnimal=obsPerAnimal,
                   lambda=lambda, errorEllipse=errorEllipse)
                   
# 7. Cosinor and state-dependent formulas
nbStates<-2
dist<-list(step="gamma")
Par<-list(step=c(100,1000,50,100))

# include 24-hour cycle on all transition probabilities
# include 12-hour cycle on transitions from state 2
formula=~cosinor(hour24,24)+state2(cosinor(hour12,12))

# specify appropriate covariates
covs<-data.frame(hour24=0:23,hour12=0:11)

beta<-matrix(c(-1.5,1,1,NA,NA,-1.5,-1,-1,1,1),5,2)
# row names for beta not required but can be helpful
rownames(beta)<-c("(Intercept)",
                  "cosinorCos(hour24, 24)",
                  "cosinorSin(hour24, 24)",
                  "cosinorCos(hour12, 12)",
                  "cosinorSin(hour12, 12)")
data.cos<-simData(nbStates=nbStates,dist=dist,Par=Par,
                  beta=beta,formula=formula,covs=covs)     
                  
# 8. Piecewise constant B-spline on step length mean and angle concentration
nObs <- 1000 # length of simulated track
cov <- data.frame(time=1:nObs) # time covariate for splines
dist <- list(step="gamma",angle="vm")
stepDM <- list(mean=~splines2::bSpline(time,df=2,degree=0),sd=~1)
angleDM <- list(mean=~1,concentration=~splines2::bSpline(time,df=2,degree=0))
DM <- list(step=stepDM,angle=angleDM)
Par <- list(step=c(log(1000),1,-1,log(100)),angle=c(0,log(10),2,-5))

data.spline<-simData(obsPerAnimal=nObs,nbStates=1,dist=dist,Par=Par,DM=DM,covs=cov)        

# 9. Initial state (delta) based on covariate
nObs <- 100
dist <- list(step="gamma",angle="vm")
Par <- list(step=c(100,1000,50,100),angle=c(0,0,0.01,0.75))

# create sex covariate
cov <- data.frame(sex=factor(rep(c("F","M"),each=nObs))) # sex covariate
formulaDelta <- ~ sex + 0

# Female begins in state 1, male begins in state 2
delta <- matrix(c(-100,100),2,1,dimnames=list(c("sexF","sexM"),"state 2")) 

data.delta<-simData(nbAnimals=2,obsPerAnimal=nObs,nbStates=2,dist=dist,Par=Par,
                    delta=delta,formulaDelta=formulaDelta,covs=cov,
                    beta=matrix(-1.5,1,2),states=TRUE)        
}
