fitHMM: Fit a multivariate HMM to the data

Description

Fit a (multivariate) hidden Markov model to the data provided, using numerical optimization of the log-likelihood function.

Usage

fitHMM(data, nbStates, dist, Par0, beta0 = NULL, delta0 = NULL,
  estAngleMean = NULL, circularAngleMean = NULL, formula = ~1,
  formulaDelta = ~1, stationary = FALSE, verbose = 0, nlmPar = NULL,
  fit = TRUE, DM = NULL, cons = NULL, userBounds = NULL,
  workcons = NULL, stateNames = NULL, knownStates = NULL, fixPar = NULL,
  retryFits = 0)

Arguments

data

An object momentuHMMData.

nbStates

Number of states of the HMM.

dist

A named list indicating the probability distributions of the data streams. Currently supported distributions are 'bern', 'beta', 'exp', 'gamma', 'lnorm', 'norm', 'pois', 'vm', 'weibull', and 'wrpcauchy'. For example, dist=list(step='gamma', angle='vm', dives='pois') indicates 3 data streams ('step', 'angle', and 'dives') and their respective probability distributions ('gamma', 'vm', and 'pois'). The names of the data streams (e.g., 'step', 'angle', 'dives') must match component names in data.

Par0

A named list containing vectors of initial state-dependent probability distribution parameters for each data stream specified in dist. The parameters should be in the order expected by the pdfs of dist, and any zero-mass and/or one-mass parameters should be the last (if both are present, then zero-mass parameters must preceed one-mass parameters). Note that zero-mass parameters are mandatory if there are zeros in data streams with a 'gamma','weibull','exp','lnorm', or 'beta' distribution, and one-mass parameters are mandatory if there are ones in data streams with a 'beta' distribution. For example, for a 2-state model using the Von Mises (vm) distribution for a data stream named 'angle' and the zero-inflated gamma distribution for a data stream named 'step', the vector of initial parameters would be something like: Par0=list(step=c(mean_1,mean_2,sd_1,sd_2,zeromass_1,zeromass_2), angle=c(mean_1,mean_2,concentration_1,concentration_2)).

If DM is not specified for a given data stream, then Par0 is on the natural (i.e., real) scale of the parameters. However, if DM is specified for a given data stream, then Par0 must be on the working (i.e., beta) scale of the parameters, and the length of Par0 must match the number of columns in the design matrix. See details below.

beta0

Initial matrix of regression coefficients for the transition probabilities (more information in 'Details'). Default: NULL. If not specified, beta0 is initialized such that the diagonal elements of the transition probability matrix are dominant.

delta0

Initial value for the initial distribution of the HMM. Default: rep(1/nbStates,nbStates). If formulaDelta includes covariates, then delta0 must be specified as a k x (nbStates-1) matrix, where k is the number of covariates and the columns correspond to states 2:nbStates. See details below.

estAngleMean

An optional named list indicating whether or not to estimate the angle mean for data streams with angular distributions ('vm' and 'wrpcauchy'). For example, estAngleMean=list(angle=TRUE) indicates the angle mean is to be estimated for 'angle'. Default is NULL, which assumes any angle means are fixed to zero and are not to be estimated. Any estAngleMean elements corresponding to data streams that do not have angular distributions are ignored.

circularAngleMean

An optional named list indicating whether to use circular-linear (FALSE) or circular-circular (TRUE) regression on the mean of circular distributions ('vm' and 'wrpcauchy') for turning angles. For example, circularAngleMean=list(angle=TRUE) indicates the angle mean is be estimated for 'angle' using circular-circular regression. Whenever circular-circular regression is used for an angular data stream, a corresponding design matrix (DM) must be specified for the data stream, and the previous movement direction (i.e., a turning angle of zero) is automatically used as the reference angle (i.e., the intercept). Any circular-circular regression covariates in data should therefore be relative to the previous direction of movement (instead of standard directions relative to the x-axis; see prepData and circAngles). See Duchesne et al. (2015) for specifics on the circular-circular regression model using previous movement direction as the reference angle. Default is NULL, which assumes circular-linear regression is used for any angular distributions for which the mean angle is to be estimated. circularAngleMean elements corresponding to angular data streams are ignored unless the corresponding element of estAngleMean is TRUE. Any circularAngleMean elements corresponding to data streams that do not have angular distributions are ignored.

formula

Regression formula for the transition probability covariates. Default: ~1 (no covariate effect). In addition to allowing standard functions in R formulas (e.g., cos(cov), cov1*cov2, I(cov^2)), special functions include cosinor(cov,period) for modeling cyclical patterns, spline functions (bs, ns, bSpline, cSpline, iSpline, and mSpline), and state- or parameter-specific formulas (see details). Any formula terms that are not state- or parameter-specific are included on all of the transition probabilities.

formulaDelta

Regression formula for the initial distribution. Default: ~1 (no covariate effect). Standard functions in R formulas are allowed (e.g., cos(cov), cov1*cov2, I(cov^2)).

stationary

FALSE if there are covariates in formula or formulaDelta. If TRUE, the initial distribution is considered equal to the stationary distribution. Default: FALSE.

verbose

Determines the print level of the optimizer. The default value of 0 means that no printing occurs, a value of 1 means that the first and last iterations of the optimization are detailed, and a value of 2 means that each iteration of the optimization is detailed.

nlmPar

List of parameters to pass to the optimization function nlm (which should be either 'gradtol', 'stepmax', 'steptol', or 'iterlim' -- see nlm's documentation for more detail)

fit

TRUE if an HMM should be fitted to the data, FALSE otherwise. If fit=FALSE, a model is returned with the MLE replaced by the initial parameters given in input. This option can be used to assess the initial parameters, parameter bounds, etc. Default: TRUE.

An optional named list indicating the design matrices to be used for the probability distribution parameters of each data stream. Each element of DM can either be a named list of linear regression formulas or a ``pseudo'' design matrix. For example, for a 2-state model using the gamma distribution for a data stream named 'step', DM=list(step=list(mean=~cov1, sd=~1)) specifies the mean parameters as a function of the covariate 'cov1' for each state. This model could equivalently be specified as a 4x6 ``pseudo'' design matrix using character strings for the covariate: DM=list(step=matrix(c(1,0,0,0,'cov1',0,0,0,0,1,0,0,0,'cov1',0,0,0,0,1,0,0,0,0,1),4,6)) where the 4 rows correspond to the state-dependent paramaters (mean_1,mean_2,sd_1,sd_2) and the 6 columns correspond to the regression coefficients.

Design matrices specified using formulas allow standard functions in R formulas (e.g., cos(cov), cov1*cov2, I(cov^2)). Special formula functions include cosinor(cov,period) for modeling cyclical patterns, spline functions (bs, ns, bSpline, cSpline, iSpline, and mSpline), and state-specific formulas (see details). Any formula terms that are not state-specific are included on the parameters for all nbStates states.

cons

An optional named list of vectors specifying a power to raise parameters corresponding to each column of the design matrix for each data stream. While there could be other uses, primarily intended to constrain specific parameters to be positive. For example, cons=list(step=c(1,2,1,1)) raises the second parameter to the second power. Default=NULL, which simply raises all parameters to the power of 1. cons is ignored for any given data stream unless DM is specified.

userBounds

An optional named list of 2-column matrices specifying bounds on the natural (i.e, real) scale of the probability distribution parameters for each data stream. For example, for a 2-state model using the wrapped Cauchy ('wrpcauchy') distribution for a data stream named 'angle' with estAngleMean$angle=TRUE), userBounds=list(angle=matrix(c(-pi,-pi,-1,-1,pi,pi,1,1),4,2,dimnames=list(c("mean_1", "mean_2","concentration_1","concentration_2")))) specifies (-1,1) bounds for the concentration parameters instead of the default [0,1) bounds.

workcons

An optional named list of vectors specifying constants to add to the regression coefficients on the working scale for each data stream. Warning: use of workcons is recommended only for advanced users implementing unusual parameter constraints through a combination of DM, cons, and workcons. workcons is ignored for any given data stream unless DM is specified.

stateNames

Optional character vector of length nbStates indicating state names.

knownStates

Vector of values of the state process which are known prior to fitting the model (if any). Default: NULL (states are not known). This should be a vector with length the number of rows of 'data'; each element should either be an integer (the value of the known states) or NA if the state is not known.

fixPar

An optional list of vectors indicating parameters which are assumed known prior to fitting the model. Default: NULL (no parameters are fixed). For data streams, each element of fixPar should be a vector of the same name and length as the corresponding element of Par0. For transition probability parameters, the corresponding element of fixPar must be named ``beta'' and have the same dimensions as beta0. For initial distribution parameters, the corresponding element of fixPar must be named ``delta'' and have the same dimensions as delta0. Each parameter should either be numeric (the fixed value of the parameter) or NA if the parameter is to be estimated. For each data stream, fixPar parameters must be on the same scale as Par0 (e.g. if DM is specified for a given data stream, any fixed parameters for this data stream must be on the working scale). Any fixed values for the transition probabilities (beta) must be on the working scale (i.e. the same scale as beta0). Any fixed values for the initial distribution (delta) must be on the natural scale (i.e. the same scale as delta0).

retryFits

Non-negative integer indicating the number of times to attempt to iteratively fit the model using random perturbations of the current parameter estimates as the initial values for likelihood optimization. Standard normal perturbations are used on the working scale probability distribution parameters, while Normal(0,10^2) pertubations are used for working scale transition probability parameters. Default: 0. When retryFits>0, the model with the largest log likelihood value is returned. Ignored if fit=FALSE.

Value

A momentuHMM object, i.e. a list of:

mle

A named list of the maximum likelihood estimates of the parameters of the model (if the numerical algorithm has indeed identified the global maximum of the likelihood function). Elements are included for the parameters of each data strea, as well as beta (transition probabilities regression coefficients - more information in 'Details'), gamma (transition probabilities on real scale, based on mean covariate values if formula includes covariates), and delta (initial distribution).

CIreal

Standard errors and 95% confidence intervals on the real (i.e., natural) scale of parameters

CIbeta

Standard errors and 95% confidence intervals on the beta (i.e., working) scale of parameters

data

The momentuHMMData object

mod

The object returned by the numerical optimizer nlm

conditions

Conditions used to fit the model, e.g., bounds (parameter bounds), distributions, zeroInflation, estAngleMean, stationary, formula, DM, fullDM (full design matrix), etc.)

rawCovs

Raw covariate values for transition probabilities, as found in the data (if any). Used in plot.momentuHMM.

stateNames

The names of the states.

knownStates

Vector of values of the state process which are known.

covsDelta

Design matrix for initial distribution.

Details

The matrix beta0 of regression coefficients for the transition probabilities has one row for the intercept, plus one row for each covariate, and one column for each non-diagonal element of the transition probability matrix. For example, in a 3-state HMM with 2 formula covariates, the matrix beta has three rows (intercept + two covariates) and six columns (six non-diagonal elements in the 3x3 transition probability matrix - filled in row-wise). In a covariate-free model (default), beta0 has one row, for the intercept.
When covariates are not included in formulaDelta (i.e. formulaDelta=~1), then delta0 (and fixPar$delta) are specified as a vector of length nbStates that sums to 1. When covariates are included in formulaDelta, then delta0 (and fixPar$delta) must be specified as a k x (nbStates-1) matrix of working parameters, where k is the number of regression coefficients and the columns correspond to states 2:nbStates. For example, in a 3-state HMM with formulaDelta=~cov1+cov2, the matrix delta0 has three rows (intercept + two covariates) and 2 columns (corresponding to states 2 and 3). The initial distribution working parameters are transformed to the real scale as exp(covsDelta*Delta)/rowSums(exp(covsDelta*Delta)), where covsDelta is the N x k design matrix, Delta=cbind(rep(0,k),delta0) is a k x nbStates matrix of working parameters, and N=length(unique(data$ID)).
The choice of initial parameters (particularly Par0 and beta0) is crucial to fit a model. The algorithm might not find the global optimum of the likelihood function if the initial parameters are poorly chosen.
If DM is specified for a particular data stream, then the initial values are specified on the working (i.e., beta) scale of the parameters. The working scale of each parameter is determined by the link function used. If a parameter P is bound by (0,Inf) then the working scale is the log(P) scale. If the parameter bounds are (-pi,pi) then the working scale is tan(P/2) unless circular-circular regression is used. Otherwise if the parameter bounds are finite then logit(P) is the working scale. The function getParDM is intended to help with obtaining initial values on the working scale when specifying a design matrix and other parameter constraints (see example below). When circular-circular regression is specified using circularAngleMean, the working scale for the mean turning angle is not as easily interpretable, but the link function is atan2(sin(X)*B,1+cos(X)*B), where X are the angle covariates and B the angle coefficients (see Duchesne et al. 2015). Under this formulation, the reference turning angle is 0 (i.e., movement in the same direction as the previous time step). In other words, the mean turning angle is zero when the coefficient(s) B=0.
Circular-circular regression in momentuHMM is designed for turning angles (not bearings) as computed by simData and prepData. Any circular-circular regression covariates for time step t should therefore be relative to the previous direction of movement for time step t-1. In other words, circular-circular regression covariates for time step t should be the turning angle between the direction of movement for time step t-1 and the standard direction of the covariate relative to the x-axis for time step t. If provided standard directions in radians relative to the x-axis (where 0 = east, pi/2 = north, pi = west, and -pi/2 = south), circAngles or prepData can perform this calculation for you.
State-specific formulas can be specified in DM using special formula functions. These special functions can take the names paste0("state",1:nbStates) (where the integer indicates the state-specific formula). For example, DM=list(step=list(mean=~cov1+state1(cov2),sd=~cov2+state2(cov1))) includes cov1 on the mean parameter for all states, cov2 on the mean parameter for state 1, cov2 on the sd parameter for all states, and cov1 on the sd parameter for state 2.
State- and parameter-specific formulas can be specified for transition probabilities in formula using special formula functions. These special functions can take the names paste0("state",1:nbStates) (where the integer indicates the current state from which transitions occur), or paste0("betaCol",nbStates*(nbStates-1)) (where the integer indicates the column of the beta matrix). For example with nbStates=3, formula=~cov1+betaCol1(cov2)+state3(cov3) includes cov1 on all transition probability parameters, cov2 on the beta column corresponding to the transition from state 1->2, and cov3 on transition probabilities from state 3 (i.e., beta columns corresponding to state transitions 3->1 and 3->2).
Cyclical relationships (e.g., hourly, monthly) may be modeled in DM or formula using the cosinor(x,period) special formula function for covariate x and sine curve period of time length period. For example, if the data are hourly, a 24-hour cycle can be modeled using ~cosinor(cov1,24), where the covariate cov1 is a repeating sequential series of integers indicating the hour of day (0,1,...,23,0,1,...,23,0,1,...) (note that fitHMM will not do this for you, the appropriate covariate must be included in data; see example below). The cosinor(x,period) function converts x to 2 covariates cosinorCos(x)=cos(2*pi*x/period) and cosinorSin(x)=sin(2*pi*x/period for inclusion in the model (i.e., 2 additional parameters per state). The amplitude of the sine wave is thus sqrt(B_cos^2 + B_sin^2), where B_cos and B_sin are the working parameters correponding to cosinorCos(x) and cosinorSin(x), respectively (e.g., see Cornelissen 2014).

References

Cornelissen, G. 2014. Cosinor-based rhythmometry. Theoretical Biology and Medical Modelling 11:16.

Duchesne, T., Fortin, D., Rivest L-P. 2015. Equivalence between step selection functions and biased correlated random walks for statistical inference on animal movement. PLoS ONE 10 (4): e0122947.

Langrock R., King R., Matthiopoulos J., Thomas L., Fortin D., Morales J.M. 2012. Flexible and practical modeling of animal telemetry data: hidden Markov models and extensions. Ecology, 93 (11), 2336-2342.

McClintock B.T., King R., Thomas L., Matthiopoulos J., McConnell B.J., Morales J.M. 2012. A general discrete-time modeling framework for animal movement using multistate random walks. Ecological Monographs, 82 (3), 335-349.

McClintock B.T., Russell D.J., Matthiopoulos J., King R. 2013. Combining individual animal movement and ancillary biotelemetry data to investigate population-level activity budgets. Ecology, 94 (4), 838-849.

Patterson T.A., Basson M., Bravington M.V., Gunn J.S. 2009. Classifying movement behaviour in relation to environmental conditions using hidden Markov models. Journal of Animal Ecology, 78 (6), 1113-1123.

Examples

Run this code

# NOT RUN {
nbStates <- 2
stepDist <- "gamma" # step distribution
angleDist <- "vm" # turning angle distribution

# extract data from momentuHMM example
data <- example$m$data

### 1. fit the model to the simulated data
# define initial values for the parameters
mu0 <- c(20,70)
sigma0 <- c(10,30)
kappa0 <- c(1,1)
stepPar <- c(mu0,sigma0) # no zero-inflation, so no zero-mass included
anglePar <- kappa0 # not estimating angle mean, so not included
formula <- ~cov1+cos(cov2)

m <- fitHMM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
            Par0=list(step=stepPar,angle=anglePar),formula=formula)

print(m)

# }
# NOT RUN {
### 2. fit the exact same model to the simulated data using DM formulas
# Get initial values for the parameters on working scale
Par0 <- getParDM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
        Par=list(step=stepPar,angle=anglePar),
        DM=list(step=list(mean=~1,sd=~1),angle=list(concentration=~1)))

mDMf <- fitHMM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
              Par0=Par0,formula=formula,
              DM=list(step=list(mean=~1,sd=~1),angle=list(concentration=~1)))

print(mDMf)

### 3. fit the exact same model to the simulated data using DM matrices
# define DM
DMm <- list(step=diag(4),angle=diag(2))

# user-specified dimnames not required but are recommended
dimnames(DMm$step) <- list(c("mean_1","mean_2","sd_1","sd_2"),
                   c("mean_1:(Intercept)","mean_2:(Intercept)",
                   "sd_1:(Intercept)","sd_2:(Intercept)"))
dimnames(DMm$angle) <- list(c("concentration_1","concentration_2"),
                    c("concentration_1:(Intercept)","concentration_2:(Intercept)"))
                  
mDMm <- fitHMM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
              Par0=Par0,formula=formula,
              DM=DMm)

print(mDMm)

### 4. fit step mean parameter covariate model to the simulated data using DM
stepDMf <- list(mean=~cov1,sd=~1)
Par0 <- getParDM(data,nbStates,list(step=stepDist,angle=angleDist),
                 Par=list(step=stepPar,angle=anglePar),
                 DM=list(step=stepDMf,angle=DMm$angle))
mDMfcov <- fitHMM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
              Par0=Par0,
              formula=formula,
              DM=list(step=stepDMf,angle=DMm$angle))

print(mDMfcov)

### 5. fit the exact same step mean parameter covariate model using DM matrix
stepDMm <- matrix(c(1,0,0,0,"cov1",0,0,0,0,1,0,0,0,"cov1",0,0,
                 0,0,1,0,0,0,0,1),4,6,dimnames=list(c("mean_1","mean_2","sd_1","sd_2"),
                 c("mean_1:(Intercept)","mean_1:cov1","mean_2:(Intercept)","mean_2:cov1",
                 "sd_1:(Intercept)","sd_2:(Intercept)")))
Par0 <- getParDM(data,nbStates,list(step=stepDist,angle=angleDist),
                 Par=list(step=stepPar,angle=anglePar),
                 DM=list(step=stepDMm,angle=DMm$angle))
mDMmcov <- fitHMM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
              Par0=Par0,
              formula=formula,
              DM=list(step=stepDMm,angle=DMm$angle))

print(mDMmcov)

### 6. fit circular-circular angle mean covariate model to the simulated data using DM

# Generate fake circular covariate using circAngles
data$cov3 <- circAngles(refAngle=2*atan(rnorm(nrow(data))),data)

# Fit circular-circular regression model for angle mean
# Note no intercepts are estimated for angle means because these are by default
# the previous movement direction (i.e., a turning angle of zero)
mDMcircf <- fitHMM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
                 Par0=list(step=stepPar,angle=c(0,0,Par0$angle)),
                  formula=formula,
                  estAngleMean=list(angle=TRUE),
                  circularAngleMean=list(angle=TRUE),
                  DM=list(angle=list(mean=~cov3,concentration=~1)))
                  
print(mDMcircf)
                  
### 7. fit the exact same circular-circular angle mean model using DM matrices

# Note no intercept terms are included in DM for angle means because the intercept is
# by default the previous movement direction (i.e., a turning angle of zero)
mDMcircm <- fitHMM(data=data,nbStates=nbStates,dist=list(step=stepDist,angle=angleDist),
                 Par0=list(step=stepPar,angle=c(0,0,Par0$angle)),
                  formula=formula,
                  estAngleMean=list(angle=TRUE),
                  circularAngleMean=list(angle=TRUE),
                  DM=list(angle=matrix(c("cov3",0,0,0,0,"cov3",0,0,0,0,1,0,0,0,0,1),4,4)))
                  
print(mDMcircm)

### 8. Cosinor and state-dependent formulas
nbStates<-2
dist<-list(step="gamma")
Par<-list(step=c(100,1000,50,100))

# include 24-hour cycle on all transition probabilities
# include 12-hour cycle on transitions from state 2
formula=~cosinor(hour24,24)+state2(cosinor(hour12,12))

# specify appropriate covariates
covs<-data.frame(hour24=0:23,hour12=0:11)

beta<-matrix(c(-1.5,1,1,NA,NA,-1.5,-1,-1,1,1),5,2)
# row names for beta not required but can be helpful
rownames(beta)<-c("(Intercept)",
                  "cosinorCos(hour24, 24)",
                  "cosinorSin(hour24, 24)",
                  "cosinorCos(hour12, 12)",
                  "cosinorSin(hour12, 12)")
data.cos<-simData(nbStates=nbStates,dist=dist,Par=Par,
                      beta=beta,formula=formula,covs=covs)    

m.cosinor<-fitHMM(data.cos,nbStates=nbStates,dist=dist,Par0=Par,formula=formula)
m.cosinor    

### 9. Piecewise constant B-spline on step length mean and angle concentration
nObs <- 1000 # length of simulated track
cov <- data.frame(time=1:nObs) # time covariate for splines
dist <- list(step="gamma",angle="vm")
stepDM <- list(mean=~bSpline(time,df=3,degree=0),sd=~1)
angleDM <- list(mean=~1,concentration=~bSpline(time,df=3,degree=0))
DM <- list(step=stepDM,angle=angleDM)
Par <- list(step=c(log(1000),1,-1,log(100)),angle=c(0,log(10),2,-5))

data.spline<-simData(obsPerAnimal=nObs,nbStates=1,dist=dist,Par=Par,DM=DM,covs=cov) 

Par0 <- list(step=Par$step,angle=Par$angle[-1])
m.spline<-fitHMM(data.spline,nbStates=1,dist=dist,Par0=Par0,
                 DM=list(step=stepDM,
                         angle=list(concentration=~bSpline(time,df=3,degree=0))))  
  
### 10. Initial state (delta) based on covariate                       
nObs <- 100
dist <- list(step="gamma",angle="vm")
Par <- list(step=c(100,1000,50,100),angle=c(0,0,0.01,0.75))

# create sex covariate
cov <- data.frame(sex=factor(rep(c("F","M"),each=nObs))) # sex covariate
formulaDelta <- ~ sex + 0

# Female begins in state 1, male begins in state 2
delta <- matrix(c(-100,100),2,1,dimnames=list(c("sexF","sexM"),"state 2")) 

data.delta<-simData(nbAnimals=2,obsPerAnimal=nObs,nbStates=2,dist=dist,Par=Par,
                    delta=delta,formulaDelta=formulaDelta,covs=cov) 
       
Par0 <- list(step=Par$step, angle=Par$angle[3:4])   
m.delta <- fitHMM(data.delta, nbStates=2, dist=dist, Par0 = Par0, 
                  formulaDelta=formulaDelta)
# }
# NOT RUN {
# }

Run the code above in your browser using DataLab