F.cjs.estim: F.cjs.estim - Cormack-Jolly-Seber estimation

Description

Estimates Cormack-Jolly-Seber (CJS) capture-recapture models with individual, time, and individual-time varying covariates using the "regression" parametrization of Amstrup et al (2006, Ch 9). For live recaptures only. Losses on capture allowed. Uses a logistic link function to relate probability of capture and survival to external covariates.

Usage

F.cjs.estim(capture, survival, histories, cap.init, sur.init, group,  nhat.v.meth = 1, 
c.hat = -1, df = NA, intervals=rep(1,ncol(histories)-1), conf=0.95, link="logit", control=mra.control())

Arguments

capture

Formula specifying the capture probability model. Must be a formula object with no response. I.e., "~" followed by the names of 2-D arrays of covariates to fit in the capture model. For example: 'capture = ~ age + sex', where age and sex are matri

survival

Formula specifying the survival probability model. Must be a formula object with no response. I.e., "~" followed by the names of 2-D arrays of covariates to fit in the survival model. For example: 'survival = ~ year + ageclass' where year and agecl

histories

A NAN X NS = (number of animals) X (number of capture occasions) matrix containing capture histories. Capture histories are comprised of 0's, 1',s and 2's. 0 in cell (i,j) means animal i was not captured on occasion j, 1 in cell (i,j) means animal

cap.init

(optional) Vector of initial values for coefficients in the capture model. One element per covariate in capture. The default value usually works.

sur.init

(optional) Vector or initial values for coefficients in the survival model. One element per covariate in survival. The default value usually works.

group

(optional) A vector of length NAN giving the (non-changing) group membership of every captured animal (e.g., sex). Group is used only for computing TEST 2 and TEST 3. TEST 2 and TEST 3 are computed separately for each group. E.g., if group=sex, T

nhat.v.meth

Integer specifying method for computing variance estimates of population size estimates. nhat.v.meth = 1 uses the variance estimator of Taylor et al. 2002, Ursus, p. 188 which is the so-called Huggins variance estimator, and incorporate

c.hat

External (override) estimate of variance inflation factor (c.hat) to use during estimation. If input value of c.hat is

External (override) model degrees of freedom to use during estimation. If df == NA, the number of parameters is estimated from the rank of the matrix of 2nd derivatives or Hessian, depending on cov.meth parameter. If <

intervals

Time intervals. This is a vector of length ncol(histories)-1 (i.e., number of capture occasions minus 1) specifying relative time intervals between occasions. For example, if capture occasions occurred in 1999, 2000, 2005, and 2007

conf

Confidence level for the confidence intervals placed around estimates of population size. Default 95% confidence.

link

The link function to be used. The link function converts linear predictors in the range (-infinity, infinity) to probabilities in the range (0,1). Valid values for the link function are "logit" (default), "sine", and "hazard". (see Examples

control

A list containing named control parameters for the minimization and estimation process. Control parameters include number of iterations, covariance estimation method, etc. Although the default values work in the vast majority of cases, change

Value

An object (list) of class c("cjs","cr") with many components. Use print.cr to print it nicely. Use names(fit), where the call was fit <- F.cr.estim(...), to see names of all returned components. To see values of individual components, issue commands like fit$s.hat, fit$se.s.hat, fit$n.hat, etc. Components of the returned object are as follows:
historiesThe input capture history matrix.
aux
Auxiliary information about the fit, mostly stored input values. This is a list containing the following components:
- call = original call
- nan = number of animals
- ns = number of samples = number of capture occasions
- nx = number of coefficients in capture model
- ny = number of coefficients in survival model
- cov.name = names of all coefficients
- ic.name = name of capture history matrix.
- mra.version = version number of MRA package used to estimate the model
- R.version = R version used for during estimation
- run.date = date the model was estimated.
loglikMaximized CJS likelihood value for the model
devianceModel deviance = -2*loglik. This is relative deviance, see help for F.sat.lik.
aicAIC for the model = deviance + 2*(df). df is either the estimated number of independent parameters (by default), or NX+NY, or a specified value, depending on the input value of DF parameter.
qaicQAIC (quasi-AIC) = (deviance / vif) + 2(df)
aiccAIC with small sample correction = AIC + (2*df*(df+1)) / (nan - df - 1)
qaiccQAIC with small sample correction = QAIC + (2*df*(df+1))/(nan - df - 1)
vifVariance inflation factor used = estimate of c.hat = chisq.vif / chisq.df
chisq.vifComposite Chi-square statistic from Test 2 and Test 3 used to compute vif, based on pooling rules.
vif.dfDegrees of freedom for composite chi-square statistic from Test 2 and Test 3, based on pooling rules.
parametersVector of all coefficient estimates, NX capture probability coefficients first, then NY survival coefficients. This vector is length NX+NY regardless of estimated DF.
se.paramStandard error estimates for all coefficients. Length NX+NY.
capcoefVector of coefficients in the capture model. Length NX.
se.capcoefVector of standard errors for coefficients in capture model. Length NX.
surcoefVector of coefficients in the survival model. Length NY.
se.surcoefVector of standard errors for coefficients in survival model. Length NY.
covarianceVariance-covariance matrix for the estimated model coefficients. Size (NX+NY) X (NX+NY).
p.hatMatrix of estimated capture probabilities computed from the model. One for each animal each occasion. Cell (i,j) is estimated capture probability for animal i during capture occasion j. Size NAN X NS. First column corresponding to first capture probability is NA because cannot estimate P1 in a CJS model.
se.p.hatMatrix of standard errors for estimated capture probabilities. One for each animal each occasion. Size NAN X NS. First column is NA.
s.hatMatrix of estimated survival probabilities computed from the model. One for each animal each occasion. Size NAN X NS. Cell (i,j) is estimated probability animal i survives from occasion j to j+1. There are only NS-1 intervals between occasions. Last column corresponding to survival between occasion NS and NS+1 is NA.
se.s.hatMatrix of standard errors for estimated survival probabilities. Size NAN X NS. Last column is NA.
dfNumber of estimable parameters in the model. df is either the estimated number of independent parameters (by default) based on rank of the variance matrix, or NX+NY, or a specified value, depending on the input value of DF parameter.
controlA list containing the input maximization and estimation control parameters.
messageA vector of strings interpreting various codes about the estimation. The messages interpret, in this order, the codes for (1) maximization algorithm used, (2) exit code from the maximization algorithm (interprets exit.code), and (3) covariance matrix code (interprets cov.code).
exit.code
Exit code from the maximization routine. Interpretation for exit.code is in message. Exit codes are as follows:
- exit.code = 0: FAILURE: Initial Hessian not positive definite.
- exit.code = 1: SUCCESS: Convergence criterion met.
- exit.code = 2: FAILURE: G'dX > 0, rounding error.
- exit.code = 3: FAILURE: Likelihood evaluated too many times.
- exit.code =-1: FAILURE: Unknown optimization algorithm."
cov.codeA code indicating the method used to compute the covariance matrix.
fn.evalsThe number of times the likelihood was evaluated prior to exit from the minimization routine. If exit.code = 3, fn.evals equals the maximum set in mra.control. This, in combination with the exit codes and execution time, can help detect non-convergence or bad behavior.
ex.timeExecution time for the maximization routine, in minutes. This is returned for 2 reasons. First, this is useful for benchmarking. Second, in conjunction with exit.code, cov.code, and fn.evals, this could be used to detect ill-behaved or marginally unstable problems, if you know what you are doing. Assuming maxfn is set high in mra.control() (e.g., 1000), if exit.code = 1 but the model takes a long time to execute relative to similarly sized problems, it could indicate unstable or marginally ill-behaved models.
n.hatVector of Horvitz-Thompson estimates of population size. The Horvitz-Thompson estimator of size is, $$\hat{N}_{ij} = \sum_{i=1}^{NAN} \frac{h_{ij}}{\hat{p}_{ij}}$$ Length of n.hat = NS. No estimate for first occasion.
se.n.hatEstimated standard errors for n.hat estimates. Computed using method specified in nhat.v.meth.
n.hat.lowerLower limit of n.hat.conf percent on n.hat. Length NS.
n.hat.upperUpper limit of n.hat.conf percent on n.hat. Length NS.
n.hat.confConfidence level of intervals on n.hat
nhat.v.methCode for method used to compute variance of n.hat
num.caughtVector of observed number of animals captured each occasion. Length NS.
fittedMatrix of fitted values for the capture histories. Size NAN X NS. Cell (i,j) is expected value of capture indicator in cell (i,j) of histories matrix.
residualsMatrix of Pearson residuals defined as, $$r_{ij} = \frac{(h_{ij} - \Psi_{ij})^2}{\Psi_{ij}}$$, where $\Psi_{ij}$ is the expected (or fitted) value for cell (i,j) and $h_{ij}$ is the capture indicator for animal i at occasion j. This matrix has size NAN X NS. See parts pertaining to the "overall test" in documentation for F.cjs.gof for a description of $\Psi_{ij}$.
resid.typeString describing the type of residuals computed. Currently, only Pearson residuals are returned.

Details

This is the work-horse routine for estimating CJS models. It compiles all the covariate matrices, then calls a Fortran routine to maximize the CJS likelihood and perform goodness-of-fit tests. Horvitz-Thompson-type population size estimates are also computed by default. A log file, named mra.log, is written to the current directory. This file contains additional details, such as individual Test 2 and Test 3 components, in a semi-friendly format. This file is overwritten each run. Model Specification: Both the capture and survival model can be specified as any combination of 2-d matrices (time and individual varying covariates), 1-d time varying vectors, 1-d individual varying vectors, 1-d time varying factors, and 1-d individual varying factors. Specification of time or individual varying effects uses the tvar (for 'time varying') and ivar (for 'individual varying') functions. These functions expand covariate vectors along the appropriate dimension to be 2-d matrices suitable for fitting in the model. ivar expands an individual varying vector to all occasions. tvar expands a time varying covariate to all individuals. To do the expansion, both tvar and ivar need to know the size of the 'other' dimension. Thus, tvar(x,100) specifies a 2-d matrix with size 100 by length(x). ivar(x,100) specifies a 2-d matrix with size length(x) by 100. For convenience, the 'other' dimension of time or individual varying covariates can be specified as an attribute of the vector. Assuming x is a NS vector and the 'nan' attribute of x has been set as attr(x,"nan") <- NAN, tvar(x,NAN) and tvar(x) are equivalent. Same, but vise-versa, for individual varying covariates (i.e., assign the number of occasions using attr(x,"ns")<-NS). This saves some typing in model specification. Factors are allowed in ivar and tvar. When a factor is specified, the contr.treatment coding is used. By default, an intercept is assumed and the first level of all factors are dropped from the model (i.e., first levels are the reference levels, the default R action). However, there are applications where more than one level will need to be dropped, and the user has control over this via the drop.levels argument to ivar and tvar. For example, tvar(x,drop.levels=c(1,2)) drops the first 2 levels of factor x. tvar(x,drop.levels=length(levels(x))) does the SAS thing and drops the last level of factor x. If drop.levels is outside the range [1,length(levels(x))] (e.g., negative or 0), no levels of the factor are dropped. If no intercept is fitted in the model, this results in the so-called cell means coding for factors. Example model specifications: Assume 'age' is a NAN x NS 2-d matrix of ages, 'effort' is a size NS 1-d vector of efforts, and 'sex' is a size NAN 1-d factor of sex designations ('M' and 'F').

capture= ~ 1 : constant effect over all individuals and time (intercept only model)
capture= ~ age : Intercept plus age
capture= ~ age + tvar(effort,NAN) : Intercept plus age plus effort
capture= ~ age + tvar(effort,NAN) + ivar(sex,NS) : Intercept plus age plus effort plus sex. Females (1st level) are the reference.
capture= ~ -1 + ivar(sex,NS,0) : sex as a factor, cell means coding
capture= ~ tvar(as.factor(1:ncol(histories)),nrow(histories),c(1,2)) : time varying effects

Values in 2-d Matrix Covariates: Even though covariate matrices are required to be NAN x NS (same size as capture histories), there are not that many parameters. The first capture probability cannot be estimated in CJS models, and the NS-th survival parameter does not exist. When a covariate matrix appears in the capture model, only values in columns 2:ncol(histories) are used. When a covariate matrix appears in the survival model, only values in columns 1:(ncol(histories)-1) are used. See examples for demonstration.

References

Taylor, M. K., J. Laake, H. D. Cluff, M. Ramsay, and F. Messier. 2002. Managing the risk from hunting for the Viscount Melville Sound polar bear population. Ursus 13:185-202. Manly, B. F. J., L. L. McDonald, and T. L. McDonald. 1999. The robustness of mark-recapture methods: a case study for the northern spotted owl. Journal of Agricultural, Biological, and Environmental Statistics 4:78-101. Huggins, R. M. 1989. On the statistical analysis of capture experiments. Biometrika 76:133-140. Amstrup, S. C., T. L. McDonald, and B. F. J. Manly (editors). 2005. Handbook of Capture-Recapture Analysis. Princeton University Press. Peterson. 1986. Statistics and Probability Letters. p.227. McDonald, T. L., and S. C. Amstrup. 2001. Estimation of population size using open capture-recapture models. Journal of Agricultural, Biological, and Environmental Statistics 6:206-220.

Examples

Run this code

## Fit CJS model to dipper data, time-varying capture and survivals.
## Method 1 : using factors
data(dipper.histories)
ct <- as.factor( paste("T",1:ncol(dipper.histories), sep=""))
attr(ct,"nan")<-nrow(dipper.histories)
dipper.cjs <- F.cjs.estim( ~tvar(ct,drop=c(1,2)), ~tvar(ct,drop=c(1,6,7)), dipper.histories )

## Method 2 : same thing using 2-d matrices
xy <- F.cjs.covars( nrow(dipper.histories), ncol(dipper.histories) )
for(j in 1:ncol(dipper.histories)){ assign(paste("x",j,sep=""), xy$x[,,j]) } # Extract 2-D matrices of 0s and 1s
dipper.cjs <- F.cjs.estim( ~x3+x4+x5+x6+x7, ~x2+x3+x4+x5, dipper.histories )

## Values in the 1st column of capture covariates do not matter
x3.a <- x3
x3.a[,1] <- 999
dipper.cjs2 <- F.cjs.estim( ~x3.a+x4+x5+x6+x7, ~x2+x3+x4+x5, dipper.histories )
# compare dipper.cjs2 to dipper.cjs

## Values in the last column of survival covariates do not matter
x3.a <- x3
x3.a[,ncol(dipper.histories)] <- 999
dipper.cjs2 <- F.cjs.estim( ~x3+x4+x5+x6+x7, ~x2+x3.a+x4+x5, dipper.histories )
# compare dipper.cjs2 to dipper.cjs


## A plot to compare the link functions
sine.link <- function(eta){ ifelse( eta < -4, 0, ifelse( eta > 4, 1, .5*(1+sin(eta*pi/8)))) }
eta <- seq(-5,5, length=40)
p1 <- 1 / (1 + exp(-eta))
p2 <- sine.link(eta)
p3 <- 1.0 - exp( -exp( eta ))
plot(eta, p1, type="l" )
lines(eta, p2, col="red" )
lines(eta, p3, col="blue" )
legend( "topleft", legend=c("logit", "sine", "hazard"), col=c("black", "red", "blue"), lty=1)

Run the code above in your browser using DataLab