closedpCI: Customization of a Loglinear Model and Confidence Interval for Abundance Estimation in Closed Population Capture-Recapture Experiments

Description

The closedpCI.t and closedpCI.0 functions fit a loglinear model specified by the user and compute a confidence interval for the abundance estimate. For a normal heterogeneous model, a log-transformed confidence interval (Chao 1987) is produced. For any other model, the multinomial profile likelihood confidence interval (Cormack 1992) is produced. The model is identified with the argument m or mX. For heterogeneous models, the form of the heterogeneity is specified with the arguments h and h.control. If h is given with mX, heterogeneity is added to the model defined by mX. These functions extend closedp.t and closedp.0: they broaden the range of models one can fit and they compute confidence intervals. Unlike the closedp functions, they fit only one model at a time.
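
As an illustration of the log-transformed interval mentioned above, the following base-R sketch applies the usual form of Chao's (1987) log transformation to a point estimate. The function name log_transformed_ci and its arguments N_hat, n and se are hypothetical, the input numbers are made up, and the code is not taken from Rcapture, whose internal computation may differ in detail.

```r
# Sketch of a log-transformed confidence interval for abundance (Chao 1987).
# Assumption: the number of uncaptured units, N_hat - n, is treated as
# approximately lognormal, so the lower bound can never fall below n.
log_transformed_ci <- function(N_hat, n, se, alpha = 0.05) {
  stopifnot(N_hat > n, se > 0, alpha > 0, alpha < 1)
  z  <- qnorm(1 - alpha / 2)               # standard normal quantile
  f0 <- N_hat - n                          # estimated number of uncaptured units
  C  <- exp(z * sqrt(log(1 + se^2 / f0^2)))
  c(estimate = N_hat, lower = n + f0 / C, upper = n + f0 * C)
}

# Hypothetical values: 60 units captured, estimate 100, standard error 10
ci <- log_transformed_ci(N_hat = 100, n = 60, se = 10)
print(round(ci, 1))
```

Because the transformation acts on N_hat - n, the interval is asymmetric around the estimate and its lower bound always exceeds the number of captured units.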

Usage

closedpCI.t(X, dfreq=FALSE, m=c("M0","Mt","Mh","Mth"), mX=NULL,
            h=NULL, h.control=list(), mname=NULL, alpha=0.05, 
            fmaxSupCL=3, ...)

closedpCI.0(X, dfreq=FALSE, dtype=c("hist","nbcap"), t=NULL, t0=NULL, 
            m=c("M0","Mh"), mX=NULL, h=NULL, h.control=list(), 
            mname=NULL, alpha=0.05, fmaxSupCL=3, ...)
			
## S3 method for class 'closedpCI':
print(x, ...)

## S3 method for class 'closedpCI':
boxplot(x, main="Boxplots of Pearson Residuals", ...)

## S3 method for class 'closedpCI':
plot(x, main="Scatterplot of Pearson Residuals", ...)

plotCI(x.closedpCI, main="Profile Likelihood Confidence Interval", ...)

Arguments

X

The matrix of the observed capture histories (see Rcapture-package for a description of the accepted formats).

dfreq

A logical. By default FALSE, which means that X has one row per unit. If TRUE, it indicates that the matrix X contains frequencies in its last column.

dtype

A character string, either "hist" or "nbcap", to specify the type of data. "hist", the default, means that X contains complete observed capture histories. "nbca

t

Required only if dtype="nbcap". A numeric specifying the total number of capture occasions in the experiment. For closedpCI.0, the value t=Inf is accepted. It indicates that captures occur

t0

A numeric. Models are fitted considering only the frequencies of units captured 1 to t0 times. By default, if t is not equal to Inf, t0=t. When t=Inf, the default value

m

A character string indicating the model to fit. For closedpCI.0 it can be either "M0" (M0 model) or "Mh" (Mh model). For closedpCI.t it can also be "Mt" (Mt model) or "Mth" (Mth model).

mX

The design matrix of the loglinear model. By default, the design matrix is built based on the m argument. If an mX argument is given, it must be a matrix (or an object that can be coerced to a matrix

h

A character string ("Chao", "LB", "Poisson", "Darroch", "Gamma" or "Normal") or a numerical R function specifying the form of the column(s) for heterogeneity in the design matrix. "Chao" and "LB" represent Chao's

h.control

A list of elements to control the heterogeneous part of the model, if any (see Details).

mname

A character string specifying the name of the customized model. By default, it is derived from the arguments specifying the model.

alpha

A confidence interval with confidence level 1-alpha is constructed. The value of alpha must be between 0 and 1; the default is 0.05.

fmaxSupCL

A numeric. The upper end point of the interval to be searched by uniroot to find the upper bound of the multinomial profile likelihood confidence interval (Cormack 1992) is defined by fmaxSupCL<

x

An object, produced by a closedpCI function, to print or to plot.

x.closedpCI

An object, produced by a closedpCI function, to produce a plot of the multinomial profile likelihood for $N$.

main

A main title for the plot.

...

Further arguments to be passed to optim, print.default, plot.default or boxplot.default.

Value

n

The number of captured units.

t

The number of capture occasions in the data matrix X.

t0

For closedpCI.0 only: the value of the argument t0 used in the computations.

results

A table containing the estimated population size and its standard error, the deviance, the number of degrees of freedom and Akaike's information criterion. If the name of the model is followed by ** in this table, the model did not converge, and the estimated population size for this model is therefore questionable.

fit

The 'glm' object obtained from fitting the model, except for normal heterogeneous models (h="Normal"). These models are not fitted with glm.fit. For them, fit is a list with the following elements:

parameters: The matrix of parameter estimates (loglinear coefficients + sigma parameter) with their standard errors.
varcov: The estimated variance-covariance matrix of the estimated parameters.
y: The y vector used to fit the model.
fitted.values: The model fitted values.
initparam: The initial values for the parameters (loglinear coefficients + sigma parameter) used by optim.
optim.out: The output produced by optim.

glm.warn

Not produced for normal heterogeneous models (h="Normal"): A vector of character strings. If the glm.fit function generates one or more warnings when fitting the model, a copy of these warnings is stored in glm.warn. NULL if glm.fit did not produce any warnings.

neg.eta

For Chao's lower bound models only: the position of the eta parameters set to zero in the loglinear parameter vector, if any. If not NULL, the deviance and degrees of freedom of the fitted Chao's lower bound model cannot be used to conduct a likelihood ratio test to investigate whether a particular heterogeneous model is adequate (see Details).

CI

Not produced for normal heterogeneous models (h="Normal"): A table containing the abundance estimate and its multinomial profile likelihood confidence interval.

alpha

One minus the confidence level of the interval.

NCI

Not produced for normal heterogeneous models (h="Normal"): The x-coordinates for plot.closedpCI.t.

loglikCI

Not produced for normal heterogeneous models (h="Normal"): The y-coordinates for plot.closedpCI.t.

Details

The closedpCI.t function fits models using the frequencies of the observable capture histories (vector of size $2^t-1$), whereas closedpCI.0 uses the number of units captured $i$ times, for $i=1,\ldots,t$ (vector of size $t$). Thus, closedpCI.0 can be used with larger data sets than closedpCI.t, but it cannot fit models with a temporal effect.

The abundance estimate is calculated as the number of captured units plus the exponential of the Poisson regression intercept.

These functions do not work for closed population models featuring a behavioral effect, such as Mb and Mbh. However, models with a behavioral effect can be fitted with closedp.t (Mb and Mbh), closedp.Mtb and closedp.bc.

CHAO'S LOWER BOUND MODELS

Chao's lower bound models estimate a lower bound for the abundance. Rivest (2011) presents a generalized loglinear model underlying this estimator. To test whether a certain model for heterogeneity is adequate, one can conduct a likelihood ratio test by subtracting the deviance of a Chao's lower bound model from the deviance of the heterogeneous model under study. The two models should have the same mX argument. Under the null hypothesis of equivalence between the two models, the difference of deviances follows a chi-square distribution with degrees of freedom equal to the difference between the models' degrees of freedom.

A Chao's lower bound model contains $t-2$ parameters, called eta parameters, for the heterogeneity. These parameters should theoretically be greater than or equal to zero (see Rivest and Baillargeon (2007)). When the element neg of the argument h.control is set to TRUE (the default), negative eta parameters are set to zero (to do so, columns are removed from the design matrix of the model). Consequently, heterogeneous models are no longer particular cases of Chao's lower bound model, and likelihood ratio tests can no longer be conducted to test whether a chosen heterogeneous model is adequate. Also, the degrees of freedom of Chao's model increase when eta parameters are set to zero.

ARGUMENT mX: formula specification

For the closedpCI.t function, mX can be an object of class "formula". The only accepted variables in this formula are c1 to ct. The variable ci represents a capture indicator (1 for a capture, 0 otherwise) for the $i$th capture occasion. Also, the formula must not contain a response variable since it is only used to construct the design matrix of the model. For example, if t=3, the Mt model is fitted if mX = ~ . or mX = ~ c1 + c2 + c3. The symbol . in this formula is a shortcut for c1 + c2 + ... + ct. Formula mX arguments facilitate the addition of interactions between capture occasions in the model. For example, if t=3, the Mt model with an interaction between the first and the second capture occasion is fitted if mX = ~ . + c1:c2. See formula for more details of allowed formulae.

ARGUMENT h.control

The h.control argument is a list that can supply any of the following elements to control the heterogeneous part of the model, if any.

For a Poisson or Gamma heterogeneous model: theta (set in the Examples with h.control = list(theta = 2) for a Poisson model and h.control = list(theta = 3.5) for a Gamma model).
For a Chao's lower bound heterogeneous model: neg (see CHAO'S LOWER BOUND MODELS above).
For a Normal heterogeneous model: three elements (not listed here).

PLOT METHODS AND FUNCTIONS

The boxplot.closedpCI function produces a boxplot of the Pearson residuals of the customized model. The plot.closedpCI function produces the scatterplot of the Pearson residuals in terms of $f_i$ (number of units captured $i$ times) for the customized model. The plotCI function produces a plot of the multinomial profile likelihood for $N$. The value of $N$ maximizing the profile likelihood and the bounds of the confidence interval are identified on this plot.
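
The likelihood ratio test described for Chao's lower bound models can be sketched in base R as follows. The helper lrt_lower_bound and the deviance and degrees-of-freedom values are hypothetical. In practice these numbers would be read from the results tables of the two fits, which must share the same mX and for which neg.eta must be NULL. This illustrates the arithmetic only and is not code from the package.

```r
# Likelihood ratio test of a heterogeneous model against Chao's lower bound model.
# dev_h, df_h:   deviance and degrees of freedom of the heterogeneous model
# dev_lb, df_lb: deviance and degrees of freedom of Chao's lower bound model
lrt_lower_bound <- function(dev_h, df_h, dev_lb, df_lb) {
  # The heterogeneous model is the more restricted one, so it has more
  # residual degrees of freedom than the lower bound model.
  stopifnot(df_h > df_lb)
  stat <- dev_h - dev_lb                   # difference of deviances
  df   <- df_h - df_lb                     # difference of degrees of freedom
  c(statistic = stat, df = df,
    p.value = pchisq(stat, df = df, lower.tail = FALSE))
}

# Made-up deviances and degrees of freedom, for illustration only
lrt <- lrt_lower_bound(dev_h = 12.5, df_h = 5, dev_lb = 7.2, df_lb = 2)
print(lrt)
```

A small p-value would suggest that the chosen form of heterogeneity fits noticeably worse than the lower bound model.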

References

Baillargeon, S. and Rivest, L.P. (2007) Rcapture: Loglinear models for capture-recapture in R. Journal of Statistical Software, 19(5), http://www.jstatsoft.org/v19/i05.

Chao, A. (1987) Estimating the population size for capture-recapture data with unequal catchability. Biometrics, 43(4), 783-791.

Cormack, R. M. (1992) Interval estimation for mark-recapture studies of closed populations. Biometrics, 48, 567-576.

Rivest, L.P. (2011) A lower bound model for multiple record systems estimation with heterogeneous catchability. The International Journal of Biostatistics, 7(1), Article 23.

Rivest, L.P. and Baillargeon, S. (2007) Applications and extensions of Chao's moment estimator for the size of a closed population. Biometrics, 63(4), 999-1006.

Examples

library(Rcapture)

### Mth model with Poisson heterogeneity, and its profile likelihood plot
data(hare)
CI <- closedpCI.t(hare, m = "Mth", h = "Poisson", h.control = list(theta = 2))
CI
plotCI(CI)

data(HIV)
# Mt design matrix plus a column for the interaction between occasions 1 and 2
mat <- histpos.t(4)
mX2 <- cbind(mat, mat[, 1] * mat[, 2])
closedpCI.t(HIV, dfreq = TRUE, mX = mX2, mname = "Mt interaction 1,2")
# which can be obtained more conveniently with
closedpCI.t(HIV, dfreq = TRUE, mX = ~ . + c1:c2, mname = "Mt interaction 1,2")

data(BBS2001)
CI0 <- closedpCI.0(BBS2001, dfreq = TRUE, dtype = "nbcap", t = 50, t0 = 20,
                   m = "Mh", h = "Gamma", h.control = list(theta = 3.5))
CI0
plot(CI0)
plotCI(CI0)

### As an alternative to a gamma model, one can fit a negative Poisson model.
### It is appropriate in experiments where very small capture probabilities
### are likely. It can lead to very large estimators of abundance. 
data(mvole)
period3 <- mvole[, 11:15]
psi <- function(x) { 0.5^x - 1 }
closedpCI.t(period3, m = "Mh", h = psi)

### Example of normal heterogeneous models
### diabetes data of Bruno et al. (1994)
histpos <- histpos.t(4)
diabetes <- cbind(histpos, c(58,157,18,104,46,650,12,709,14,20,7,74,8,182,10))
# chosen interaction set I in Rivest (2011)
closedpCI.t(X=diabetes, dfreq=TRUE, mX= ~ . + c1:c3 + c2:c4 + c3:c4, 
            h="Normal", mname="Mth normal with I")

### Example of captures in continuous time
### Illegal immigrants data
data(ill)
closedpCI.0(ill, dtype="nbcap", dfreq=TRUE, t=Inf, m="Mh", h="LB")
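
To make the roles of the design matrix and of the Poisson regression intercept concrete, here is a base-R sketch that does not use Rcapture. It builds the observable capture histories for t = 2, fits the Mt loglinear model with glm, and computes the abundance as the number of captured units plus the exponential of the intercept, as stated in Details. The frequencies are made up and the object names (histories, dat, N_hat) are illustrative only.

```r
# Observable capture histories for t = 2 occasions (history 00 is unobservable)
histories <- expand.grid(c1 = 1:0, c2 = 1:0)
histories <- histories[rowSums(histories) > 0, ]   # rows: 11, 01, 10

# Made-up frequencies for the histories 11, 01 and 10 (not real data)
dat <- cbind(histories, freq = c(20, 40, 30))

# Mt model: one main effect per capture occasion, analogous to mX = ~ c1 + c2
fit <- glm(freq ~ c1 + c2, family = poisson, data = dat)

# Abundance = number of captured units + exp(Poisson regression intercept)
n     <- sum(dat$freq)
N_hat <- n + exp(coef(fit)[["(Intercept)"]])
print(N_hat)

# Check: with two occasions the Mt model reproduces the Lincoln-Petersen
# estimate n1 * n2 / m, where m is the number of units captured twice.
n1 <- 20 + 30
n2 <- 20 + 40
print(n1 * n2 / 20)
```

The exponential of the intercept is the fitted frequency of the unobserved history (no capture at any occasion), which is why adding it to the number of captured units gives the abundance.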
