The closedpMS.t
function fits every possible hierarchical loglinear models for a given
closed population capture-recapture data set, under the constraints set by the given maxorder
and forced
arguments. Parameters for heterogeneity in capture probabilites among units can be added to the models.
The getAllModels
function lists every possible hierarchical loglinear models for a certain number of capture occasions t
, under the constraints set by the given maxorder
and forced
arguments.
closedpMS.t(X, dfreq = FALSE, h = NULL, h.control = list(),
maxorder = t - 1, forced = 1:t, stopiflong = TRUE, …)
# S3 method for closedpMS
print(x, …)# S3 method for closedpMS
plot(x, main="Models comparison based on BIC", omitOutliers = TRUE, …)
getAllModels(t, maxorder = t - 1, forced = 1:t, stopiflong = TRUE)
The matrix of the observed capture histories (see Rcapture-package
for a
description of the accepted formats).
A logical. By default FALSE
, which means that X
has one row per unit.
If TRUE
, it indicates that the matrix X
contains frequencies in its last column.
A character string ("LB", "Chao", "Poisson", "Darroch", "Gamma" or "Normal") or a
numerical R
function specifying the form of the column(s) for heterogeneity
in the design matrix.
"LB" and "Chao" represents Chao's lower bound model (the default),
"Poisson" represents the function \(f(k)=theta^k-1\),
"Darroch" represents the function \(f(k)=k^2/2\),
and "Gamma" represents the function \(f(k)=-\log(theta + k) + \log(theta)\)
where \(k\) is the number of captures and \(theta\) is a parameter specified with
the h.control
element theta
.
"Normal" represents heterogeneity
modeled with a normal mixing distribution, as presented in Rivest (2011).
If an R
function is given, it must be the implementation of any convex
mathematical function \(f(k)\) (it must have only one argument).
A list of elements to control the heterogeneous part of the model, if any. For a Poisson or Gamma heterogeneous model:
theta
The value of the parameter \(theta\) in \(f(k)=theta^k-1\) for the Poisson model (the default value is 2) and in \(f(k)=-\log(theta + k) + \log(theta)\) for the Gamma model (the default value is 3.5).
For a Chao's lower bound heterogeneous model:
neg
If this option is set to TRUE
(the default), negative eta parameters in
Chao's lower bound models are set to zero.
For a Normal heterogeneous model:
initsig
Initial value for the sigma parameter of the normal mixing distribution.
The default is 0.2
.
method
The method to be used by optim
. The default is "BFGS"
.
A numeric specifying the higher order accepted for the terms in the models.
It can take a value between 1 and t - 1
(the default).
A vector of the terms forced in the model (by default every
first order terms). NULL
if no terms are forced in the model.
A logical indicating whether the function execution should be stopped
when the number of possibles models is larger than 10 000, that is when
the function might be long to run (defaut TRUE
).
Further arguments to be passed to glm
, optim
or print.default
.
An object, produced by the closedpMS.t
function, to print or to plot.
A main title for the plot.
A logical. If TRUE (the default), models with an outlier abundance or BIC value are removed from the plot. A value is considered an outlier if it is smaller than the first quartile minus 1.5 times the interquartile range, or larger than the third quartile plus 1.5 times the interquartile range.
A numeric specifying the number of capture occasions in the experiments.
It is deduced from the data set in the closedpMS.t
function.
It must be an integer between 1 and 9 inclusively.
getAllModels
returns a caracter vector with the models names.
closedpMS.t
returns a list
with the following elements:
The number of captured units
The total number of capture occasions in the data matrix X
.
When captures occur in continuous time (input argument t=Inf
),
this output value is the larger number of captures observed.
A table containing, for every fitted model:
abundance
the estimated population size,
stderr
the standard error of the estimated population size,
deviance
the model's deviance,
df
the number of degrees of freedom,
AIC
the Akaike's information criterion,
BIC
the bayesian information criterion,
infoFit
a numerical code giving information about error or warnings
encountered when fitting the model (see Rcapture-package
for details).
A vector, the asymptotic bias of the estimated population size for every fitted model.
A list of character string vectors. If an error occurs
while fitting a model (with glm
or optim
), a copy of the
error message is stored in fit.err$mname
where mname
is the
name of the hierarchical loglinear model. A NULL
list element means
that no error occured for the considered model.
A list of character string vectors. If warnings are generated
while fitting a model (with glm
or optim
), a copy of these
warnings are stored in fit.warn$mname
where mname
is a the
name of the hierarchical model. A NULL
list element means
that no warnings were generated for the considered model.
For Chao's lower bound models only: the position of the eta parameters set to zero in the loglinear parameter vector, if any.
HIERARCHICAL LOGLINEAR MODEL NAME SYNTAX
First, a model's term is written using numbers between 1 and 9 to represent the capture occasions it includes (ex.: 134
represents the three-way interaction c1:c3:c4
). This syntax limits the maximal number of capture occation to 9. This is not a problem since from 6 capture occasions upwards, the number of hierarchical models becomes very large and difficult to manage.
A hierarchical model name is a list of the model's terms at the top of the hierarchies in the model. These terms are separated by commas, without spaces. They are surronded by brackets. For example, "[123,34,5]"
is the name of the model
~ 1 + c1 + c2 + c3 + c4 + c5 + c1:c2 + c1:c3 + c2:c3 + c3:c4 + c1:c2:c3
.
Baillargeon, S. and Rivest, L.P. (2007) Rcapture: Loglinear models for capture-recapture in R. Journal of Statistical Software, 19(5), http://www.jstatsoft.org/v19/i05.
Chao, A. (1987) Estimating the population size for capture-recapture data with unequal catchability. Biometrics, 43(4), 783--791.
Cormack, R. M. (1992) Interval estimation for mark-recapture studies of closed populations. Biometrics, 48, 567--576.
Rivest, L.P. (2011) A lower bound model for multiple record systems estimation with heterogeneous catchability. The International Journal of Biostatistics, 7(1), Article 23.
Rivest, L.P. and Baillargeon, S. (2007) Applications and extensions of Chao's moment estimator for the size of a closed population. Biometrics, 63(4), 999--1006.
# NOT RUN {
# The lesbian data set contains 4 capture occasions.
# By default, closedpMB.t fits the 113 following models:
getAllModels(4)
closedpMS.t(lesbian, dfreq = TRUE)
# We could reduce the number of models by omitting
# those with triple interactions.
closedpMS.t(lesbian, dfreq = TRUE, maxorder = 2)
# Models with heterogeneity fits better.
Darr <- closedpMS.t(lesbian, dfreq = TRUE, h = "Darroch")
Darr
# The plot method allows the visualization of the results
# from models fitted by closedpMS.t().
plot(Darr)
# According to the BIC, the best heterogeneous Darroch model
# for this data set contains the double interactions 12, 13, 14.
# Here is the profile likelihood confidence interval for the
# abundance estimation from this model.
closedpCI.t(lesbian, dfreq = TRUE, mX = "[12,13,14]", h = "Darroch")
# }
Run the code above in your browser using DataLab