pkm: Fit pk searcher efficiency models.

Description

Searcher efficiency is modeled as a function of the number of times a carcass has been missed in previous searches and any number of covariates. Format and usage parallel that of common R functions lm, glm, and gam. However, the input data (data) is structured differently to accommodate the multiple-search searcher efficiency trials (see Details), and model formulas may be entered for both p (akin to an intercept) and k (akin to a slope).

Usage

pkm(formula_p, formula_k = NULL, data, obsCol = NULL, kFixed = NULL,
  allCombos = FALSE, sizeCol = NULL, CL = 0.9, kInit = 0.7,
  quiet = FALSE, ...)
pkm0(formula_p, formula_k = NULL, data, obsCol = NULL, kFixed = NULL,
  kInit = 0.7, CL = 0.9, quiet = FALSE)
pkmSet(formula_p, formula_k = NULL, data, obsCol = NULL,
  kFixed = NULL, kInit = 0.7, CL = 0.9, quiet = FALSE)
pkmSize(formula_p, formula_k = NULL, data, kFixed = NULL,
  obsCol = NULL, sizeCol = NULL, allCombos = FALSE, kInit = 0.7,
  CL = 0.9, quiet = FALSE)

Arguments

formula_p

Formula for p; an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. Details of model specification are given under "Details".

formula_k

Formula for k; an object of class formula (or one that can be coerced to that class): a symbolic description of the model to be fitted. Details of model specification are given under "Details".

data

Data frame with results from searcher efficiency trials and any covariates included in formula_p or formula_k (required).

obsCol

Vector of names of columns in data where results for each search occasion are stored (optional). If obsCol is not provided, pkm uses as obsCol all columns with names that begin with an "s" or "S" and end with a number, e.g., "s1", "s2", "s3", etc. This option is included as a convenience for the user, but care must be taken that other data are not stored in columns with names matching that pattern. Alternatively, obsCol may be entered as a vector of names, like c("s1", "s2", "s3"), paste0("s", 1:3), or c("initialSearch", "anotherSearch", "lastSearch"). The columns must be in chronological order, that is, it is assumed that the first column is for the first search after carcass arrival, the second column is for the second search, etc.
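The default column selection can be pictured with a short base R sketch. The regular expression below is an assumption chosen to match the description above (names that begin with "s" or "S" and end with a number); it is not necessarily the exact pattern pkm uses internally, and the column names are hypothetical.

```r
# Hypothetical column names from a trial data frame
cols <- c("pkID", "Season", "site2", "s1", "s2", "s3")

# Assumed pattern: starts with "s" or "S" and ends with a digit
auto_obs <- grep("^[sS].*[0-9]$", cols, value = TRUE)
print(auto_obs)

# "site2" matches the pattern even though it is not a search column,
# so naming the columns explicitly is the safer choice here
obsCol <- paste0("s", 1:3)
print(obsCol)
```

In this sketch the covariate site2 is picked up by the pattern, which is the situation the warning above describes; passing obsCol explicitly avoids it.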

kFixed

Parameter for user-specified k value (optional). If a value is provided, formula_k is ignored and the model is fit under the assumption that the k parameter is fixed and known to be kFixed $\in [0, 1]$. If a sizeCol is provided, kFixed may either be NULL, a single number in [0, 1], or a vector with kFixed values for one or more of the carcass size classes. For example, if there are three sizes (S, M, and L), kFixed could be c(S = 0.3, M = 0.8, L = 1.0) to assign fixed k values to each size. To fit k for size S and to assign values of 0.8 and 1.0 to sizes M and L, respectively, use kFixed = c(M = 0.8, L = 1.0). If there is more than one size class and kFixed is a scalar, then all size classes are assigned the same kFixed value (unless kFixed is named, e.g., kFixed = c(S = 0.5), in which case only the named size is assigned the kFixed).

allCombos

logical. If allCombos = FALSE, then the single model expressed by formula_p and formula_k is fit using a call to pkm0. If allCombos = TRUE, a full set of pkm submodels derived from combinations of the given covariates for p and k is fit. For example, submodels of formula_p = p ~ A * B would be p ~ A * B, p ~ A + B, p ~ A, p ~ B, and p ~ 1. Models for each pairing of a p submodel with a k submodel are fit via pkmSet, which fits each model combination using successive calls to pkm0, which fits a single model.
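As an illustration of how the submodel set grows, the base R sketch below lists the p submodels given above and pairs them with the submodels of a hypothetical k formula, k ~ A (taken here, by analogy, to have submodels k ~ A and k ~ 1). It only enumerates formulas as text; it does not call pkm or reproduce its internal code.

```r
# Terms of the full p model: A * B expands to A + B + A:B
full_p <- p ~ A * B
print(attr(terms(full_p), "term.labels"))

# Submodels of p ~ A * B as listed in the text above
p_sub <- c("p ~ A * B", "p ~ A + B", "p ~ A", "p ~ B", "p ~ 1")

# Assumed submodels of a hypothetical k formula, k ~ A
k_sub <- c("k ~ A", "k ~ 1")

# Every pairing of a p submodel with a k submodel
pairings <- expand.grid(formula_p = p_sub, formula_k = k_sub,
                        stringsAsFactors = FALSE)
print(pairings)
print(nrow(pairings))
```

With five p submodels and two k submodels there are ten pairings, each of which would be fit as a separate model.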

sizeCol

character string. The name of the column in data that gives the carcass class of the carcasses in the field trials. If sizeCol = NULL, then models are not segregated by size. If a sizeCol is provided, then separate models are fit for the data subsetted by sizeCol.

CL

numeric value in (0, 1). Confidence level used for the reported confidence intervals. Default is CL = 0.9.

kInit

numeric value in (0, 1). Initial value used for numerical optimization of k. Default is kInit = 0.7. It is rarely (if ever) necessary to use an alternative initial value.

quiet

Logical indicator of whether or not to print messages.

...

additional arguments passed to subfunctions

Value

An object of class pkm, pkmSet, pkmSize, or pkmSetSize.

pkm0()

returns a pkm object, which is a description of a single, fitted pk model. Due to the large number and complexity of components of a pkm model, only a subset of them is printed automatically; the rest can be viewed/accessed via the $ operator if desired. These are described in detail in the 'pkm Components' section.

pkmSet()

returns a list of pkm objects, one for each of the submodels, as described with parameter allCombos = TRUE.

pkmSize()

returns a list of pkmSet objects (one for each 'size') if allCombos = TRUE, or a list of pkm objects (one for each 'size') if allCombos = FALSE.

pkm()

returns a pkm, pkmSet, pkmSize, or pkmSetSize object:

pkm object if allCombos = FALSE, sizeCol = NULL
pkmSet object if allCombos = TRUE, sizeCol = NULL
pkmSize object if allCombos = FALSE, sizeCol != NULL
pkmSetSize object if allCombos = TRUE, sizeCol != NULL

pkm Components

The following components of a pkm object are displayed automatically:

call: the function call to fit the model
formula_p: the model formula for the p parameter
formula_k: the model formula for the k parameter
predictors: list of covariates of p and/or k
AICc: the AIC value as corrected for small sample size
convergence: convergence status of the numerical optimization to find the maximum likelihood estimates of p and k. A value of 0 indicates that the model was fit successfully. For help in deciphering other values, see optim.
cell_pk: summary statistics for the cellwise estimates of p and k, including the number of carcasses in each cell, and medians and upper and lower CI bounds for each parameter, indexed by cell (or combination of covariate levels).

The following components are not printed automatically but can be accessed via the $ operator:

data: the data used to fit the model
data0: $data with NA rows removed
betahat_p, betahat_k: parameter estimates for the terms in the regression model for p and k (logit scale). If k is fixed or not provided, then betahat_k is not calculated.
varbeta: the variance-covariance matrix of the estimators for c(betahat_p, betahat_k).
cellMM_p, cellMM_k: cellwise model (design) matrices for covariate structures of formula_p and formula_k
levels_p, levels_k: all levels of each covariate of p and k
nbeta_p, nbeta_k: number of parameters to fit the p and k models
cells: cell structure of the pk-model, i.e., combinations of all levels for each covariate of p and k. For example, if covar1 has levels "a", "b", and "c", and covar2 has levels "X" and "Y", then the cells would consist of a.X, a.Y, b.X, b.Y, c.X, and c.Y.
ncell: total number of cells
predictors_k, predictors_p: covariates of p and k
observations: observations used to fit the model
kFixed: the input kFixed
AIC: the AIC value for the fitted model
carcCells: the cell to which each carcass belongs
CL: the input CL
loglik: the log-likelihood for the maximum likelihood estimate
pOnly: a logical value telling whether k is excluded from the model. pOnly = TRUE if and only if length(obsCol) == 1 and kFixed = NULL
data_adj: data0 as adjusted for the 2n fix to accommodate scenarios in which all trial carcasses are either found or all are not found on the first search occasion (uncommon)
fixBadCells: vector giving the names of cells adjusted for the 2n fix
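A brief sketch of pulling individual components out of a fitted model with the $ operator follows. It reuses the wind_RP data set and the model formulas from the Examples section, and it is guarded so that it only runs if the GenEst package is installed; the component names are the ones listed above.

```r
# Runs only when GenEst is available
if (requireNamespace("GenEst", quietly = TRUE)) {
  library(GenEst)
  data(wind_RP)

  # Single model with p depending on Season and a constant k
  mod <- pkm(formula_p = p ~ Season, formula_k = k ~ 1, data = wind_RP$SE)

  print(mod$AICc)         # small-sample corrected AIC (printed by default)
  print(mod$convergence)  # 0 indicates a successful fit
  print(mod$cell_pk)      # cellwise summaries of p and k
  print(mod$betahat_p)    # logit-scale estimates (not printed by default)
  print(mod$loglik)       # log-likelihood at the estimates
}
```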

Advanced

pkmSize may also be used to fit a single, separately specified model for each carcass class if allCombos = FALSE. To do so, formula_p and formula_k must each be a named list of formulas with names matching the sizes listed in unique(data[, sizeCol]). The return value is then a list of pkm objects, one for each size.
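The named-list form might look like the sketch below. It assumes the wind_RP data set and its "Size" column as used in the Examples section, builds the list names from the data rather than hard-coding them, and uses the same formulas for every size purely for brevity. It follows the description above and has not been checked against every GenEst version.

```r
# Runs only when GenEst is available
if (requireNamespace("GenEst", quietly = TRUE)) {
  library(GenEst)
  data(wind_RP)

  # Size classes present in the trial data
  sizes <- as.character(unique(wind_RP$SE[, "Size"]))

  # One formula per size, with list names matching the size classes
  formula_p <- setNames(rep(list(p ~ Season), length(sizes)), sizes)
  formula_k <- setNames(rep(list(k ~ 1), length(sizes)), sizes)

  mods <- pkmSize(formula_p = formula_p, formula_k = formula_k,
                  data = wind_RP$SE, sizeCol = "Size")
  print(names(mods))  # expected to hold one fitted model per size
}
```

In practice the point of the list form is that the formulas can differ by size, for example a covariate for p in one size class and p ~ 1 in another.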

Details

The probability of finding a carcass that is present at the time of search is p on the first search after carcass arrival and is assumed to decrease by a factor of k each time the carcass is missed in searches. Both p and k may depend on covariates such as ground cover, season, species, etc., and a separate model formula (formula_p and formula_k) may be entered for each. The models are entered as they would be in the familiar lm or glm functions in R. For example, p might vary with A and B, while k varies only with A. A user might then enter p ~ A + B for formula_p and k ~ A for formula_k. Other R conventions for defining formulas may also be used, with A:B for the interaction between covariates A and B and A * B as shorthand for A + B + A:B.
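To make the roles of p and k concrete, the base R sketch below computes the detection probability after 0, 1, 2, and 3 previous misses. The values of p and k are hypothetical and chosen only for illustration.

```r
# Hypothetical parameter values
p <- 0.8  # detection probability on the first search
k <- 0.5  # factor applied each time the carcass is missed

# Number of previous searches in which the carcass was missed
misses <- 0:3

# Detection probability on the next search: p * k^misses
prob_found <- p * k^misses
print(data.frame(misses = misses, prob_found = prob_found))
```

With k = 1 the detection probability stays at p on every search, while k = 0 means a carcass missed once is never found later.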

Search trial data must be entered in a data frame, with each row giving the fate of a single carcass in the field trials. There must be a column for each search occasion, with 0, 1, or NA depending on whether the carcass was missed, found, or not available (typically because it was found and removed on a previous search, had been earlier removed by scavengers, or was not searched for) on the given search occasion. Additional columns with values for categorical covariates (e.g., visibility = E, M, or D) may also be included.
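A small made-up data frame in this layout is sketched below. The carcass IDs, covariate values, and outcomes are invented for illustration and are not taken from any GenEst data set.

```r
# One row per trial carcass; s1..s3 are the search occasions in order
trials <- data.frame(
  carcID     = c("c1", "c2", "c3", "c4"),
  visibility = c("E", "M", "D", "D"),
  s1 = c(1,  0,  0, 0),   # c1 found on the first search
  s2 = c(NA, 1,  0, 0),   # c1 no longer available; c2 found
  s3 = c(NA, NA, 1, NA)   # c3 found; c4 not available (e.g., scavenged)
)
print(trials)

# Number of search occasions on which each carcass was available
n_available <- unname(rowSums(!is.na(trials[, c("s1", "s2", "s3")])))
print(n_available)
```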

When all trial carcasses are either found on the first search or are missed on the first search after carcass placement, pkm effects a necessary adjustment to the data for accuracy; otherwise, the model would not be able to determine the uncertainty and would substantially over-estimate the variance of the parameter estimates, giving $\hat{p}$ essentially equal to 0 or 1 with approximately equal probability. The adjustment is to fit the model on an adjusted data set with duplicated copies of the original data (2n observations) but with one carcass having the opposite fate of the others. For example, in field trials with very high searcher efficiency and n = 10 carcasses, all of which are found in the first search after carcass placement, the original data set would have a carcass observation column consisting of 1s (rep(1, 10)). The adjusted data set would have an observation column consisting of 2n - 1 1s and one 0. In this case, the point estimate of p is 1 - 1/(2n), with a distribution that closely resembles the Bayesian posterior distributions of p with a uniform or a Jeffreys prior. The adjustment is applied on a cellwise basis in full cell models (e.g., 1, A, B, A * B). In the additive model with two predictors (A + B), the adjustment is made only when a full level of covariate A or B is all 0s or 1s.
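The arithmetic of the adjustment for the n = 10 example can be checked with a few lines of base R. This sketch only reproduces the adjusted observation column and its mean; it does not call pkm or reproduce its internal code.

```r
n <- 10

# Original first-search results: every carcass found
obs_original <- rep(1, n)

# Adjusted data: 2n observations, one with the opposite fate
obs_adjusted <- c(rep(1, 2 * n - 1), 0)

# Proportion found in the adjusted data
p_hat <- mean(obs_adjusted)
print(length(obs_adjusted))
print(p_hat)
print(1 - 1 / (2 * n))
```

The proportion found in the adjusted column equals 1 - 1/(2n), which is 0.95 for n = 10.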

Examples


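# NOT RUN {
 library(GenEst)
 data(wind_RP)
 head(wind_RP$SE)  # searcher efficiency trial data

 # Single model: returns a pkm object
 mod1 <- pkm(formula_p = p ~ Season, formula_k = k ~ 1, data = wind_RP$SE)
 class(mod1)

 # All submodel combinations: returns a pkmSet object (a list of pkm objects)
 mod2 <- pkm(formula_p = p ~ Season, formula_k = k ~ 1, data = wind_RP$SE,
   allCombos = TRUE)
 class(mod2)
 names(mod2)
 class(mod2[[1]])

 # All combinations fit separately by size: returns a pkmSetSize object
 mod3 <- pkm(formula_p = p ~ Season, formula_k = k ~ 1, data = wind_RP$SE,
   allCombos = TRUE, sizeCol = "Size")
 class(mod3)
 names(mod3)
 class(mod3[[1]])       # model set for the first size
 class(mod3[[1]][[1]])  # first model within that set

# }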
