Estimation of prevalence based on presence/absence tests on pooled samples
PoolPrev(
data,
result,
poolSize,
...,
prior.alpha = NULL,
prior.beta = NULL,
prior.absent = 0,
level = 0.95,
verbose = FALSE,
cores = NULL,
iter = 2000,
warmup = iter/2,
chains = 4,
control = list(adapt_delta = 0.9)
)
data: A data.frame
with one row for each pooled sample and
columns for the size of the pool (i.e. the number of specimens / isolates /
insects pooled to make that particular pool) and the result of the test of the
pool. It may also contain additional columns with additional information
(e.g. location where the pool was taken), which can optionally be used for
stratifying the data into smaller groups and calculating prevalence by
group (e.g. calculating prevalence for each location).
result: The name of the column with the result of each test on each pooled sample. The result must be stored with 1 indicating a positive test result and 0 indicating a negative test result.
poolSize: The name of the column with the number of specimens/isolates/insects in each pool.
...: Optional name(s) of columns with variables to stratify the data by. If omitted, the complete dataset is used to estimate a single prevalence. If included, prevalence is estimated separately for each group defined by these columns.
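As a rough illustration of the input layout described above, the following base R sketch builds a small data.frame with one row per pool. The column names (Site, NumInPool, Result) and the values are hypothetical, chosen only for this example; any column names can be used as long as they are passed to the corresponding arguments.

```r
# Hypothetical example data: one row per pooled sample.
# Column names and values are illustrative only.
pools <- data.frame(
  Site      = c("A", "A", "A", "B", "B", "B"),  # optional stratifying column
  NumInPool = c(10, 10, 5, 10, 10, 5),          # number of specimens in each pool
  Result    = c(0, 1, 0, 0, 0, 1)               # 1 = positive test, 0 = negative test
)

# Basic checks on the layout: results coded 0/1, pool sizes positive
stopifnot(all(pools$Result %in% c(0, 1)), all(pools$NumInPool > 0))

# Pools per group defined by the stratifying column
table(pools$Site)
```

Here Result would play the role of result, NumInPool the role of poolSize, and Site would be an optional stratifying column supplied through the ... argument.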
prior.alpha, prior.beta, prior.absent: The default prior for the
prevalence is the uninformative Jeffreys prior; however, you can also
specify a custom prior with a beta distribution (with parameters
prior.alpha and prior.beta) modified to have a point mass at zero, i.e.
allowing for some prior probability that the true prevalence is exactly
zero (prior.absent). Another popular uninformative choice is
prior.alpha = 1, prior.beta = 1, prior.absent = 0, i.e. a uniform
prior.
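To make the shape of such a prior concrete, the sketch below draws samples from a beta distribution modified to have a point mass at zero. The helper r_prior_sketch is hypothetical and not part of the package. Its default of alpha = beta = 0.5 is the Jeffreys prior for a simple binomial proportion; whether PoolPrev uses exactly these values internally is an assumption, not something stated on this page.

```r
# Draw from a beta prior modified to have a point mass at zero.
# r_prior_sketch is a hypothetical helper written for illustration.
r_prior_sketch <- function(n, alpha = 0.5, beta = 0.5, absent = 0) {
  stopifnot(alpha > 0, beta > 0, absent >= 0, absent <= 1)
  is_zero <- runif(n) < absent      # TRUE with probability `absent`
  draws <- rbeta(n, alpha, beta)    # continuous (beta) part of the prior
  draws[is_zero] <- 0               # point mass: prevalence exactly zero
  draws
}

set.seed(1)
# Uniform prior on (0, 1) with a 20% prior probability of absence
x <- r_prior_sketch(10000, alpha = 1, beta = 1, absent = 0.2)
mean(x == 0)  # proportion of draws at exactly zero, close to 0.2
```

Setting absent = 0 recovers an ordinary beta prior, which matches the default prior.absent = 0.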
level: Defines the confidence level to be used for the confidence and credible intervals. Defaults to 0.95 (i.e. 95% intervals).
verbose: Logical indicating whether to print progress to screen. Defaults to FALSE (no printing to screen).
cores: The number of CPU cores to be used. By default one core is used.
iter, warmup, chains, control: MCMC options passed on to the sampling routine. See stan for details.
A data.frame with columns:
PrevMLE (the maximum likelihood estimate of prevalence)
CILow and CIHigh (lower and upper bounds of the confidence interval, using the likelihood ratio method)
Bayesian posterior expectation
CrILow and CrIHigh (lower and upper bounds of the credible interval)
Number of pools
Number positive
If grouping variables are provided in ... there will be an additional column
for each grouping variable. When there are no grouping variables (supplied
in ...), the data.frame has only one row with the prevalence estimates for
the whole dataset. When grouping variables are supplied, there is a separate
row for each group.
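The maximum likelihood estimate and likelihood ratio interval reported above can be illustrated with a minimal base R sketch. It assumes a perfect test and independent specimens, so that a pool of size s tests positive with probability 1 - (1 - p)^s. The data and variable names are hypothetical, and the package's own computation may differ in its details.

```r
# Hypothetical pooled-test data: pool sizes and 0/1 results
pool_size <- c(10, 10, 5, 10, 10, 5, 10, 5)
result    <- c(0, 1, 0, 0, 0, 1, 0, 0)

# Log-likelihood of prevalence p (assumes a perfect test, independent specimens)
loglik <- function(p) {
  p_neg <- (1 - p)^pool_size  # probability that a pool tests negative
  sum(ifelse(result == 1, log1p(-p_neg), pool_size * log1p(-p)))
}

# Maximum likelihood estimate of prevalence
fit <- optimize(loglik, interval = c(1e-8, 1 - 1e-8), maximum = TRUE, tol = 1e-8)
prev_mle <- fit$maximum

# Likelihood ratio interval: all p whose deviance from the maximum
# is below the chi-squared cutoff for the chosen level
level  <- 0.95
cutoff <- qchisq(level, df = 1)
dev    <- function(p) 2 * (fit$objective - loglik(p)) - cutoff
ci_low  <- uniroot(dev, c(1e-8, prev_mle))$root
ci_high <- uniroot(dev, c(prev_mle, 1 - 1e-8))$root

c(PrevMLE = prev_mle, CILow = ci_low, CIHigh = ci_high)
```

With grouping variables, the same calculation would be repeated on the subset of pools in each group, giving one row of estimates per group.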