Function that runs a Bayesian sampler estimating parameters of an eDITH model fitted on both eDNA and direct sampling data.
run_eDITH_BT_joint(data, river, covariates = NULL, Z.normalize = TRUE,
use.AEM = FALSE, n.AEM = NULL, par.AEM = NULL,
no.det = FALSE, ll.type = "norm", source.area = "AG",
mcmc.settings = NULL, likelihood = NULL, prior = NULL,
sampler.type = "DREAMzs",
tau.prior = list(spec="lnorm",a=0,b=Inf, meanlog=log(5),
sd=sqrt(log(5)-log(4))),
log_p0.prior = list(spec="unif",min=-20, max=0),
beta.prior = list(spec="norm",sd=1),
sigma.prior = list(spec="unif",min=0,
max=max(data$values[data$type=="e"],
na.rm = TRUE)),
omega.prior = list(spec="unif",min=1,
max=10*max(data$values[data$type=="e"],
na.rm = TRUE)),
Cstar.prior = list(spec="unif",min=0,
max=max(data$values[data$type=="e"],
na.rm = TRUE)),
omega_d.prior = list(spec="unif",min=1,
max=10*max(c(0.11, data$values[data$type=="d"]),
na.rm = TRUE)),
alpha.prior = list(spec="unif", min=0, max=1e6),
verbose = FALSE)
A list with objects:
Vector of named parameters corresponding to the maximum a posteriori estimate. It is the
output of the call to MAP
.
Vector of best-fit eDNA production rates corresponding to the maximum a posteriori parameter
estimate param_map
. It has length equal to river$AG$nNodes
.
Vector of best-fit eDNA values (in the same unit as data$values
, i.e. concentrations or read numbers)
corresponding to the maximum a posteriori parameter estimate param_map
. It has length equal to river$AG$nNodes
.
Vector of best-fit detection probabilities corresponding to the maximum a posteriori
parameter estimate param_map
. It has length equal to river$AG$nNodes
. If a custom likelihood
is provided,
this is a vector of null length (in which case the user should calculate the probability of detection independently, based on
the chosen likelihood).
Output of the call to getCredibleIntervals
.
Output of the call to gelmanDiagnostics
.
Data frame containing input covariate values (possibly Z-normalized).
Vector of source area values.
Object of class mcmcSampler
returned by the call to runMCMC
.
Moreover, arguments ll.type
(possibly changed to "custom"
if a custom likelihood is specified), no.det
and data
are added to the list.
eDNA and direct observation data. Data frame containing columns ID
(index of the AG node/reach where
the sample was taken), values
(value of the eDNA or direct measurement) and type
(equal to "e"
for eDNA data and to "d"
for direct observation data). eDNA values
are expressed as concentration or number of reads; direct observations are expressed as numbers of individuals.
A river
object generated via aggregate_river
.
Data frame containing covariate values for all river
reaches. If NULL
(default
option), production rates are estimated via AEMs.
Logical. Should covariates be Z-normalized?
Logical. Should eigenvectors based on AEMs be used as covariates? If covariates = NULL
, it is
set to TRUE
. If TRUE
and covariates
are provided, AEM eigenvectors are appended to the
covariates
data frame.
Number of AEM eigenvectors (sorted by the decreasing respective eigenvalue) to be used as covariates. If
par.AEM$moranI = TRUE
, this parameter is not used. Instead, the eigenvectors with significantly positive spatial
autocorrelation are used as AEM covariates.
List of additional parameters that are passed to river_to_AEM
for calculation of AEMs.
In particular, par.AEM$moranI = TRUE
imposes the use of AEM covariates with significantly positive spatial
autocorrelation based on Moran's I statistic.
Logical. Should a probability of non-detection be included in the model?
Character. String defining the error distribution used in the log-likelihood formulation.
Allowed values are norm
(for normal distribution), lnorm
(for lognormal distribution),
nbinom
(for negative binomial distribution) and geom
(for geometric distribution). The two latter choices
are suited when eDNA data are expressed as read numbers, while norm
and lnorm
are better suited
to eDNA concentrations.
Defines the extent of the source area of a node. Possible values are "AG"
(if the source
area is the reach surface, i.e. length*width), "SC"
(if the source area is the subcatchment area), or,
alternatively, a vector with length river$AG$nodes
.
List. It is passed as argument settings
in runMCMC
. Default is
list(iterations = 2.7e6, burnin=1.8e6, message = TRUE, thin = 10).
Likelihood function to be passed as likelihood
argument to createBayesianSetup
.
If not specified, it is generated based on arguments no.det
and ll.type
. If a custom likelihood is specified,
a custom prior
must also be specified.
Prior function to be passed as prior
argument to createBayesianSetup
.
If not specified, it is generated based on the *.prior
arguments provided. If a user-defined prior is provided,
parameter names must be included in prior$lower
, prior$upper
(see example).
Character. It is passed as argument sampler
in runMCMC
.
List that defines the prior distribution for the decay time parameter tau
. See details.
List that defines the prior distribution for the logarithm (in base 10) of the baseline production rate
p0
. See details. If covariates = NULL
, this defines the prior distribution for the logarithm (in base 10) of
production rates for all river
reaches.
List that defines the prior distribution for the covariate effects beta
. See details. If a single
spec
is provided, the same prior distribution is specified for all beta
parameters. Alternatively, if
spec
(and the other arguments, if provided) is a vector with length equal to the number of covariates included,
different prior distributions can be specified for the different beta
parameters.
List that defines the prior distribution for the standard deviation of the measurement error
when ll.type
is "norm"
or "lnorm"
. It is not used if ll.type = "nbinom"
. See details.
List that defines the prior distribution for the overdispersion parameter omega
of the
measurement error when ll.type = "nbinom"
. It is not used if ll.type
is "norm"
or "lnorm"
.
See details.
List that defines the prior distribution for the Cstar
parameter controlling the probability
of no detection. It is only used if no.det = TRUE
. See details.
Prior distribution for the overdispersion parameter for direct sampling density observations.
Prior distribution for the inverse DNA shedding rate (i.e., the organismal density that sheds a unit eDNA value per unit time).
Logical. Should console output be displayed?
The arguments of the type *.prior
consist in the lists of arguments required by dtrunc
(except the first argument x
).
By default, AEMs are computed without attributing weights to the edges of the river network.
Use e.g. par.AEM = list(weight = "gravity")
to attribute weights.
data(wigger)
data(dataCD)
# reduce number of iterations for illustrative purposes
# (use default mcmc.settings to ensure convergence)
settings.short <- list(iterations = 1e3, thin = 10)
set.seed(1)
out <- run_eDITH_BT_joint(dataCD, wigger, mcmc.settings = settings.short)
# \donttest{
library(rivnet)
# best-fit (maximum a posteriori) map of eDNA production rates
plot(wigger, out$p_map)
# best-fit map (maximum a posteriori) of detection probability
plot(wigger, out$probDet_map)
# compare best-fit vs observed values
data.e <- which(dataCD$type=="e")
data.d <- which(dataCD$type=="d")
plot(out$C_map[dataCD$ID[data.e]], dataCD$values[data.e],
xlab="Modelled (MAP) eDNA concentrations", ylab="Observed eDNA concentrations")
abline(a=0, b=1)
plot(out$p_map[dataCD$ID[data.d]], dataCD$values[data.d],
xlab="Modelled (MAP) eDNA production rate", ylab="Observed density data")
# }
Run the code above in your browser using DataLab