The function fits flexible regression models for binomial data using a Bayesian approach to inference based on the Hamiltonian Monte Carlo algorithm.
Available regression models are the flexible beta-binomial (type = "FBB"), the beta-binomial (type = "BetaBin"), and the binomial (type = "Bin").
flexreg_binom(
formula,
data,
type = "FBB",
n = NULL,
link.mu = "logit",
prior.beta = "normal",
hyperparam.beta = 100,
hyper.theta.a = NULL,
hyper.theta.b = NULL,
link.theta = NULL,
prior.psi = NULL,
hyperparam.psi = NULL,
n.iter = 5000,
burnin.perc = 0.5,
n.chain = 1,
thin = 1,
verbose = TRUE,
...
)
The flexreg_binom function returns an object of class `flexreg`, i.e., a list with the following elements:
call: the function call.
formula: the original formula.
link.mu: a character specifying the link function in the mean model.
link.theta: a character specifying the link function in the overdispersion model.
model: an object of class `stanfit` containing the fitted model.
response: the response variable, assuming values in (0, 1).
design.X: the design matrix for the mean model.
design.Z: the design matrix for the overdispersion model (if defined).
formula: an object of class `formula`: a symbolic description of the model to be fitted (of type y ~ x or y ~ x | z).
data: an optional data frame, list, or object that is coercible to a data frame through base::as.data.frame, containing the variables in the model. If not found in data, the variables in formula are taken from the environment from which the function flexreg_binom is called.
type: a character specifying the type of regression model. Current options are the flexible beta-binomial "FBB" (default), the beta-binomial "BetaBin", and the binomial "Bin".
n: the total number of trials.
link.mu: a character specifying the link function for the mean model (mu). Currently, "logit" (default), "probit", "cloglog", and "loglog" are supported.
prior.beta: a character specifying the prior distribution for the beta regression coefficients of the mean model. Currently, "normal" (default) and "cauchy" are supported.
hyperparam.beta: a positive numeric (vector of length 1) specifying the hyperprior standard deviation parameter for the prior distribution of the beta regression coefficients. A value of 100 is suggested if the prior is "normal", 2.5 if "cauchy".
hyper.theta.a: a numeric (vector of length 1) specifying the first shape parameter for the beta prior distribution of theta.
hyper.theta.b: a numeric (vector of length 1) specifying the second shape parameter for the beta prior distribution of theta.
link.theta: a character specifying the link function for the overdispersion model (theta). Currently, "identity" (default), "logit", "probit", "cloglog", and "loglog" are supported. If link.theta = "identity", the prior distribution for theta is a beta.
prior.psi: a character specifying the prior distribution for the psi regression coefficients of the overdispersion model (not supported if link.theta = "identity"). Currently, "normal" (default) and "cauchy" are supported.
hyperparam.psi: a positive numeric (vector of length 1) specifying the hyperprior standard deviation parameter for the prior distribution of the psi regression coefficients. A value of 100 is suggested if the prior is "normal", 2.5 if "cauchy".
n.iter: a positive integer specifying the number of iterations for each chain (including warmup). The default is 5000.
burnin.perc: the percentage of iterations per chain to discard as burn-in. The default is 0.5.
n.chain: a positive integer specifying the number of Markov chains. The default is 1.
thin: a positive integer specifying the period for saving samples. The default is 1.
verbose: TRUE (default) or FALSE: flag indicating whether to print intermediate output.
...: additional arguments for rstan::sampling.
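As a rough guide to how n.iter, burnin.perc, thin, and n.chain interact, the following base R sketch counts the posterior draws left after burn-in and thinning. It assumes that burnin.perc is treated as a fraction of n.iter (as the default of 0.5 suggests) and that the warmup length is round(n.iter * burnin.perc); both are assumptions, not statements about the package internals. retained_draws is a hypothetical helper, not part of FlexReg.

```r
# Hypothetical helper (not part of FlexReg): approximate number of saved
# posterior draws, ASSUMING warmup = round(n.iter * burnin.perc) and that
# every thin-th post-warmup iteration is kept in each chain.
retained_draws <- function(n.iter = 5000, burnin.perc = 0.5,
                           n.chain = 1, thin = 1) {
  warmup <- round(n.iter * burnin.perc)            # iterations discarded per chain
  per_chain <- ceiling((n.iter - warmup) / thin)   # draws kept per chain
  n.chain * per_chain
}

# Default settings from the usage above
default_draws <- retained_draws()

# Two chains, 25% burn-in, keeping every 5th draw
thinned_draws <- retained_draws(n.iter = 4000, burnin.perc = 0.25,
                                n.chain = 2, thin = 5)
```

Under these assumptions the default call keeps 2500 draws and the second call keeps 1200, which can help when choosing n.iter for a target posterior sample size.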
Let Y be a random variable whose distribution can be specified in the type argument and \(\mu\) be the mean of Y/n.
The flexreg_binom function links the parameter \(\mu\) to a linear predictor through a function \(g(\cdot)\) specified in link.mu:
$$g(\mu_i) = x_i^t \bold{\beta},$$ where \(\bold{\beta}\) is the vector of regression coefficients for the mean model.
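To make the role of \(g(\cdot)\) concrete, the sketch below maps one linear predictor value to \(\mu_i\) under each of the four supported link names. It uses base R only and the common textbook definitions of these links; in particular, the log-log link is taken to be \(g(\mu) = -\log(-\log(\mu))\), which is an assumption about the convention rather than something stated here. The covariate vector x_i and coefficients beta are made-up illustrative values.

```r
# Link functions g(mu) and their inverses, using common textbook definitions
# (ASSUMED conventions; the package's internal definitions are not shown here).
links <- list(
  logit   = list(g = qlogis, ginv = plogis),
  probit  = list(g = qnorm,  ginv = pnorm),
  cloglog = list(g    = function(mu) log(-log(1 - mu)),
                 ginv = function(eta) 1 - exp(-exp(eta))),
  loglog  = list(g    = function(mu) -log(-log(mu)),
                 ginv = function(eta) exp(-exp(-eta)))
)

# Illustrative (made-up) covariates and coefficients for one observation
x_i  <- c(1, 0.5)        # intercept and one covariate
beta <- c(-0.3, 1.2)
eta  <- sum(x_i * beta)  # linear predictor x_i^t beta

# mu_i = g^{-1}(eta) under each link; every value lies in (0, 1)
mu_by_link <- sapply(links, function(l) l$ginv(eta))
```

The same linear predictor gives a different \(\mu_i\) under each link, so coefficients are not directly comparable across models fitted with different link.mu values.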
By default, link.theta="identity", meaning that the overdispersion parameter \(\theta\) is assumed to be constant.
It is possible to extend the model by linking \(\theta\) to an additional (possibly overlapping) set of covariates through a proper link
function \(q(\cdot)\) specified in the link.theta argument: $$q(\theta_i) = z_i^t \bold{\psi},$$ where \(\bold{\psi}\) is the vector of regression coefficients for the overdispersion model.
In flexreg_binom, the regression model for the mean and, where appropriate, for the overdispersion parameter can be specified in the
formula argument with a formula of type \(y \sim x_1 + x_2 | z_1 + z_2\), where covariates to the left of "|" are included in the regression model
for the mean and covariates to the right of "|" are included in the regression model for the overdispersion.
If the second part is omitted, i.e., \(y \sim x_1 + x_2\), the overdispersion is assumed constant for each observation.
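The two-part formula syntax can be inspected with base R alone. The sketch below only illustrates how R represents such a formula and which variables fall on each side of "|"; it is not a description of how flexreg_binom parses its formula argument, and split_two_part is a hypothetical helper name.

```r
# Hypothetical helper (not part of FlexReg): split the right-hand side of a
# formula at "|" and report the variables on each side.
split_two_part <- function(f) {
  rhs <- f[[3]]                                   # right-hand side of y ~ ...
  has_bar <- is.call(rhs) && identical(rhs[[1]], as.name("|"))
  if (has_bar) {
    list(mean = all.vars(rhs[[2]]),               # covariates left of "|"
         overdispersion = all.vars(rhs[[3]]))     # covariates right of "|"
  } else {
    list(mean = all.vars(rhs),
         overdispersion = character(0))           # constant overdispersion
  }
}

two_part <- split_two_part(y ~ x1 + x2 | z1 + z2)
one_part <- split_two_part(y ~ x1 + x2)
```

Here two_part$mean contains x1 and x2 and two_part$overdispersion contains z1 and z2, while one_part has an empty overdispersion part, mirroring the constant-overdispersion case.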
Ascari, R., and Migliorati, S. (2021). A new regression model for overdispersed binomial data accounting for outliers and an excess of zeros. Statistics in Medicine, 40(17), 3895--3914. doi:10.1002/sim.9005
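if (FALSE) {
  library(FlexReg)
  data(Bacteria)
  # Flexible beta-binomial model with females as covariate in the mean model
  fbb <- flexreg_binom(y ~ females, n = n, data = Bacteria, type = "FBB")
}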