RBE: Tools for a reparameterized beta regression model

Description

A set of functions related to the reparameterized beta regression model based on different measures of central tendency: mean, median, mode, geometric mean or harmonic mean.

Usage

BEAM(mu.link = "logit", sigma.link = "log") 
BEGM(mu.link = "logit", sigma.link = "log") 
BEHM(mu.link = "logit", sigma.link = "log") 
BEMD(mu.link = "logit", sigma.link = "log") 
BEMO(mu.link = "logit", sigma.link = "log") 
dBEAM(x, mu = 0.5, sigma = 1, log = FALSE) 
dBEGM(x, mu = 0.5, sigma = 1, log = FALSE) 
dBEHM(x, mu = 0.5, sigma = 1, log = FALSE) 
dBEMD(x, mu = 0.5, sigma = 1, log = FALSE) 
dBEMO(x, mu = 0.5, sigma = 1, log = FALSE) 
dRBE(x, mu=0.5, sigma=1, param="AM", log=FALSE)
fit.RBE(formula = formula(data), sigma.formula=~1, data, param="AM")
pBEAM(q, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
pBEGM(q, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
pBEHM(q, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
pBEMD(q, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
pBEMO(q, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE)
pRBE(q, mu=0.5, sigma=1, param="AM", lower.tail = TRUE, log.p = FALSE) 
qBEAM(p, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
qBEGM(p, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
qBEHM(p, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
qBEMD(p, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
qBEMO(p, mu = 0.5, sigma = 1, lower.tail = TRUE, log.p = FALSE) 
qRBE(p, mu=0.5, sigma=1, param="AM", lower.tail = TRUE, log.p = FALSE)
rBEAM(n, mu = 0.5, sigma = 1) 
rBEGM(n, mu = 0.5, sigma = 1) 
rBEHM(n, mu = 0.5, sigma = 1) 
rBEMD(n, mu = 0.5, sigma = 1) 
rBEMO(n, mu = 0.5, sigma = 1) 
rRBE(n, mu=0.5, sigma=1, param="AM")

Value

An object of class "rregm" is returned. The returned object is a list containing the following components:

estimate: A matrix containing the estimates and standard errors.
logLik: the log-likelihood function evaluated at the corresponding estimators.
AIC: the Akaike information criterion.
BIC: the Bayesian information criterion.
tau1, tau2: values for tau1 and tau2, depending on the considered parameterization.
pearson.res: Pearson's residuals.
mod.pearson.res: modified Pearson's residuals.
quant.res: quantile residuals.
convergence: logical; TRUE if convergence was attained.
dist: BE (the beta distribution).
param: The specified parameterization.
mu.x: design matrix for mu.
sigma.x: design matrix for sigma.

Arguments

mu.link: the mu link function with default logit
sigma.link: the sigma link function with default log
mu, sigma: vector of parameter values
formula: an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details of model specification are given under ‘Details’.
data: an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which fit.RBE is called.
sigma.formula: a formula object for fitting a model to the sigma parameter, as in the formula above, e.g. sigma.formula=~x1+x2.
param: parameterization used for the model. "AM" for mean, "MD" for median, "MO" for mode, "GM" for geometric mean, and "HM" for harmonic mean.
x, q: vector of quantiles
p: vector of probabilities
n: number of observations. If $\mbox{length}(n) > 1$, the length is taken to be the number required.
log, log.p: logical; if TRUE, probabilities p are given as log(p).
lower.tail: logical; if TRUE, probabilities are $P(X \leq x)$; otherwise, $P(X>x)$.
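
The log.p and lower.tail arguments follow the usual conventions of R's distribution functions. The sketch below illustrates those conventions with base R's pbeta and qbeta, using the mean parameterization (shape parameters $\mu\sigma$ and $(1-\mu)\sigma$, as described under Details). It is only an illustration under the assumption that the RBE functions behave as documented here; it does not call the package itself.

```r
# Mean parameterization (param = "AM"): shapes mu * sigma and (1 - mu) * sigma.
mu <- 0.3; sigma <- 5
a <- mu * sigma
b <- (1 - mu) * sigma

q <- c(0.1, 0.3, 0.6)

p_lower <- pbeta(q, a, b)                      # P(X <= q)
p_upper <- pbeta(q, a, b, lower.tail = FALSE)  # P(X > q)
p_log   <- pbeta(q, a, b, log.p = TRUE)        # log P(X <= q)

# The quantile function inverts the distribution function,
# here with probabilities supplied on the log scale.
q_back <- qbeta(p_log, a, b, log.p = TRUE)
```

Here p_lower + p_upper equals 1, exp(p_log) recovers p_lower, and q_back recovers q up to numerical error.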

Author

Diego Gallardo and Marcelo Bourguignon.

Details

The parameterization for the reparameterized beta distribution is given by $$ f(x; \mu, \sigma) = \frac{x^{\mu\,\sigma + \tau_1-1}(1 - x)^{(1-\mu)\sigma + \tau_2-\tau_1-1}}{B(\mu\,\sigma + \tau_1, (1-\mu)\sigma + \tau_2-\tau_1)}, \quad 0 < x < 1, $$ where $0 < \mu < 1$, $\sigma > 0$ and $\tau_1$ and $\tau_2$ are constants. The following cases are highlighted:

- param="AM": $\tau_1=\tau_2=0$ and $\mu$ represents the mean of the distribution.

- param="GM": $\tau_1=\tau_2=1/2$ and $\mu$ represents the geometric mean of the distribution.

- param="HM": $\tau_1=\tau_2=1$ and $\mu$ represents the harmonic mean of the distribution.

- param="MO": $\tau_1=1$ and $\tau_2=2$ and $\mu$ represents the mode of the distribution.

- param="MD": $\tau_1=1/3$ and $\tau_2=2/3$ and $\mu$ represents the median of the distribution.
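
As a rough illustration of the cases above, the following base-R sketch maps $(\mu, \sigma, \tau_1, \tau_2)$ to the two shape parameters of the usual beta distribution and checks numerically that $\mu$ is the mean, mode and harmonic mean under the "AM", "MO" and "HM" settings. The helpers rbe_shapes and rbe_density are hypothetical names introduced for this example only; they are not part of the RBE package.

```r
# Hypothetical helper: map (mu, sigma, tau1, tau2) to the usual beta shapes.
rbe_shapes <- function(mu, sigma, tau1, tau2) {
  list(shape1 = mu * sigma + tau1,
       shape2 = (1 - mu) * sigma + tau2 - tau1)
}

# Hypothetical helper: density of the reparameterized beta via base R's dbeta().
rbe_density <- function(x, mu, sigma, tau1, tau2, log = FALSE) {
  s <- rbe_shapes(mu, sigma, tau1, tau2)
  dbeta(x, s$shape1, s$shape2, log = log)
}

mu <- 0.3; sigma <- 5

# param = "AM" (tau1 = tau2 = 0): the numerical mean should be close to mu.
am_mean <- integrate(function(x) x * rbe_density(x, mu, sigma, 0, 0), 0, 1)$value

# param = "MO" (tau1 = 1, tau2 = 2): the maximizer of the density should be close to mu.
mo_mode <- optimize(function(x) rbe_density(x, mu, sigma, 1, 2),
                    interval = c(0, 1), maximum = TRUE)$maximum

# param = "HM" (tau1 = tau2 = 1): 1 / E(1 / X) should be close to mu.
hm <- 1 / integrate(function(x) rbe_density(x, mu, sigma, 1, 1) / x, 0, 1)$value

c(mean = am_mean, mode = mo_mode, harmonic.mean = hm)
```

All three values should agree with mu = 0.3 up to numerical tolerance.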

Suppose the central tendency and the concentration parameter of $Y_i$ satisfy the functional relations $$ \mbox{logit}(\mu_i) = \eta_{1i} = \mathbf{x}^\top_i\bm{\xi} \quad \textrm{and} \quad \log(\sigma_i) = \eta_{2i} = \mathbf{z}^\top_i\bm{\nu}, $$ where $\mbox{logit}(u)=\log(u/(1-u))$ is the logit function, $\bm{\xi} = (\xi_1, \ldots, \xi_p)^\top$ and $\bm{\nu} = (\nu_1, \ldots, \nu_q)^\top$ are vectors of unknown regression coefficients which are assumed to be functionally independent, $\bm{\xi} \in \mathbb{R}^p$ and $\bm{\nu} \in \mathbb{R}^q$, with $p + q < n$, and $\mathbf{x}_i = (x_{i1}, \ldots, x_{ip})^\top$ and $\mathbf{z}_i = (z_{i1}, \ldots, z_{iq})^\top$ are observations on $p$ and $q$ known regressors, for $i = 1, \ldots, n$. Furthermore, we assume that the covariate matrices $\mathbf{X} = (\mathbf{x}_1, \ldots, \mathbf{x}_n)^\top$ and $\mathbf{Z} = (\mathbf{z}_1, \ldots, \mathbf{z}_n)^\top$ have rank $p$ and $q$, respectively.
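
The two link relations can be sketched in base R as follows. The design matrices X and Z and the coefficient values are made up for illustration (the coefficients mirror those used in the Examples section), and the object names are not part of the package.

```r
set.seed(1)
n <- 5
x1 <- rnorm(n)

X <- cbind(1, x1)        # design matrix for mu (intercept + one covariate)
Z <- cbind(1, x1)        # design matrix for sigma (same covariate here)
xi <- c(0.5, -0.4)       # illustrative coefficients for mu
nu <- c(-0.1, 0.05)      # illustrative coefficients for sigma

eta1 <- drop(X %*% xi)   # linear predictor for mu
eta2 <- drop(Z %*% nu)   # linear predictor for sigma

mu    <- plogis(eta1)    # inverse logit link: values in (0, 1)
sigma <- exp(eta2)       # inverse log link: values in (0, Inf)
```

Applying qlogis(mu) and log(sigma) recovers the linear predictors eta1 and eta2.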

For this model, the Pearson residuals are given by $$ r_i=\frac{y_i-m_i}{s_i}, \quad i=1,\ldots,n, $$ where $$ m_i=\frac{\mu_i \sigma_i+\tau_1}{\sigma_i+\tau_2} \quad \mbox{and} \quad s_i=\sqrt{\frac{(\mu_i \sigma_i+\tau_1)((1-\mu_i)\sigma_i+\tau_2-\tau_1)}{(\sigma_i+\tau_2)^2(\sigma_i+\tau_2+1)}} $$ are the mean and standard deviation of $Y_i$.

The modified Pearson residuals are given by $$ r_i^*=\frac{\mbox{logit}(y_i)-m_i^*}{s_i^*}, \quad i=1,\ldots,n, $$ where $$ m_i^*=\psi(\mu_i \sigma_i+\tau_1)-\psi((1-\mu_i)\sigma_i+\tau_2-\tau_1) \quad \mbox{and} \quad s_i^*=\sqrt{\psi'(\mu_i \sigma_i+\tau_1)+\psi'((1-\mu_i)\sigma_i+\tau_2-\tau_1)}, $$ with $\psi(\cdot)$ and $\psi'(\cdot)$ denoting the digamma and trigamma functions, respectively.

Finally, the quantile residuals are given by $$ r_i^q=\Phi^{-1}\left(I_{y_i}(\mu_i \sigma_i+\tau_1,(1-\mu_i)\sigma_i+\tau_2-\tau_1)\right), \quad i=1,\ldots,n, $$ where $\Phi^{-1}(\cdot)$ denotes the inverse of the cumulative distribution function for the standard normal model, $I_x(\alpha,\beta)=B_x(\alpha, \beta)/B(\alpha, \beta)$ is the incomplete beta function ratio, $B_x(\alpha, \beta) = \int_{0}^{x}\omega^{\alpha-1}(1-\omega)^{\beta-1}\textrm{d} \omega$ is the incomplete beta function, $B(\alpha, \beta) = \Gamma(\alpha)\Gamma(\beta)/\Gamma(\alpha + \beta)$ is the beta function and $\Gamma(\alpha) = \int_{0}^{\infty}\omega^{\alpha-1}\textrm{e}^{-\omega}\textrm{d} \omega$ is the gamma function.

dRBE gives the density, pRBE gives the distribution function, qRBE gives the quantile function, and rRBE generates random deviates from the beta distribution with the specified parameterization. In addition, dBEXX, pBEXX, qBEXX and rBEXX provide the equivalent functions for a fixed parameterization XX: AM (mean), GM (geometric mean), HM (harmonic mean), MD (median) and MO (mode). For instance, dBEAM gives the density for the beta model parameterized in the mean, pBEGM gives the distribution function for the beta model parameterized in the geometric mean, and so on. Finally, the functions BEAM, BEGM, BEHM, BEMD and BEMO provide a framework to fit models with gamlss.
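
The three residual types defined in this section can be computed directly from their formulas in base R. The sketch below is not the package's implementation: rbe_residuals is a hypothetical helper written for this example, and the data are simulated from the mean parameterization ($\tau_1=\tau_2=0$) with true parameter values rather than fitted ones.

```r
# Hypothetical helper (not part of RBE): Pearson, modified Pearson and
# quantile residuals, computed directly from the formulas in Details.
rbe_residuals <- function(y, mu, sigma, tau1, tau2) {
  a <- mu * sigma + tau1                        # first beta shape
  b <- (1 - mu) * sigma + tau2 - tau1           # second beta shape
  m <- a / (a + b)                              # mean of Y
  s <- sqrt(a * b / ((a + b)^2 * (a + b + 1)))  # standard deviation of Y
  m_star <- digamma(a) - digamma(b)             # mean of logit(Y)
  s_star <- sqrt(trigamma(a) + trigamma(b))     # standard deviation of logit(Y)
  list(
    pearson     = (y - m) / s,
    mod.pearson = (qlogis(y) - m_star) / s_star,
    quantile    = qnorm(pbeta(y, a, b))
  )
}

set.seed(1)
mu <- 0.3; sigma <- 5
y <- rbeta(10000, mu * sigma, (1 - mu) * sigma)  # mean parameterization
res_sim <- rbe_residuals(y, mu, sigma, tau1 = 0, tau2 = 0)

sapply(res_sim, mean)  # each should be near 0
sapply(res_sim, sd)    # each should be near 1
```

When the model is correctly specified, all three residual types have mean 0 and unit variance, and the quantile residuals are standard normal, which is why a normal Q-Q plot is a natural diagnostic.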

References

Bourguignon, M., Gallardo, D.I. (2025) A general and unified parameterization of the beta distribution: A flexible and robust beta regression model. Statistica Neerlandica, 79(2), e70007.

Examples

library(RBE)
set.seed(2100)
n=100; x1=rnorm(n) ## drawing covariates, the same for mu and sigma
mu=plogis(0.5-0.4*x1); sigma=exp(-0.1+0.05*x1)
y=rRBE(n, mu, sigma, param="MD") ## model parameterized in the median
data=list(y=y, x1=x1)
aux.RBE=fit.RBE(y~x1, sigma.formula=~x1, data=data, param="MD")
summary(aux.RBE)
qqnorm(res(aux.RBE, type="mod.pearson")) ## normal Q-Q plot of the modified Pearson residuals
## The beta model parameterized in the median can also be fitted using gamlss
## (requires the gamlss package):
## gamlss(y~x1, sigma.formula=~x1, data=data, family=BEMD)
