Note: a newer version (3.2-3) of this package is available.

betareg: Beta Regression in R

Overview

  • Beta regression for modeling beta-distributed dependent variables on the open unit interval (0, 1), e.g., rates and proportions, see Cribari-Neto and Zeileis (2010, doi:10.18637/jss.v034.i02).

  • Extended-support beta regression models for variables on the closed unit interval [0, 1] with boundary observations at 0 and/or 1, see Kosmidis and Zeileis (2024, forthcoming).

  • Alternative specifications of the classical beta regression model: Bias-corrected and bias-reduced estimation, finite mixture models, and recursive partitioning for (0, 1) beta regression, see Grün, Kosmidis, and Zeileis (2012, doi:10.18637/jss.v048.i11).
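The classical model in the first item is usually written in terms of a mean mu and a precision phi rather than the two shape parameters of the beta distribution. The following base R sketch illustrates that reparameterization (shape1 = mu * phi, shape2 = (1 - mu) * phi). The helper name dbeta_mp and the values of mu and phi are made up for illustration and are not part of betareg.

```r
# Beta density in mean-precision form:
#   shape1 = mu * phi, shape2 = (1 - mu) * phi
# dbeta_mp is a hypothetical helper name, not a betareg function.
dbeta_mp <- function(y, mu, phi) {
  dbeta(y, shape1 = mu * phi, shape2 = (1 - mu) * phi)
}

mu  <- 0.8   # illustrative mean
phi <- 20    # illustrative precision

# Simulate proportions and compare empirical moments with theory:
# E(y) = mu and Var(y) = mu * (1 - mu) / (1 + phi)
set.seed(1)
y <- rbeta(1e5, shape1 = mu * phi, shape2 = (1 - mu) * phi)
c(mean = mean(y), var = var(y), var_theory = mu * (1 - mu) / (1 + phi))

# The density integrates to one on (0, 1)
integrate(dbeta_mp, lower = 0, upper = 1, mu = mu, phi = phi)$value
```

A larger phi gives a smaller variance for the same mean, which is why phi is called a precision parameter.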

Installation

The stable version of betareg is available on CRAN:

install.packages("betareg")

The latest development version can be installed from R-universe:

install.packages("betareg", repos = "https://zeileis.R-universe.dev")

Illustration

A nice first illustration of beta regression is the analysis of reading accuracy scores of primary school children from Smithson & Verkuilen (2006). Package and data can be loaded via:

library("betareg")
data("ReadingSkills", package = "betareg")

The reading accuracy was scaled to be within (0, 1). Its mean is explained by the verbal IQ score (iq), with separate regression lines by dyslexia status (control vs. dyslexic). The precision parameter is explained by main effects of the two explanatory variables. More details are provided in ?ReadingSkills.

br <- betareg(accuracy ~ dyslexia * iq | dyslexia + iq, data = ReadingSkills)
summary(br)
#> 
#> Call:
#> betareg(formula = accuracy ~ dyslexia * iq | dyslexia + iq, data = ReadingSkills)
#> 
#> Quantile residuals:
#>     Min      1Q  Median      3Q     Max 
#> -2.3625 -0.5872  0.3026  0.9425  1.5874 
#> 
#> Coefficients (mean model with logit link):
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)   1.1232     0.1428   7.864 3.73e-15 ***
#> dyslexia     -0.7416     0.1428  -5.195 2.04e-07 ***
#> iq            0.4864     0.1331   3.653 0.000259 ***
#> dyslexia:iq  -0.5813     0.1327  -4.381 1.18e-05 ***
#> 
#> Phi coefficients (precision model with log link):
#>             Estimate Std. Error z value Pr(>|z|)    
#> (Intercept)   3.3044     0.2227  14.835  < 2e-16 ***
#> dyslexia      1.7466     0.2623   6.658 2.77e-11 ***
#> iq            1.2291     0.2672   4.600 4.23e-06 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1 
#> 
#> Type of estimator: ML (maximum likelihood)
#> Log-likelihood:  65.9 on 7 Df
#> Pseudo R-squared: 0.5756
#> Number of iterations: 25 (BFGS) + 1 (Fisher scoring)

The regression summary shows that accuracy increases with iq in the control group but not in the dyslexic group, where it even decreases slightly. A plot brings this out more clearly and also highlights that the model employs a logit link, so the fitted curves always remain within (0, 1).

pal <- palette.colors()[c(4, 8)]
pch <- c(19, 17)
plot(accuracy ~ iq, data = ReadingSkills, col = pal[dyslexia], pch = pch[dyslexia])
iq <- -30:30/10
lines(iq, predict(br, newdata = data.frame(dyslexia = "no", iq = iq)), col = pal[1], lwd = 2)
lines(iq, predict(br, newdata = data.frame(dyslexia = "yes", iq = iq)), col = pal[2], lwd = 2)
legend("topleft", c("Control", "Dyslexic"), pch = pch, col = pal, bty = "n")
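The group-specific slopes behind these curves can also be worked out by hand from the coefficient table. The sketch below uses only base R and the estimates printed in the summary. It assumes that dyslexia enters the model as a -1/+1 contrast (control = -1, dyslexic = +1) and that iq is on a standardized scale. Both assumptions are consistent with the summary and the plot but are not stated on this page, so check ?ReadingSkills before relying on them.

```r
# Mean-model coefficients copied from the summary output (logit link)
b <- c(intercept = 1.1232, dyslexia = -0.7416, iq = 0.4864, dyslexia_iq = -0.5813)

# Assumption: dyslexia is coded as a -1/+1 contrast
d <- c(control = -1, dyslexic = 1)

# Slope of iq on the logit scale within each group
slope <- b[["iq"]] + b[["dyslexia_iq"]] * d
slope

# Fitted mean accuracy at iq = -1, 0, 1; plogis() is the inverse logit,
# so every fitted value lies strictly inside (0, 1)
iq <- c(-1, 0, 1)
mu <- sapply(names(d), function(g) {
  plogis(b[["intercept"]] + b[["dyslexia"]] * d[[g]] + slope[[g]] * iq)
})
rownames(mu) <- paste0("iq=", iq)
round(mu, 3)
```

Under these assumptions the control slope is clearly positive and the dyslexic slope is slightly negative, matching the description of the summary.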

Extended models

To go beyond this basic analysis, the following extensions can be considered.

Bias reduction

Bias-reduced estimation (instead of the default maximum likelihood estimation) can be used by adding the argument type = "BR" in betareg(). This slightly shrinks all coefficient estimates but leads to qualitatively identical results.

betareg(accuracy ~ dyslexia * iq | dyslexia + iq, data = ReadingSkills, type = "BR")
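To see how much the estimates move, the two fits can be placed side by side. This is a minimal sketch that assumes betareg is installed. The object names f, br_ml and br_br are chosen here for illustration.

```r
library("betareg")
data("ReadingSkills", package = "betareg")

# Same model formula for both estimators
f <- accuracy ~ dyslexia * iq | dyslexia + iq

br_ml <- betareg(f, data = ReadingSkills)               # default: maximum likelihood
br_br <- betareg(f, data = ReadingSkills, type = "BR")  # bias-reduced estimation

# Coefficients of the mean and precision models, side by side
round(cbind(ML = coef(br_ml), BR = coef(br_br)), 4)
```

Comparing the two columns shows whether any sign or rough magnitude changes, which is what "qualitatively identical" refers to.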

Extended-support beta regression

To analyze the original accuracy scores in [0, 1] (without scaling the perfect scores of 1 to 0.99), use the variable accuracy1 instead of accuracy in the code above. The betareg() model then estimates an additional exceedence parameter that accounts for the boundary probability of a perfect score.

betareg(accuracy1 ~ dyslexia * iq | dyslexia + iq, data = ReadingSkills)
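Before fitting, it can help to confirm that the response really has boundary observations. The following sketch, which assumes a betareg version whose ReadingSkills data includes accuracy1, counts the perfect scores and compares the ranges of the two response variables.

```r
data("ReadingSkills", package = "betareg")

# Number of children with a perfect score in the original variable
sum(ReadingSkills$accuracy1 == 1)

# Ranges of the scaled and the original response:
# accuracy stays below 1, accuracy1 reaches the boundary
sapply(ReadingSkills[c("accuracy", "accuracy1")], range)
```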

Beta regression trees

To find subgroups in a beta regression by recursively splitting subsamples (rather than fixing the dyslexia interaction in advance), beta regression trees can be used:

betatree(accuracy ~ iq | iq, ~ dyslexia + ..., data = ReadingSkills, minsize = 10)

See the documentation of betatree() for more details.
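As one possible concrete call, the sketch below adds two pure noise variables as further partitioning candidates next to dyslexia. The variables x1 and x2 and the seed are hypothetical and only serve to illustrate that the tree should ideally split on dyslexia and ignore the noise. The sketch assumes that betareg and its suggested package partykit are installed.

```r
library("betareg")
data("ReadingSkills", package = "betareg")

# Hypothetical noise variables, added only for illustration
set.seed(1071)
ReadingSkills$x1 <- rnorm(nrow(ReadingSkills))
ReadingSkills$x2 <- runif(nrow(ReadingSkills))

# Beta regression of accuracy on iq (mean and precision),
# partitioned with respect to dyslexia and the noise variables
rs_tree <- betatree(accuracy ~ iq | iq, ~ dyslexia + x1 + x2,
  data = ReadingSkills, minsize = 10)
print(rs_tree)
```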

Finite mixtures of beta regressions

To find clusters in a beta regression (without even having the dyslexia information), finite mixtures of beta regressions can be used:

betamix(accuracy ~ iq, data = ReadingSkills, k = 3, ...)

See the documentation of betamix() for more details.

Monthly Downloads: 12,631

Version: 3.2-0

License: GPL-2 | GPL-3

Maintainer: Achim Zeileis

Last Published: July 7th, 2024

Functions in betareg (3.2-0)

  • betamix: Finite Mixtures of Beta Regression for Rates and Proportions
  • XBeta: Create an Extended-Support Beta Distribution
  • XBetaX: Create an Extended-Support Beta Mixture Distribution
  • StressAnxiety: Dependency of Anxiety on Stress
  • betareg.control: Control Parameters for Beta Regression
  • beta4: The 4-Parameter Beta Distribution in Regression Parameterization
  • betatree: Beta Regression Trees
  • betareg: Beta Regression for Rates and Proportions
  • beta01: The Zero- and/or One-Inflated Beta Distribution in Regression Parameterization
  • WeatherTask: Weather Task With Priming and Precise and Imprecise Probabilities
  • xbetax: The Extended-Support Beta Mixture Distribution
  • xbeta: The Extended-Support Beta Distribution
  • plot.betareg: Diagnostic Plots for betareg Objects
  • predict.betareg: Prediction Method for betareg Objects
  • betar: The Beta Distribution in Regression Parameterization
  • residuals.betareg: Residuals Method for betareg Objects
  • summary.betareg: Methods for betareg Objects
  • gleverage: Generalized Leverage Values
  • BetaR: Create a Beta Regression Distribution
  • Beta01: Create a Zero- and/or One-Inflated Beta Distribution
  • ImpreciseTask: Imprecise Probabilities for Sunday Weather and Boeing Stock Task
  • Beta4: Create a 4-Parameter Beta Distribution
  • CarTask: Partition-primed Probability Judgement Task for Car Dealership
  • ReadingSkills: Dyslexia and IQ Predicting Reading Accuracy
  • FoodExpenditure: Proportion of Household Income Spent on Food
  • LossAversion: (No) Myopic Loss Aversion in Adolescents
  • GasolineYield: Estimation of Gasoline Yields from Crude Oil
  • MockJurors: Confidence of Mock Jurors in Their Verdicts