phi (if applicable).
stan_betareg(formula, data, subset, na.action, weights, offset,
             link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"),
             link.phi = NULL, model = TRUE, y = TRUE, x = FALSE, ...,
             prior = normal(), prior_intercept = normal(),
             prior_z = normal(), prior_intercept_z = normal(),
             prior_phi = cauchy(0, 5), prior_PD = FALSE,
             algorithm = c("sampling", "optimizing", "meanfield", "fullrank"),
             adapt_delta = NULL, QR = FALSE)
stan_betareg.fit(x, y, z = NULL, weights = rep(1, NROW(x)),
                 offset = rep(0, NROW(x)),
                 link = c("logit", "probit", "cloglog", "cauchit", "log", "loglog"),
                 link.phi = NULL, ...,
                 prior = normal(), prior_intercept = normal(),
                 prior_z = normal(), prior_intercept_z = normal(),
                 prior_phi = cauchy(0, 5), prior_PD = FALSE,
                 algorithm = c("sampling", "optimizing", "meanfield", "fullrank"),
                 adapt_delta = NULL, QR = FALSE)
formula, data, subset: Same as betareg.
na.action: Same as betareg, but rarely specified.
link: Character specification of the link function used in the model for mu
(specified through x). Currently, "logit", "probit", "cloglog", "cauchit",
"log", and "loglog" are supported.
link.phi: If applicable, character specification of the link function used in
the model for phi (specified through z). Currently, "identity", "log"
(default), and "sqrt" are supported. Since the "sqrt" link function is known
to be unstable, it is advisable to specify a different link function (or to
model phi as a scalar parameter instead of via a linear predictor by excluding
z from the formula and excluding link.phi).
model, offset, weights: Same as betareg.
x, y: In stan_betareg, logical scalars indicating whether to return the design
matrix and response vector. In stan_betareg.fit, a design matrix and response
vector.
...: Further arguments passed to the function in the rstan package (sampling,
vb, or optimizing), corresponding to the estimation method named by algorithm.
For example, if algorithm is "sampling" it is possible to specify iter, chains,
cores, refresh, etc.
prior: The prior distribution for the regression coefficients. prior
should be a call to one of the various functions provided by
rstanarm for specifying priors. The subset of these functions that
can be used for the prior on the coefficients can be grouped into several
"families":
Family | Functions
Student t family | normal, student_t, cauchy
Hierarchical shrinkage family | hs, hs_plus
Laplace family | laplace, lasso
Product normal family | product_normal
See the priors help page for details on the families and
how to specify the arguments for all of the functions in the table above.
To omit a prior (i.e., to use a flat, improper uniform prior), set prior to
NULL, although this is rarely a good idea.
Note: Unless QR=TRUE, if prior is from the Student t family or Laplace family,
and if the autoscale argument to the function used to specify the prior (e.g.
normal) is left at its default and recommended value of TRUE, then the default
or user-specified prior scale(s) may be adjusted internally based on the scales
of the predictors. See the priors help page for details on the rescaling and
the prior_summary function for a summary of the priors used for a particular
model.
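As a rough illustration of the families in the table above, the sketch below builds a few prior specifications. It assumes rstanarm is installed (the calls are skipped otherwise). The object names and the particular df and scale values are arbitrary choices made for illustration, not recommendations.

```r
# Illustrative only: object names and hyperparameter values are assumptions.
if (requireNamespace("rstanarm", quietly = TRUE)) {
  library(rstanarm)

  # Student t family: priors on the regression coefficients
  coef_prior_normal <- normal(location = 0, scale = 2.5)
  coef_prior_t      <- student_t(df = 4, location = 0, scale = 2.5)

  # Hierarchical shrinkage family, left at its defaults
  coef_prior_hs <- hs()

  # Laplace family
  coef_prior_laplace <- laplace(location = 0, scale = 1)

  # Flat (improper) prior: pass NULL instead of a prior specification
  coef_prior_flat <- NULL

  # Inspect one specification before handing it to `prior = ...`
  str(coef_prior_t)
}
```

Any of these objects could then be supplied as the prior argument, for example prior = coef_prior_t, and prior_summary on the fitted model reports the priors that were actually used.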
prior_intercept: The prior distribution for the intercept. prior_intercept can
be a call to normal, student_t or cauchy. See the priors help page for details
on these functions. To omit a prior on the intercept (i.e., to use a flat,
improper uniform prior), set prior_intercept to NULL.
Note: If using a dense representation of the design matrix (i.e., if the sparse
argument is left at its default value of FALSE), then the prior distribution
for the intercept is set so it applies to the value when all predictors are
centered.
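What it means for the intercept prior to apply "when all predictors are centered" can be seen in an ordinary least-squares fit. The sketch below uses base R's lm purely as an analogy (it is not how rstanarm fits the model), and the variable names are made up for illustration. Centering the predictor leaves the slope unchanged and turns the intercept into the expected value of the linear predictor at the predictor's mean.

```r
set.seed(123)
n <- 100
x <- rnorm(n, mean = 2, sd = 1)
y <- 1 + 0.5 * x + rnorm(n, sd = 0.3)

fit_raw      <- lm(y ~ x)            # intercept: expected y at x = 0
x_centered   <- x - mean(x)
fit_centered <- lm(y ~ x_centered)   # intercept: expected y at x = mean(x)

b_raw      <- unname(coef(fit_raw))
b_centered <- unname(coef(fit_centered))

# Slopes agree; the centered intercept equals raw intercept + slope * mean(x)
all.equal(b_raw[2], b_centered[2])
all.equal(b_centered[1], b_raw[1] + b_raw[2] * mean(x))
```

A prior such as normal(0, 5) on the intercept is therefore a statement about the outcome at average predictor values rather than at the point where every predictor equals zero.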
prior_z: Prior distribution for the coefficients in the model for phi (if
applicable). Same options as for prior.
prior_intercept_z: Prior distribution for the intercept in the model for phi
(if applicable). Same options as for prior_intercept.
prior_phi: The prior distribution for phi if it is not modeled as a function of
predictors. If z variables are specified then prior_phi is ignored and
prior_intercept_z and prior_z are used to specify the priors on the intercept
and coefficients in the model for phi. When applicable, prior_phi can be a call
to exponential to use an exponential distribution, or one of normal, student_t
or cauchy to use a half-normal, half-t, or half-Cauchy prior. See priors for
details on these functions. To omit a prior (i.e., to use a flat, improper
uniform prior), set prior_phi to NULL.
prior_PD: A logical scalar (defaulting to FALSE) indicating whether to draw
from the prior predictive distribution instead of conditioning on the outcome.
algorithm: A string (possibly abbreviated) indicating the estimation approach
to use. Can be "sampling"
for MCMC (the default), "optimizing" for optimization, "meanfield" for
variational inference with independent normal distributions, or "fullrank" for
variational inference with a multivariate normal distribution. See
rstanarm-package for more details on the estimation algorithms. NOTE: not all
fitting functions support all four algorithms.
adapt_delta: Only relevant if algorithm="sampling". See adapt_delta for details.
QR: A logical scalar (defaulting to FALSE) but if TRUE applies a scaled qr
decomposition to the design matrix, $X = Q^\ast R^\ast$, where
$Q^\ast = Q \sqrt{n-1}$ and $R^\ast = \frac{1}{\sqrt{n-1}} R$. The coefficients
relative to $Q^\ast$ are obtained and then premultiplied by the inverse of
$R^\ast$ to obtain coefficients relative to the original predictors, $X$. These
transformations do not change the likelihood of the data but are recommended
for computational reasons when there are multiple predictors. However, when QR
is TRUE the prior argument applies to the coefficients relative to $Q^\ast$,
which are not very interpretable, so it is hard to specify an informative
prior. Setting QR=TRUE is therefore only recommended if you do not have an
informative prior for the regression coefficients.
z: For stan_betareg.fit, a regressor matrix for phi. Defaults to an intercept
only.
The stan_betareg function is similar in syntax to betareg, but rather than
performing maximum likelihood estimation, full Bayesian estimation is performed
(if algorithm is "sampling") via MCMC. The Bayesian model adds priors
(independent by default) on the coefficients of the beta regression model. The
stan_betareg function calls the workhorse stan_betareg.fit function, but it is
also possible to call the latter directly.
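The simulated-data example on this page generates the outcome as rbeta(N, mu * phi, (1 - mu) * phi), that is, with mu acting as the mean and phi as a precision parameter. The base R sketch below, with arbitrary illustrative values for the two linear predictors, checks the moments implied by that parameterization: the mean is mu and the variance is mu(1 - mu)/(1 + phi), so larger phi means less dispersion. It is only a check of the parameterization, not a description of rstanarm internals.

```r
set.seed(42)

# Assumed values for illustration: linear predictors for one observation
eta_mu  <- 1 + 0.2 * 2      # linear predictor for the mean
eta_phi <- 1.5 + 0.4 * 2    # linear predictor for the precision

mu  <- plogis(eta_mu)       # inverse logit link, maps to (0, 1)
phi <- exp(eta_phi)         # inverse log link, maps to (0, Inf)

# Mean-precision parameterization of the beta distribution
draws <- rbeta(1e5, shape1 = mu * phi, shape2 = (1 - mu) * phi)

# Compare simulated moments with the theoretical ones
c(sim_mean = mean(draws), theory_mean = mu)
c(sim_var = var(draws), theory_var = mu * (1 - mu) / (1 + phi))
```

Under this reading, the coefficients on x shift where the outcome is centered, while the coefficients on z (or a scalar phi) control how tightly it concentrates around that center.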
stanreg-methods and betareg.
The vignette for stan_betareg.
library(rstanarm)

### Simulated data
N <- 200
x <- rnorm(N, 2, 1)
z <- rnorm(N, 2, 1)
mu <- binomial(link = "logit")$linkinv(1 + 0.2*x)  # mean, via inverse logit
phi <- exp(1.5 + 0.4*z)                            # precision, via inverse log
y <- rbeta(N, mu * phi, (1 - mu) * phi)
hist(y, col = "dark grey", border = FALSE, xlim = c(0,1))
fake_dat <- data.frame(y, x, z)

# Mean model uses x, precision model uses z (formula: y ~ x | z)
fit <- stan_betareg(y ~ x | z, data = fake_dat,
                    link = "logit", link.phi = "log",
                    chains = 1, iter = 250) # for speed
print(fit, digits = 2)
plot(fit)
pp_check(fit)
prior_summary(fit)
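The scaled QR transformation described for the QR argument can be reproduced with base R's qr(). The sketch below uses ordinary least squares as a stand-in for the actual model, and all object names are illustrative. It only shows that $X = Q^\ast R^\ast$ holds and that coefficients estimated relative to $Q^\ast$ map back to the original predictors through the inverse of $R^\ast$. Any additional preprocessing rstanarm may apply before the decomposition is not covered here.

```r
set.seed(1)
n <- 50
X <- cbind(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))
y <- drop(X %*% c(0.5, -1, 2)) + rnorm(n, sd = 0.1)

decomp <- qr(X)
Q_star <- qr.Q(decomp) * sqrt(n - 1)   # Q* = Q (n-1)^0.5
R_star <- qr.R(decomp) / sqrt(n - 1)   # R* = (n-1)^(-0.5) R

# The product recovers the design matrix (up to floating-point error)
recon_error <- max(abs(Q_star %*% R_star - X))

# Fit on Q*, then premultiply by the inverse of R* to return to X's scale
theta        <- unname(coef(lm(y ~ Q_star - 1)))
beta_from_qr <- drop(solve(R_star, theta))
beta_direct  <- unname(coef(lm(y ~ X - 1)))

cbind(beta_from_qr, beta_direct)
```

The two coefficient columns agree, which illustrates why the transformation leaves the likelihood unchanged while any prior placed on theta is a prior on a rotated, rescaled version of the original coefficients.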