Given a data frame and a model, adds draws from the (possibly transformed) posterior "fit" (aka the linear/link-level predictor), the posterior predictions of the model, or the residuals of a model to the data frame in a long format.
add_fitted_draws(
newdata,
model,
value = ".value",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category",
dpar = FALSE,
scale = c("response", "linear")
)
fitted_draws(
model,
newdata,
value = ".value",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category",
dpar = FALSE,
scale = c("response", "linear")
)
add_linpred_draws(
newdata,
model,
value = ".value",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category",
dpar = FALSE,
scale = c("response", "linear")
)
linpred_draws(
model,
newdata,
value = ".value",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category",
dpar = FALSE,
scale = c("response", "linear")
)
# S3 method for default
fitted_draws(model, newdata, ...)
# S3 method for stanreg
fitted_draws(
model,
newdata,
value = ".value",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category",
dpar = FALSE,
scale = c("response", "linear")
)
# S3 method for brmsfit
fitted_draws(
model,
newdata,
value = ".value",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category",
dpar = FALSE,
scale = c("response", "linear")
)
add_predicted_draws(
newdata,
model,
prediction = ".prediction",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category"
)
predicted_draws(
model,
newdata,
prediction = ".prediction",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category"
)
# S3 method for default
predicted_draws(model, newdata, ...)
# S3 method for stanreg
predicted_draws(
model,
newdata,
prediction = ".prediction",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category"
)
# S3 method for brmsfit
predicted_draws(
model,
newdata,
prediction = ".prediction",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category"
)
add_residual_draws(
newdata,
model,
residual = ".residual",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category"
)
residual_draws(
model,
newdata,
residual = ".residual",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category"
)
# S3 method for default
residual_draws(model, newdata, ...)
# S3 method for brmsfit
residual_draws(
model,
newdata,
residual = ".residual",
...,
n = NULL,
seed = NULL,
re_formula = NULL,
category = ".category"
)
Data frame to generate predictions from. If omitted, most model types will generate predictions from the data used to fit the model.
A supported Bayesian model fit that can provide fits and predictions. Supported models are listed in the second section of tidybayes-models: Models Supporting Prediction. While other functions in this package (like spread_draws()) support a wider range of models, add_fitted_draws and add_predicted_draws require a model that provides an interface for generating predictions. More generic Bayesian modeling interfaces like runjags and rstan are therefore not directly supported by these functions; only wrappers around those languages that provide predictions, like rstanarm and brms, are supported here.
The name of the output column for fitted_draws; default ".value".
Additional arguments passed to the underlying prediction method for the type of model given.
The number of draws per prediction / fit to return, or NULL to return all draws.
A seed to use when subsampling draws (i.e. when n is not NULL).
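As an illustration of how n and seed interact, the following base R sketch subsamples a matrix of draws reproducibly. The draws matrix and the subsample_draws() helper are hypothetical and written only for this example; this is not the code tidybayes itself uses.

```r
# Hypothetical posterior draws: 1000 draws (rows) for 3 observations (columns)
set.seed(1)
draws <- matrix(rnorm(3000), nrow = 1000, ncol = 3)

# Hypothetical helper mimicking the documented meaning of n and seed
subsample_draws <- function(draws, n = NULL, seed = NULL) {
  if (is.null(n)) return(draws)        # n = NULL: keep every draw
  if (!is.null(seed)) set.seed(seed)   # the seed only matters when subsampling
  idx <- sample(nrow(draws), n)        # pick n draws without replacement
  draws[idx, , drop = FALSE]
}

a <- subsample_draws(draws, n = 100, seed = 42)
b <- subsample_draws(draws, n = 100, seed = 42)
identical(a, b)  # the same seed selects the same subset of draws
```

Fixing the seed in this way is what makes a plot of, say, 100 fit lines reproducible from one run to the next.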
A formula containing group-level effects to be considered in the prediction. If NULL (default), include all group-level effects; if NA, include no group-level effects. Some model types (such as brms::brmsfit and rstanarm::stanreg objects) allow marginalizing over grouping factors by specifying new levels of a factor in newdata. In the case of brms::brm(), you must also pass allow_new_levels = TRUE here to include new levels (see brms::predict.brmsfit()).
For some ordinal, multinomial, and multivariate models (notably, brms::brm() models but not rstanarm::stan_polr() models), multiple sets of rows will be returned per input row for fitted_draws or predicted_draws, depending on the model type. For ordinal/multinomial models, these rows correspond to different categories of the response variable. For multivariate models, these correspond to different response variables. The category argument specifies the name of the column to put the category names (or variable names) into in the resulting data frame. The default name of this column (".category") reflects the fact that this functionality was originally used only for ordinal models and has been re-used for multivariate models. The fact that multiple rows per response are returned only for some model types reflects the fact that tidybayes takes the approach of tidying whatever output is given to us, and the output from different modeling functions differs on this point.
See vignette("tidy-brms") and vignette("tidy-rstanarm") for examples of dealing with output from ordinal models using both approaches.
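To make the "multiple rows per input row" shape concrete, here is a base R sketch that flattens a hypothetical array of category probabilities (2 draws, 2 input rows, 3 ordinal categories) into long form. The numbers and category labels are invented for illustration, and the reshaping is only an analogy for the kind of output described above, not the package's own code.

```r
# Hypothetical category probabilities, indexed by draw, input row, and category
probs <- array(
  NA_real_, dim = c(2, 2, 3),
  dimnames = list(
    .draw = c("1", "2"),
    .row = c("1", "2"),
    .category = c("low", "mid", "high")
  )
)
probs[, , "low"]  <- c(0.2, 0.3, 0.6, 0.5)
probs[, , "mid"]  <- c(0.5, 0.4, 0.3, 0.3)
probs[, , "high"] <- c(0.3, 0.3, 0.1, 0.2)

# One long row per (draw, input row, category) combination;
# expand.grid varies the first dimension fastest, matching as.vector()
long <- expand.grid(dimnames(probs), stringsAsFactors = FALSE)
long$.value <- as.vector(probs)

# Each input row now appears once per category within each draw
long[long$.draw == "1" & long$.row == "1", ]
```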
For fitted_draws and add_fitted_draws: Should distributional regression parameters be included in the output? Valid only for models that support distributional regression parameters, such as submodels for variance parameters (as in brm). If TRUE, distributional regression parameters are included in the output as additional columns named after each parameter (alternative names can be provided using a list or named vector, e.g. c(sigma.hat = "sigma") would output the "sigma" parameter from a model as a column named "sigma.hat"). If FALSE (the default), distributional regression parameters are not included.
Either "response" or "linear". If "response", results are returned on the scale of the response variable. If "linear", fitted values are returned on the scale of the linear predictor.
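The difference between the two scales can be sketched in base R for a model with a logit link. The link function is an assumption made for this example (the actual link depends on the model family), and eta holds made-up values rather than draws from a fitted model.

```r
# Hypothetical draws on the linear-predictor (logit) scale
eta <- c(-2, 0, 2)

# The same draws on the response (probability) scale via the inverse logit
mu <- plogis(eta)

# Applying the link function again recovers the linear scale
all.equal(qlogis(mu), eta)
```

On the linear scale the values are unbounded, while on the response scale they are probabilities between 0 and 1.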
The name of the output column for predicted_draws; default ".prediction".
The name of the output column for residual_draws; default ".residual".
A data frame (actually, a tibble) with a .row column (a factor grouping rows from the input newdata), .chain column (the chain each draw came from, or NA if the model does not provide chain information), .iteration column (the iteration the draw came from, or NA if the model does not provide iteration information), and a .draw column (a unique index corresponding to each draw from the distribution). In addition, fitted_draws includes a column with its name specified by the value argument (default is .value) containing draws from the (transformed) linear predictor, and predicted_draws contains a .prediction column containing draws from the posterior predictive distribution. For convenience, the resulting data frame comes grouped by the original input rows.
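The layout described above can be mimicked in base R. This sketch builds a plain data frame by hand from a made-up matrix of draws (3 draws for 2 input rows); the values are invented, and the real functions return a grouped tibble rather than a plain data frame.

```r
# Hypothetical input rows and draws (rows = draws, columns = rows of newdata)
newdata <- data.frame(hp = c(100, 150))
draws <- matrix(c(20.1, 19.8, 20.4, 17.2, 16.9, 17.5), nrow = 3)

n_draws <- nrow(draws)
n_rows <- nrow(newdata)

# Repeat each input row once per draw and stack the draws column by column
long <- data.frame(
  hp = rep(newdata$hp, each = n_draws),
  .row = rep(seq_len(n_rows), each = n_draws),
  .chain = NA_integer_,      # NA: this toy example has no chain information
  .iteration = NA_integer_,  # NA: no iteration information either
  .draw = rep(seq_len(n_draws), times = n_rows),
  .value = as.vector(draws)
)
long
```

The result has one row per combination of input row and draw, which is what makes it convenient to pass straight to ggplot2 or to summarise by group.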
add_fitted_draws adds draws from (possibly transformed) posterior linear predictors (or "link-level" predictors) to the data. It corresponds to rstanarm::posterior_linpred() in rstanarm or brms::fitted.brmsfit() in brms.
add_predicted_draws adds draws from posterior predictions to the data. It corresponds to rstanarm::posterior_predict() in rstanarm or brms::predict.brmsfit() in brms.
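The practical difference between the two kinds of draws can be sketched in base R for a simple normal model. The draws of mu and sigma below are simulated stand-ins for a posterior, not output from any fitted model, and the variable names are hypothetical.

```r
set.seed(123)
n_draws <- 4000

# Hypothetical posterior draws for a single observation in a normal model
mu <- rnorm(n_draws, mean = 20, sd = 0.3)         # draws of the expected response
sigma <- abs(rnorm(n_draws, mean = 2, sd = 0.1))  # draws of the residual SD

# Fitted-style draws carry only the uncertainty in the mean
fitted_like <- mu

# Predicted-style draws add observation-level noise on top of that
predicted_like <- rnorm(n_draws, mean = mu, sd = sigma)

c(fitted = sd(fitted_like), predicted = sd(predicted_like))
```

The predicted-style draws are considerably more spread out, which is why posterior predictive intervals are wider than intervals around the fit line.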
add_fitted_draws and fitted_draws are alternate spellings of the same function with opposite order of the first two arguments to facilitate use in data processing pipelines that start either with a data frame or a model. Similarly, add_predicted_draws and predicted_draws are alternate spellings.
Given equal choice between the two, add_fitted_draws and add_predicted_draws are the preferred spellings.
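The swapped-argument pattern can be illustrated with two hypothetical stand-in functions built on lm() and predict(). These are not tidybayes functions and the model is not Bayesian; the sketch only shows why having both argument orders is convenient.

```r
# Hypothetical stand-ins: model first, and data first
fitted_vals <- function(model, newdata) {
  cbind(newdata, .value = predict(model, newdata))
}
add_fitted_vals <- function(newdata, model) {
  fitted_vals(model, newdata)
}

m <- lm(mpg ~ hp, data = mtcars)
grid <- data.frame(hp = c(100, 200))

# Both spellings give the same result; the data-first one suits
# pipelines that start from a data frame
identical(add_fitted_vals(grid, m), fitted_vals(m, grid))
```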
add_linpred_draws and linpred_draws are alternative spellings of add_fitted_draws and fitted_draws, respectively, for consistency with rstanarm terminology (specifically rstanarm::posterior_linpred()).
See add_draws() for the variant of these functions for use with packages that do not yet have explicit support for them. See spread_draws() for manipulating posteriors directly.
# NOT RUN {
library(ggplot2)
library(dplyr)
library(tidybayes)

if (
  require("rstanarm", quietly = TRUE) &&
  require("modelr", quietly = TRUE)
) {

  theme_set(theme_light())

  m_mpg = stan_glm(mpg ~ hp * cyl, data = mtcars,
    # 1 chain / few iterations just so example runs quickly
    # do not use in practice
    chains = 1, iter = 500)

  # draw 100 fit lines from the posterior and overplot them
  mtcars %>%
    group_by(cyl) %>%
    data_grid(hp = seq_range(hp, n = 101)) %>%
    add_fitted_draws(m_mpg, n = 100) %>%
    ggplot(aes(x = hp, y = mpg, color = ordered(cyl))) +
    geom_line(aes(y = .value, group = paste(cyl, .draw)), alpha = 0.25) +
    geom_point(data = mtcars)

  # plot posterior predictive intervals
  mtcars %>%
    group_by(cyl) %>%
    data_grid(hp = seq_range(hp, n = 101)) %>%
    add_predicted_draws(m_mpg) %>%
    ggplot(aes(x = hp, y = mpg, color = ordered(cyl))) +
    stat_lineribbon(aes(y = .prediction), .width = c(.99, .95, .8, .5), alpha = 0.25) +
    geom_point(data = mtcars) +
    scale_fill_brewer(palette = "Greys")
}
# }