predict.brmsfit
Model Predictions of brmsfit Objects
Predict responses based on the fitted model.
Predictions can be computed for the data used to fit the model
(posterior predictive checks) or for new data.
By definition, these predictions have higher variance than
predictions of the fitted values (i.e., the 'regression line')
computed by the fitted method, because the residual (measurement)
error is incorporated.
The estimated means of both methods should, however, be very similar.
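The difference between the two methods can be illustrated without fitting a model. The following is a minimal base R sketch, assuming a hypothetical Gaussian model with made-up posterior draws of the mean (mu) and a fixed residual standard deviation (sigma); none of these names come from brms.

```r
set.seed(1)
n_draws <- 4000

# Hypothetical posterior draws of the expected response for one observation
mu <- rnorm(n_draws, mean = 10, sd = 0.5)
# Hypothetical residual standard deviation, held fixed for simplicity
sigma <- 2

# "Fitted" draws carry only the uncertainty in the mean
fitted_draws <- mu
# "Predicted" draws add residual error on top of that uncertainty
predicted_draws <- rnorm(n_draws, mean = mu, sd = sigma)

# The means should be close, the spread of the predictions clearly larger
round(c(fitted = mean(fitted_draws), predicted = mean(predicted_draws)), 2)
round(c(fitted = sd(fitted_draws), predicted = sd(predicted_draws)), 2)
```

In this sketch, the standard deviation of the predicted draws combines the uncertainty in mu with sigma, which is why it exceeds that of the fitted draws.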
Usage
# S3 method for brmsfit
predict(object, newdata = NULL, re_formula = NULL,
transform = NULL, allow_new_levels = FALSE,
sample_new_levels = "uncertainty", new_objects = list(),
incl_autocor = TRUE, negative_rt = FALSE, subset = NULL,
nsamples = NULL, sort = FALSE, nug = NULL, ntrys = 5,
summary = TRUE, robust = FALSE, probs = c(0.025, 0.975), ...)

# S3 method for brmsfit
posterior_predict(object, newdata = NULL,
re_formula = NULL, transform = NULL, allow_new_levels = FALSE,
sample_new_levels = "uncertainty", new_objects = list(),
incl_autocor = TRUE, negative_rt = FALSE, subset = NULL,
nsamples = NULL, sort = FALSE, nug = NULL, ntrys = 5,
robust = FALSE, probs = c(0.025, 0.975), ...)
Arguments
- object
An object of class brmsfit.
- newdata
An optional data.frame for which to evaluate predictions. If NULL (default), the original data of the model is used.
- re_formula
A formula containing group-level effects to be considered in the prediction. If NULL (default), include all group-level effects; if NA, include no group-level effects.
- transform
A function or a character string naming a function to be applied to the predicted responses before summary statistics are computed.
- allow_new_levels
A flag indicating whether new levels of group-level effects are allowed (defaults to FALSE). Only relevant if newdata is provided.
- sample_new_levels
Indicates how to sample new levels for grouping factors specified in re_formula. This argument is only relevant if newdata is provided and allow_new_levels is set to TRUE. If "uncertainty" (default), include group-level uncertainty in the predictions based on the variation of the existing levels. If "gaussian", sample new levels from the (multivariate) normal distribution implied by the group-level standard deviations and correlations. This option may be useful for conducting Bayesian power analysis. If "old_levels", directly sample new levels from the existing levels.
- new_objects
A named list of objects containing new data that cannot be passed via argument newdata. Currently only required for objects passed to cor_sar and cor_fixed.
- incl_autocor
A flag indicating whether ARMA autocorrelation parameters should be included in the predictions. Defaults to TRUE. Setting it to FALSE will not affect other correlation structures such as cor_bsts or cor_fixed.
- negative_rt
Only relevant for Wiener diffusion models. A flag indicating whether response times of responses on the lower boundary should be returned as negative values. This makes it possible to distinguish responses on the upper and lower boundary. Defaults to FALSE.
- subset
A numeric vector specifying the posterior samples to be used. If NULL (the default), all samples are used.
- nsamples
Positive integer indicating how many posterior samples should be used. If NULL (the default), all samples are used. Ignored if subset is not NULL.
- sort
Logical. Only relevant for time series models. Indicates whether to return predicted values in the original order (FALSE; default) or in the order of the time series (TRUE).
- nug
Small positive number for Gaussian process terms only. For numerical reasons, the covariance matrix of a Gaussian process might not be positive definite. Adding a very small number to the matrix's diagonal often solves this problem. If NULL (the default), nug is chosen internally.
- ntrys
Parameter used in rejection sampling for truncated discrete models only (defaults to 5). See Details for more information.
- summary
Should summary statistics (i.e. means, sds, and 95% intervals) be returned instead of the raw values? Default is TRUE.
- robust
If FALSE (the default), the mean is used as the measure of central tendency and the standard deviation as the measure of variability. If TRUE, the median and the median absolute deviation (MAD) are applied instead. Only used if summary is TRUE.
- probs
The percentiles to be computed by the quantile function. Only used if summary is TRUE.
- ...
Currently ignored.
Details
NA values within factors in newdata are interpreted as if all
dummy variables of this factor are zero. This makes it possible,
for instance, to predict the grand mean when using sum coding.
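The role of sum coding here can be illustrated with an ordinary linear model. The following base R sketch uses lm rather than brms, with simulated data and illustrative names (d, fit_lm, x_na), so it shows only the design-matrix idea and not how brms handles NA internally. With sum-to-zero contrasts, a row whose dummy columns are all zero predicts the intercept, which equals the unweighted mean of the group means.

```r
set.seed(2)

# Simulated data with three unbalanced groups
d <- data.frame(
  g = factor(rep(c("a", "b", "c"), times = c(5, 10, 15))),
  y = rnorm(30)
)

# Use sum-to-zero coding for the factor
contrasts(d$g) <- contr.sum(3)
fit_lm <- lm(y ~ g, data = d)

# Take one design-matrix row and set all dummy columns of g to zero,
# mimicking how an NA factor value is described as being treated
x_na <- model.matrix(fit_lm)[1, ]
x_na[-1] <- 0
pred_na <- sum(x_na * coef(fit_lm))

# Under sum coding this equals the unweighted mean of the group means
grand_mean <- mean(tapply(d$y, d$g, mean))
all.equal(pred_na, grand_mean)
```

Under treatment (dummy) coding, the same all-zero row would instead predict the reference level, which is why the grand-mean interpretation is tied to sum coding.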
Method posterior_predict.brmsfit is an alias of predict.brmsfit
with summary = FALSE.
For truncated discrete models only:
In the absence of any general algorithm to sample
from truncated discrete distributions,
rejection sampling is applied in this special case.
This means that values are sampled until
a value lies within the defined truncation boundaries.
In practice, this procedure may be rather slow (especially in R).
Thus, we try to do approximate rejection sampling
by sampling each value ntrys times and then selecting a valid value.
If all values are invalid, the closest boundary is used instead.
If there are more than a few of these pathological cases,
a warning will occur suggesting to increase argument ntrys.
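The approximate rejection sampling described above can be sketched in base R for a truncated Poisson distribution. This is an illustration under stated assumptions, not the code brms runs: the helper rtrunc_pois_approx, the rule for picking the closest boundary (based on the first candidate), and the 1% warning threshold are all hypothetical.

```r
# Hypothetical helper (not part of brms): approximate rejection sampling
# from a Poisson distribution truncated to [lb, ub]
rtrunc_pois_approx <- function(n, lambda, lb, ub, ntrys = 5) {
  # One row per requested value, ntrys candidate draws per row
  cand <- matrix(rpois(n * ntrys, lambda), nrow = n, ncol = ntrys)
  valid <- cand >= lb & cand <= ub
  out <- numeric(n)
  n_invalid <- 0
  for (i in seq_len(n)) {
    idx <- which(valid[i, ])
    if (length(idx) > 0) {
      # Select the first candidate inside the truncation boundaries
      out[i] <- cand[i, idx[1]]
    } else {
      # All candidates invalid: fall back to the boundary closest to
      # the first candidate (a simplifying assumption of this sketch)
      n_invalid <- n_invalid + 1
      first <- cand[i, 1]
      out[i] <- if (abs(first - lb) <= abs(first - ub)) lb else ub
    }
  }
  # Assumed threshold for "more than a few" pathological cases
  if (n_invalid / n > 0.01) {
    warning("Many draws fell outside the boundaries; consider increasing ntrys.")
  }
  out
}

set.seed(3)
x <- rtrunc_pois_approx(1000, lambda = 3, lb = 1, ub = 6)
range(x)
```

Increasing ntrys makes it less likely that every candidate for a value is invalid, at the cost of drawing more candidates per value.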
Value
Predicted values of the response variable.
If summary = TRUE, the output depends on the family:
For categorical and ordinal families, it is an N x C matrix,
where N is the number of observations and
C is the number of categories.
For all other families, it is an N x E matrix, where E is equal
to length(probs) + 2.
If summary = FALSE, the output is an S x N matrix,
where S is the number of samples.
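The relation between the S x N matrix of draws and the N x E summary can be sketched in base R. The helper summarise_columns below is hypothetical and the draws are simulated; the column names it produces are not claimed to match those returned by brms. It only shows why E equals length(probs) + 2 and how robust swaps the mean and standard deviation for the median and MAD.

```r
set.seed(4)
S <- 1000  # number of posterior samples
N <- 4     # number of observations

# Simulated S x N matrix standing in for the output with summary = FALSE
draws <- matrix(rnorm(S * N, mean = rep(1:N, each = S)), nrow = S, ncol = N)

# Hypothetical helper (not part of brms): one summary row per observation
summarise_columns <- function(draws, probs = c(0.025, 0.975), robust = FALSE) {
  center <- if (robust) apply(draws, 2, median) else colMeans(draws)
  spread <- if (robust) apply(draws, 2, mad) else apply(draws, 2, sd)
  # rbind keeps an N x length(probs) shape even for a single probability
  q <- do.call(rbind, lapply(seq_len(ncol(draws)), function(j) {
    quantile(draws[, j], probs = probs)
  }))
  cbind(center, spread, q)
}

out <- summarise_columns(draws)
dim(out)  # N rows and length(probs) + 2 columns
```

Passing a longer probs vector adds one column per requested percentile, while the number of rows stays at N.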
Examples
# NOT RUN {
## fit a model
fit <- brm(time | cens(censored) ~ age + sex + (1+age||patient),
           data = kidney, family = "exponential", inits = "0")
## predicted responses
pp <- predict(fit)
head(pp)
## predicted responses excluding the group-level effect of age
pp2 <- predict(fit, re_formula = ~ (1|patient))
head(pp2)
## predicted responses of patient 1 for new data
newdata <- data.frame(sex = factor(c("male", "female")),
                      age = c(20, 50),
                      patient = c(1, 1))
predict(fit, newdata = newdata)
# }
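A few further call variants may help connect the arguments to the output shapes described under Value. The sketch below wraps them in a hypothetical helper, predict_variants, which is not part of brms; it assumes a fitted object such as fit from the example above, that the brms package is loaded, and the argument names shown in the Usage section of this page. The patient value 99 is an assumed new grouping level.

```r
# Hypothetical helper (not part of brms): collect several prediction variants.
# It is only defined here; call it with a fitted brmsfit object and library(brms)
# loaded, e.g. predict_variants(fit).
predict_variants <- function(fit) {
  list(
    # Raw draws instead of summaries: an S x N matrix
    draws = predict(fit, summary = FALSE),
    # Same kind of output via the alias, using only 100 posterior samples
    draws_100 = brms::posterior_predict(fit, nsamples = 100),
    # Median and MAD instead of mean and sd, with 90% intervals
    robust_90 = predict(fit, robust = TRUE, probs = c(0.05, 0.95)),
    # Apply log() to the predicted responses before summarising
    log_scale = predict(fit, transform = log),
    # Prediction for a patient not present in the original data
    new_patient = predict(
      fit,
      newdata = data.frame(sex = factor("male"), age = 40, patient = 99),
      allow_new_levels = TRUE,
      sample_new_levels = "gaussian"
    )
  )
}
```

Because nsamples is ignored when subset is given, only one of the two should be supplied when restricting the posterior samples.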