get_predicted_ci: Confidence intervals around predicted values

Description

Confidence intervals around predicted values

Usage

get_predicted_ci(x, predictions = NULL, ...)
# S3 method for default
get_predicted_ci(
  x,
  predictions = NULL,
  data = NULL,
  se = NULL,
  ci = 0.95,
  ci_type = "confidence",
  ci_method = NULL,
  dispersion_method = "sd",
  vcov = NULL,
  vcov_args = NULL,
  ...
)

Arguments

A statistical model (can also be a data.frame, in which case the second argument has to be a model).

predictions

A vector of predicted values (as obtained by stats::fitted(), stats::predict() or get_predicted()).

...

Other argument to be passed, for instance to get_predicted_ci().

data

An optional data frame in which to look for variables with which to predict. If omitted, the data used to fit the model is used. Visualization matrices can be generated using get_datagrid().

Numeric vector of standard error of predicted values. If NULL, standard errors are calculated based on the variance-covariance matrix.

The interval level (default 0.95, i.e., 95% CI).

ci_type

Can be "prediction" or "confidence". Prediction intervals show the range that likely contains the value of a new observation (in what range it would fall), whereas confidence intervals reflect the uncertainty around the estimated parameters (and gives the range of the link; for instance of the regression line in a linear regressions). Prediction intervals account for both the uncertainty in the model's parameters, plus the random variation of the individual values. Thus, prediction intervals are always wider than confidence intervals. Moreover, prediction intervals will not necessarily become narrower as the sample size increases (as they do not reflect only the quality of the fit). This applies mostly for "simple" linear models (like lm), as for other models (e.g., glm), prediction intervals are somewhat useless (for instance, for a binomial model for which the dependent variable is a vector of 1s and 0s, the prediction interval is... [0, 1]).

ci_method

The method for computing p values and confidence intervals. Possible values depend on model type.

NULL uses the default method, which varies based on the model type.
Most frequentist models: "normal" (default).
Bayesian models: "quantile" (default), "hdi", "eti".
Mixed effects lme4 models: "normal" (default), "satterthwaite", "kenward".

dispersion_method

Bootstrap dispersion and Bayesian posterior summary: "sd" or "mad".

vcov

Variance-covariance matrix used to compute uncertainty estimates (e.g., for robust standard errors). This argument accepts a covariance matrix, a function which returns a covariance matrix, or a string which identifies the function to be used to compute the covariance matrix.

A covariance matrix
A function which returns a covariance matrix (e.g., stats::vcov())
A string which indicates the kind of uncertainty estimates to return.
- Heteroskedasticity-consistent: "vcovHC", "HC", "HC0", "HC1", "HC2", "HC3", "HC4", "HC4m", "HC5". See ?sandwich::vcovHC
- Cluster-robust: "vcovCR", "CR0", "CR1", "CR1p", "CR1S", "CR2", "CR3". See ?clubSandwich::vcovCR()
- Bootstrap: "vcovBS", "xy", "residual", "wild", "mammen", "webb". See ?sandwich::vcovBS
- Other sandwich package functions: "vcovHAC", "vcovPC", "vcovCL", "vcovPL".

vcov_args

List of arguments to be passed to the function identified by the vcov argument. This function is typically supplied by the sandwich or clubSandwich packages. Please refer to their documentation (e.g., ?sandwich::vcovHAC) to see the list of available arguments.

Examples

Run this code

# NOT RUN {
# Confidence Intervals for Model Predictions
# ------------------------------------------

data(mtcars)

# Linear model
# ------------
x <- lm(mpg ~ cyl + hp, data = mtcars)
predictions <- predict(x)
ci_vals <- get_predicted_ci(x, predictions, ci_type = "prediction")
head(ci_vals)
ci_vals <- get_predicted_ci(x, predictions, ci_type = "confidence")
head(ci_vals)
ci_vals <- get_predicted_ci(x, predictions, ci = c(0.8, 0.9, 0.95))
head(ci_vals)

# Bootstrapped
# ------------
if (require("boot")) {
  predictions <- get_predicted(x, iterations = 500)
  get_predicted_ci(x, predictions)
}

if (require("datawizard") && require("bayestestR")) {
  ci_vals <- get_predicted_ci(x, predictions, ci = c(0.80, 0.95))
  head(ci_vals)
  datawizard::reshape_ci(ci_vals)

  ci_vals <- get_predicted_ci(x,
    predictions,
    dispersion_method = "MAD",
    ci_method = "HDI"
  )
  head(ci_vals)
}


# Logistic model
# --------------
x <- glm(vs ~ wt, data = mtcars, family = "binomial")
predictions <- predict(x, type = "link")
ci_vals <- get_predicted_ci(x, predictions, ci_type = "prediction")
head(ci_vals)
ci_vals <- get_predicted_ci(x, predictions, ci_type = "confidence")
head(ci_vals)
# }

Run the code above in your browser using DataLab