rwa: Create a Relative Weights Analysis (RWA)

Description

This function creates a Relative Weights Analysis (RWA) and returns a list of outputs. RWA provides a heuristic method for estimating the relative weight of predictor variables in multiple regression. It fits a multiple regression on a set of transformed predictors that are orthogonal to each other but maximally related to the original set of predictors. rwa() is optimised for dplyr pipes and can show positive / negative signs for weights.

Usage

rwa(
  df,
  outcome,
  predictors,
  applysigns = FALSE,
  sort = TRUE,
  bootstrap = FALSE,
  n_bootstrap = 1000,
  conf_level = 0.95,
  focal = NULL,
  comprehensive = FALSE,
  include_rescaled_ci = FALSE
)

Value

rwa() returns a list of outputs, as follows:

predictors: character vector of names of the predictor variables used.
rsquare: the R-squared value of the regression model.
result: the final output of the importance metrics (sorted by Rescaled.RelWeight in descending order by default).
- The Rescaled.RelWeight column sums up to 100.
- The Sign column indicates whether a predictor is positively or negatively correlated with the outcome.
- When bootstrap = TRUE, includes confidence interval columns for raw weights.
- Rescaled weight CIs are available via include_rescaled_ci = TRUE but not recommended for inference.
n: the number of observations used in the analysis.
bootstrap: bootstrap results (only present when bootstrap = TRUE), containing:
- ci_results: confidence intervals for weights
- boot_object: raw bootstrap object for advanced analysis
- n_bootstrap: number of bootstrap samples used
lambda:
RXX: Correlation matrix of all the predictor variables against each other.
RXY: Correlation values of the predictor variables against the outcome variable.
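
As a minimal sketch of how these list elements might be inspected, assuming the rwa package is installed. The built-in mtcars data and the object name fit are used here only for illustration:

```r
library(rwa)

# Fit a relative weights analysis on a small built-in dataset
fit <- rwa(mtcars, "mpg", c("cyl", "disp", "hp"))

fit$predictors  # names of the predictors used
fit$rsquare     # R-squared of the regression model
fit$n           # number of observations used
fit$result      # table of importance metrics

# The rescaled weights are documented to sum to 100
sum(fit$result$Rescaled.RelWeight)
```

The bootstrap element is only present when bootstrap = TRUE, so code that reads fit$bootstrap should check for it first.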

Arguments

df: Data frame or tibble to be passed through.
outcome: Outcome variable, to be specified as a string or bare input. Must be a numeric variable.
predictors: Predictor variable(s), to be specified as a vector of string(s) or bare input(s). All variables must be numeric.
applysigns: Logical value specifying whether to show an estimate that applies the sign. Defaults to FALSE.
sort: Logical value specifying whether to sort results by rescaled relative weights in descending order. Defaults to TRUE.
bootstrap: Logical value specifying whether to calculate bootstrap confidence intervals. Defaults to FALSE.
n_bootstrap: Number of bootstrap samples to use when bootstrap = TRUE. Defaults to 1000.
conf_level: Confidence level for bootstrap intervals. Defaults to 0.95.
focal: Focal variable for bootstrap comparisons (optional). Defaults to NULL.
comprehensive: Logical value specifying whether to run a comprehensive bootstrap analysis, including random variable and focal comparisons. Defaults to FALSE.
include_rescaled_ci: Logical value specifying whether to include confidence intervals for rescaled weights. Defaults to FALSE due to compositional data constraints. Use with caution.
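
Because df is the first argument, rwa() can sit at the end of a pipe. The following is a small sketch, assuming the rwa package is installed and R 4.1 or later for the native pipe. The mtcars data and the object name unsorted are illustrative choices only:

```r
library(rwa)

# Pipe a data frame into rwa(), keep the original predictor order,
# and request signed estimates
unsorted <- mtcars |>
  rwa(
    outcome = "mpg",
    predictors = c("cyl", "disp", "hp"),
    sort = FALSE,
    applysigns = TRUE
  )

unsorted$result
```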

Details

rwa() produces raw relative weight values (epsilons) as well as rescaled weights (scaled as a percentage of predictable variance) for every predictor in the model. Signs are added to the weights when the applysigns argument is set to TRUE. See https://www.scotttonidandel.com/rwa-web for the original implementation that inspired this package.
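
The calculation behind the raw and rescaled weights can be sketched in base R. This is an illustrative reconstruction of the commonly described relative weights procedure, not the package's source code. The function name relative_weights_sketch and its output column names are hypothetical:

```r
# Hypothetical helper: a base-R sketch of the relative weights calculation.
relative_weights_sketch <- function(data, outcome, predictors) {
  # Use complete cases only and work from the correlation matrix
  complete <- stats::na.omit(data[, c(outcome, predictors), drop = FALSE])
  cor_all <- stats::cor(complete)
  rxx <- cor_all[predictors, predictors, drop = FALSE]  # predictor intercorrelations
  rxy <- cor_all[predictors, outcome]                   # predictor-outcome correlations

  # Symmetric square root of rxx: correlations between the original
  # predictors and their orthogonal counterparts
  eig <- eigen(rxx, symmetric = TRUE)
  lambda <- eig$vectors %*%
    diag(sqrt(eig$values), nrow = length(predictors)) %*%
    t(eig$vectors)

  # Standardised coefficients from regressing the outcome on the
  # orthogonal variables
  beta <- solve(lambda, rxy)

  # Raw weights (epsilons) sum to R-squared; rescaled weights sum to 100
  raw <- as.vector((lambda^2) %*% (beta^2))
  data.frame(
    predictor = predictors,
    raw_weight = raw,
    rescaled_weight = 100 * raw / sum(raw)
  )
}

sketch <- relative_weights_sketch(mtcars, "mpg", c("cyl", "disp", "hp"))
sketch
```

In this sketch the raw weights add up to the R-squared of the ordinary regression of the outcome on the original predictors, which is why the rescaled weights can be read as percentages of predictable variance.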

Examples

library(rwa)
library(ggplot2)  # provides the diamonds dataset

# Basic RWA (results sorted by default)
rwa(diamonds, "price", c("depth", "carat"))

# RWA without sorting (preserves original predictor order)
rwa(diamonds, "price", c("depth", "carat"), sort = FALSE)

# For faster examples, use a subset of data for bootstrap
set.seed(123)  # make the random subset reproducible
diamonds_small <- diamonds[sample(nrow(diamonds), 1000), ]

# RWA with bootstrap confidence intervals (raw weights only)
rwa(diamonds_small, "price", c("depth", "carat"), bootstrap = TRUE, n_bootstrap = 100)

# Include rescaled weight CIs (use with caution for inference)
rwa(diamonds_small, "price", c("depth", "carat"), bootstrap = TRUE,
    include_rescaled_ci = TRUE, n_bootstrap = 100)

# Comprehensive bootstrap analysis with focal variable
result <- rwa(diamonds_small, "price", c("depth", "carat", "table"),
              bootstrap = TRUE, comprehensive = TRUE, focal = "carat",
              n_bootstrap = 100)
# View confidence intervals
result$bootstrap$ci_results
