collinear (version 1.1.1)

toy: One response and four predictors with varying levels of multicollinearity

Description

Data frame with a known relationship between the response and the predictors, useful for illustrating multicollinearity concepts. Created from vi using the code shown in the Examples section.

Usage

data(toy)

Format

Data frame with 2000 rows and 5 columns.

Details

Columns:

  • y: response variable generated from a * 0.75 + b * 0.25 + noise.

  • a: most important predictor of y, uncorrelated with b.

  • b: second most important predictor of y, uncorrelated with a.

  • c: generated from a + noise.

  • d: generated from (a + b)/2 + noise.
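As a rough check on these definitions, the sketch below simulates stand-ins for a and b with rnorm(), builds y, c, and d with the formulas listed above, and fits a linear model on the two uncorrelated predictors. Drawing a and b from a standard normal is an assumption made so the snippet runs without the package (the real columns come from vi), and the names n, sim, fit, and coefs are hypothetical. The simulated values will not match toy.

```r
# Simulated stand-in for toy: a and b are independent standard normals here,
# which is an assumption (the packaged data derives them from vi).
set.seed(1)
n <- 2000
sim <- data.frame(a = rnorm(n), b = rnorm(n))

# Same formulas as in the column descriptions, with uniform noise in [-0.5, 0.5]
sim$y <- sim$a * 0.75 + sim$b * 0.25 + runif(n, min = -0.5, max = 0.5)
sim$c <- sim$a + runif(n, min = -0.5, max = 0.5)
sim$d <- (sim$a + sim$b) / 2 + runif(n, min = -0.5, max = 0.5)

# Using only the uncorrelated predictors, the fitted slopes should land
# close to the generating weights 0.75 (a) and 0.25 (b)
fit <- lm(y ~ a + b, data = sim)
coefs <- coef(fit)
print(round(coefs, 2))
```

Adding c or d to the model formula is a quick way to see how redundant predictors change the estimated coefficients.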

These are the variance inflation factors (VIF) of the predictors in toy:

  variable     vif
  b          4.062
  d          6.804
  c         13.263
  a         16.161
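For readers who want to see where such numbers come from, here is a minimal base R sketch of one standard way to compute variance inflation factors: take the diagonal of the inverse of the predictors' correlation matrix, which equals 1 / (1 - R^2) from regressing each predictor on the others. This is not necessarily how the package computes them. The data are simulated with rnorm() as stand-ins for a and b (an assumption), and the names n, predictors, and vif are hypothetical, so the results will differ from the values reported for toy.

```r
# Simulated predictors following the column descriptions; a and b are
# independent standard normals here (an assumption, not the packaged data).
set.seed(1)
n <- 2000
a <- rnorm(n)
b <- rnorm(n)
predictors <- data.frame(
  a = a,
  b = b,
  c = a + runif(n, min = -0.5, max = 0.5),
  d = (a + b) / 2 + runif(n, min = -0.5, max = 0.5)
)

# VIF of each predictor: diagonal of the inverse correlation matrix,
# equivalent to 1 / (1 - R^2) when regressing it on the other predictors
vif <- diag(solve(cor(predictors)))
names(vif) <- names(predictors)
print(round(sort(vif), 3))
```

Because c is built from a, and d from both a and b, a is the predictor best explained by the others and b the least, which is the ordering the simulated VIFs should show.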

Examples

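library(collinear)
library(dplyr)

# vi is the source data frame shipped with collinear
data(vi)

# fixed seed so the row sample and the noise are reproducible
set.seed(1)

toy <- vi |>
  # random subset of 2000 rows
  dplyr::slice_sample(n = 2000) |>
  # keep two predictors and rename them a and b
  dplyr::transmute(
    a = soil_clay,
    b = humidity_range
  ) |>
  # center and scale; scale() returns a matrix, so convert back
  scale() |>
  as.data.frame() |>
  # response and collinear predictors, each with uniform noise in [-0.5, 0.5]
  dplyr::mutate(
    y = a * 0.75 + b * 0.25 + runif(n = dplyr::n(), min = -0.5, max = 0.5),
    c = a + runif(n = dplyr::n(), min = -0.5, max = 0.5),
    d = (a + b) / 2 + runif(n = dplyr::n(), min = -0.5, max = 0.5)
  ) |>
  # put the response first
  dplyr::transmute(y, a, b, c, d)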
