vimp (version 2.3.5)

average_vim: Average multiple independent importance estimates

Description

Average the output from multiple calls to vimp_regression, for different independent groups, into a single estimate with a corresponding standard error and confidence interval.

Usage

average_vim(..., weights = rep(1/length(list(...)), length(list(...))))

Value

An object of class vim containing the (weighted) average of the individual importance estimates, along with the corresponding standard error and confidence interval. The object is a list containing:

s

- a list of the column(s) to calculate variable importance for

SL.library

- a list of the libraries of learners passed to SuperLearner

full_fit

- a list of the fitted values of the chosen method fit to the full data

red_fit

- a list of the fitted values of the chosen method fit to the reduced data

est

- a vector with the corrected estimates

naive

- a vector with the naive estimates

update

- a list with the influence curve-based updates

mat

- a matrix with the estimated variable importance, the standard error, and the \((1-\alpha) \times 100\)% confidence interval

full_mod

- a list of the objects returned by the estimation procedure for the full data regression (if applicable)

red_mod

- a list of the objects returned by the estimation procedure for the reduced data regression (if applicable)

alpha

- the level used for confidence interval calculation

y

- a list of the outcomes

Arguments

...

an arbitrary number of vim objects.

weights

the weights used to average the vims together; they must sum to 1. Defaults to 1/(number of vims) for each vim, corresponding to the arithmetic mean.
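To make the role of weights concrete, here is a minimal base R sketch of a weighted average of independent estimates. It is an illustration only, not the internal code of average_vim: the point estimates and standard errors are made-up numbers, and the standard error and Wald-type interval formulas assume the estimates are independent and approximately normal.

```r
# Hypothetical point estimates and standard errors from two independent groups
# (illustrative numbers only, not output from vimp)
est <- c(0.20, 0.30)
se <- c(0.05, 0.04)

# Weights must sum to 1; equal weights give the arithmetic mean
weights <- rep(1 / length(est), length(est))
stopifnot(isTRUE(all.equal(sum(weights), 1)))

# Weighted average of the estimates
avg_est <- sum(weights * est)

# Standard error under independence (assumption):
# Var(sum(w_i * est_i)) = sum(w_i^2 * se_i^2)
avg_se <- sqrt(sum(weights^2 * se^2))

# Wald-type (1 - alpha) x 100% confidence interval (assumes approximate normality)
alpha <- 0.05
ci <- avg_est + c(-1, 1) * stats::qnorm(1 - alpha / 2) * avg_se
```

Unequal weights, for example proportional to each group's sample size, fit the same pattern as long as they sum to 1.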

Examples

# load the package
library("vimp")

# generate the data
p <- 2
n <- 100
x <- data.frame(replicate(p, stats::runif(n, -5, 5)))

# compute the conditional mean of the outcome as a smooth function of the x's
smooth <- (x[,1]/5)^2*(x[,1]+7)/5 + (x[,2]/3)^2

# generate Y ~ Normal (smooth, 1)
y <- smooth + stats::rnorm(n, 0, 1)

# set up a library for SuperLearner; note simple library for speed
library("SuperLearner")
learners <- c("SL.glm", "SL.mean")

# get estimates on independent splits of the data
samp <- sample(1:n, n/2, replace = FALSE)

# using Super Learner (with a small number of folds, for illustration only)
est_2 <- vimp_regression(Y = y[samp], X = x[samp, ], indx = 2, V = 2,
           run_regression = TRUE, alpha = 0.05,
           SL.library = learners, cvControl = list(V = 2))

est_1 <- vimp_regression(Y = y[-samp], X = x[-samp, ], indx = 2, V = 2,
           run_regression = TRUE, alpha = 0.05,
           SL.library = learners, cvControl = list(V = 2))

ests <- average_vim(est_1, est_2, weights = c(1/2, 1/2))
