Calculate variable importance (VIMP) for a single variable or group of variables for training or test data.
# S3 method for rfsrc
vimp(object, xvar.names, m.target = NULL,
importance = c("permute", "random", "anti"), block.size = 10,
joint = FALSE, seed = NULL, do.trace = FALSE, ...)
object: An object of class (rfsrc, grow) or (rfsrc, forest). Requires forest=TRUE in the original rfsrc call.
xvar.names: Names of the x-variables to be used. If not specified, all variables are used.
m.target: Character value for multivariate families specifying the target outcome to be used. If left unspecified, the algorithm will choose a default target.
importance: Type of VIMP. One of "permute", "random", or "anti".
block.size: Specifies the number of trees in a block when calculating VIMP.
joint: Logical. If TRUE, joint VIMP is calculated for the variables as a group; if FALSE, VIMP is calculated for each variable individually.
seed: Negative integer specifying the seed for the random number generator.
do.trace: Number of seconds between updates to the user on approximate time to completion.
...: Further arguments passed to or from other methods.
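As a rough illustration of how these arguments combine, the call below restricts VIMP to two variables, uses one tree per block, and fixes the seed. It is a sketch only: the argument values (ntree = 100, block.size = 1, seed = -1) are arbitrary choices made for illustration, and the code is guarded so that it is skipped when randomForestSRC is not installed.

```r
# Sketch only: argument values are arbitrary illustrations, not recommendations.
# Guarded so the block is skipped when randomForestSRC is not installed.
if (requireNamespace("randomForestSRC", quietly = TRUE)) {
  library(randomForestSRC)

  # forest = TRUE keeps the forest in the object so vimp() can reuse it
  iris.obj <- rfsrc(Species ~ ., data = iris, ntree = 100, forest = TRUE)

  v <- vimp(iris.obj,
            xvar.names = c("Petal.Length", "Petal.Width"),  # subset of x-variables
            importance = "permute",                         # type of VIMP
            block.size = 1,                                 # one tree per block
            seed = -1)                                      # negative integer seed

  print(v$importance)
}
```

Only the variables named in xvar.names appear in the returned importance values.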
An object of class (rfsrc, predict) containing importance values.
Using a previously grown forest, calculate the VIMP for variables xvar.names. By default, VIMP is calculated for the original data, but the user can specify new test data for the VIMP calculation using newdata. See rfsrc for more details about how VIMP is calculated.
Joint VIMP is requested using joint=TRUE and equals the importance of a group of variables when all variables in the group are perturbed simultaneously.
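The difference between individual and joint perturbation can be sketched in base R without the package. The example below is only a conceptual illustration and is not the randomForestSRC algorithm: lm() stands in for the forest, in-sample mean squared error stands in for out-of-bag error, and perm_vimp is a hypothetical helper written for this sketch. For the joint case, the rows of all columns in the group are permuted together.

```r
# Conceptual sketch only: NOT the randomForestSRC implementation.
# lm() stands in for the forest; in-sample MSE stands in for out-of-bag error.
set.seed(1)
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Mean squared prediction error of a model on a data set
mse <- function(model, data) {
  mean((data$mpg - predict(model, newdata = data))^2)
}

# Hypothetical helper: permute the rows of the named columns together and
# return the average increase in error relative to the unperturbed data.
perm_vimp <- function(model, data, vars, nrep = 50) {
  base <- mse(model, data)
  inc <- replicate(nrep, {
    perturbed <- data
    idx <- sample(nrow(data))
    perturbed[vars] <- data[idx, vars, drop = FALSE]
    mse(model, perturbed) - base
  })
  mean(inc)
}

# Individual importance: one variable perturbed at a time
individual <- sapply(c("wt", "hp", "qsec"),
                     function(v) perm_vimp(fit, mtcars, v))

# Joint importance: wt and hp perturbed simultaneously as a group
joint_wt_hp <- perm_vimp(fit, mtcars, c("wt", "hp"))

print(individual)
print(joint_wt_hp)
```

The joint value is a single number for the whole group, whereas the individual values give one number per variable.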
Use option csv=TRUE to request case-specific VIMP. Applies to all families except survival families. See example below.
Ishwaran H. (2007). Variable importance in binary regression trees and forests. Electronic J. Statist., 1:519-537.
# NOT RUN {
## ------------------------------------------------------------
## classification example
## showcase different vimp
## ------------------------------------------------------------
library(randomForestSRC)
iris.obj <- rfsrc(Species ~ ., data = iris)
# Permutation vimp (default)
print(vimp(iris.obj)$importance)
# VIMP using brier prediction error
print(vimp(iris.obj, perf.type = "brier")$importance)
# Random daughter vimp
print(vimp(iris.obj, importance = "random")$importance)
# Joint permutation vimp
print(vimp(iris.obj, joint = TRUE)$importance)
# Paired vimp
print(vimp(iris.obj, c("Petal.Length", "Petal.Width"), joint = TRUE)$importance)
print(vimp(iris.obj, c("Sepal.Length", "Petal.Width"), joint = TRUE)$importance)
## ------------------------------------------------------------
## imbalanced classification example
## see the imbalanced function for more details
## ------------------------------------------------------------
data(breast, package = "randomForestSRC")
breast <- na.omit(breast)
f <- as.formula(status ~ .)
o <- rfsrc(f, breast, ntree = 2000)
## Breiman importance
print(100 * vimp(o)$importance)
## G-mean importance
print(100 * vimp(o, perf.type = "g.mean")$importance[, 1])
## ------------------------------------------------------------
## regression example
## ------------------------------------------------------------
airq.obj <- rfsrc(Ozone ~ ., airquality)
print(vimp(airq.obj))
## ------------------------------------------------------------
## regression example where vimp is calculated on test data
## ------------------------------------------------------------
set.seed(100080)
train <- sample(seq_len(nrow(airquality)), size = 80)
airq.obj <- rfsrc(Ozone ~ ., airquality[train, ])
# training data vimp
print(airq.obj$importance)
print(vimp(airq.obj)$importance)
# test data vimp
print(vimp(airq.obj, newdata = airquality[-train, ])$importance)
## ------------------------------------------------------------
## case-specific vimp
## returns VIMP for each case
## ------------------------------------------------------------
o <- rfsrc(mpg ~ ., mtcars)
v <- vimp(o, csv = TRUE)
csvimp <- get.mv.csvimp(v, standardize = TRUE)
print(csvimp)
## ------------------------------------------------------------
## case-specific joint vimp
## returns joint VIMP for each case
## ------------------------------------------------------------
o <- rfsrc(mpg ~ ., mtcars)
v <- vimp(o, joint = TRUE, csv = TRUE)
csvimp <- get.mv.csvimp(v, standardize = TRUE)
print(csvimp)
## ------------------------------------------------------------
## case-specific joint vimp for multivariate regression
## returns joint VIMP for each case, for each outcome
## ------------------------------------------------------------
o <- rfsrc(Multivar(mpg, cyl) ~ ., data = mtcars)
v <- vimp(o, joint = TRUE, csv = TRUE)
csvimp <- get.mv.csvimp(v, standardize = TRUE)
print(csvimp)
# }