enspls.fs: Ensemble Sparse Partial Least Squares for Measuring Feature Importance

Description

Measuring feature importance with ensemble sparse partial least squares.

Usage

enspls.fs(x, y, maxcomp = 5L, cvfolds = 5L, alpha = seq(0.2, 0.8, 0.2), reptimes = 500L, method = c("mc", "boot"), ratio = 0.8, parallel = 1L)

Arguments

x

Predictor matrix.

y

Response vector.

maxcomp

Maximum number of components included in each model. Default is 5.

cvfolds

Number of cross-validation folds used in each model for automatic parameter selection. Default is 5.

alpha

Grid of values for the parameter controlling the sparsity of the model. Default is seq(0.2, 0.8, 0.2).

reptimes

Number of models to build with Monte-Carlo resampling or bootstrapping. Default is 500.

method

Resampling method. "mc" (Monte-Carlo resampling) or "boot" (bootstrapping). Default is "mc".

ratio

Sampling ratio used when method = "mc". Default is 0.8.

parallel

Integer. Number of CPU cores to use. Default is 1 (not parallelized).
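
The difference between the two values of method can be illustrated with a minimal base R sketch. The helper draw_indices below is hypothetical and is not part of the package; it only shows one common way such training-row indices are drawn, assuming that "mc" takes a fraction ratio of the rows without replacement and "boot" takes as many rows as the data has, with replacement. The package's internal implementation may differ in details such as rounding.

```r
# Hypothetical helper (not part of the package): draw training-row indices
# for one resampled model.
draw_indices <- function(n, method = c("mc", "boot"), ratio = 0.8) {
  method <- match.arg(method)
  if (method == "mc") {
    # Monte-Carlo resampling: a random subset of rows, no repeats.
    # Rounding n * ratio is an assumption made for this sketch.
    sample(n, size = round(n * ratio), replace = FALSE)
  } else {
    # Bootstrapping: n rows drawn with replacement, so repeats can occur.
    sample(n, size = n, replace = TRUE)
  }
}

set.seed(42)
idx_mc   <- draw_indices(100, method = "mc", ratio = 0.8)
idx_boot <- draw_indices(100, method = "boot")

length(idx_mc)            # number of rows in the Monte-Carlo subset
length(unique(idx_boot))  # distinct rows that appear in the bootstrap sample
```

Under these assumptions, ratio only matters for "mc", which matches the description of the ratio argument above.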

Value

A list containing two components:

variable.importance - a vector of variable importance
coefficient.matrix - original coefficient matrix
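
As a rough illustration of working with the returned list, the sketch below ranks variables by importance using base R only. The object fs_toy is a made-up stand-in with invented values, used so the snippet runs without fitting any model; it assumes that variable.importance is a numeric vector with one entry per predictor, and the names v1 to v4 are hypothetical.

```r
# Toy stand-in for the variable.importance component of the result.
# Values and names are invented for illustration only.
fs_toy <- list(
  variable.importance = c(v1 = 0.4, v2 = 2.1, v3 = 1.3, v4 = 0.9)
)

# Sort from most to least important and keep the top two names.
ranked <- sort(fs_toy$variable.importance, decreasing = TRUE)
top2 <- names(ranked)[1:2]
top2
```

With a real fit, the same two lines could be applied to the variable.importance component of the object returned by enspls.fs, provided the vector carries names; if it does not, order() on the vector gives the column positions instead.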

Examples

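# Load the package and the example data set
library("enpls")
data("logd1k")
x = logd1k$x
y = logd1k$y

# Build 5 models with at most 2 components each
# (small values keep the example fast)
set.seed(42)
fs = enspls.fs(x, y, reptimes = 5, maxcomp = 2)

# Print and plot the result, limited to 10 variables
print(fs, nvar = 10)
plot(fs, nvar = 10)
plot(fs, type = 'boxplot', limits = c(0.05, 0.95), nvar = 10)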
