KnowGRRF (version 1.0)

select.stable.aic: Select a set of stable features based on AIC after an initial selection by GRRF

Description

Performs feature selection with GRRF, followed by stepwise model selection by AIC. The procedure is repeated multiple times, and the features that are consistently selected according to AIC are returned as a stable set.
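The repeat-and-tally idea can be sketched in base R. This is only an illustration under stated assumptions: `select_once` is a hypothetical stand-in for one round of GRRF plus stepwise AIC (it uses a simple correlation filter on a bootstrap resample instead), and the majority threshold `min_frac` is an assumption, not necessarily the rule that `select.stable.aic` applies.

```r
# Hypothetical stand-in for one round of GRRF + stepwise AIC selection.
# It keeps features whose absolute correlation with y exceeds a cutoff
# on a bootstrap resample; the package itself uses RRF and AIC instead.
select_once <- function(X, y, cutoff = 0.3) {
  idx <- sample(nrow(X), replace = TRUE)
  cors <- abs(cor(X[idx, , drop = FALSE], y[idx]))
  colnames(X)[cors[, 1] > cutoff]
}

# Repeat the selection `total` times and keep features chosen in more than
# `min_frac` of the runs (the threshold is an assumption for illustration).
stable_features <- function(X, y, total = 10, min_frac = 0.5) {
  runs <- lapply(seq_len(total), function(i) select_once(X, y))
  counts <- table(factor(unlist(runs), levels = colnames(X)))
  names(counts)[counts / total > min_frac]
}

set.seed(1)
X <- data.frame(matrix(rnorm(200 * 5), nrow = 200))
y <- 2 * X$X1 + rnorm(200, sd = 0.5)  # only X1 carries signal
stable <- stable_features(X, y)
print(stable)
```

Tallying selections across repeated runs filters out features that enter the model only by chance in a single run.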

Usage

select.stable.aic(X.train, Y.train, coefReg, total=10)

Arguments

X.train

a data frame or matrix containing predictors for the training set.

Y.train

response for the training set. If a factor, classification is assumed, otherwise regression is assumed. If omitted, will run in unsupervised mode.

coefReg

regularization coefficient chosen for RRF, ranges between 0 and 1.

total

the number of times to repeat the process.
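As a rough illustration of how a per-feature `coefReg` vector can be built so that it stays between 0 and 1, the sketch below rescales importance scores with the same formula used in the Examples section. The importance values are made up for illustration and do not come from a fitted model.

```r
# Illustrative importance scores (made-up numbers, not from a fitted model).
imp <- c(X1 = 12.0, X2 = 3.0, X3 = 0.0)

# Base value of 0.5 plus a share proportional to normalized importance,
# mirroring the formula in the Examples section.
coefReg <- 0.5 + 0.5 * imp / max(imp)
print(coefReg)

# With non-negative importance scores, every coefficient lies in [0.5, 1].
stopifnot(all(coefReg >= 0.5), all(coefReg <= 1))
```

With this scaling, the most important feature receives a coefficient of 1 and a feature with zero importance receives 0.5.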

Value

a stable set of features selected by GRRF followed by stepwise AIC selection.

References

Guan, X., & Liu, L. (2018). Know-GRRF: Domain-Knowledge Informed Biomarker Discovery with Random Forests.

Examples

# NOT RUN {
##---- Example: classification  ----
library(randomForest)
library(KnowGRRF) ##provides select.stable.aic

set.seed(1)
X.train <- data.frame(matrix(rnorm(100*100), nrow=100))
b <- seq(0.1, 2.2, 0.2)
##y depends linearly on variables X6 to X10
y.train <- b[7]*X.train$X6+b[8]*X.train$X7+b[9]*X.train$X8+b[10]*X.train$X9+b[11]*X.train$X10
y.train <- ifelse(y.train>0, 1, 0) ##binarize for classification

##derive regularization coefficients from random forest importance scores
imp <- randomForest(X.train, as.factor(y.train))$importance
coefReg <- 0.5+0.5*imp/max(imp)

##select a stable set of features chosen by GRRF followed by stepwise AIC
select.stable.aic(X.train, as.factor(y.train), coefReg)

# }
