Carries out the selection step of the fuzzyforest algorithm. Returns a data.frame with variable importances and the top-rated features.
select_RF(X, y, drop_fraction, number_selected, mtry_factor, ntree_factor,
min_ntree, num_processors, nodesize)
X: A data.frame. Each column corresponds to a feature vector. May include additional covariates that are not part of the original modules.
y: Response vector.
drop_fraction: A number between 0 and 1. Fraction of features dropped at each iteration.
number_selected: Number of features selected by fuzzyforest.
mtry_factor: In the case of regression, mtry is set to ceiling(sqrt(p)*mtry_factor). In the case of classification, mtry is set to ceiling((p/3)*mtry_factor). If either of these numbers is greater than p, mtry is set to p. Here p is the number of features.
ntree_factor: A number greater than 1. For each random forest, ntree is set to max(min_ntree, ntree_factor*p), that is, ntree_factor times the number of features, but never fewer than min_ntree trees.
min_ntree: Minimum number of trees grown in each random forest.
num_processors: Number of processors used to fit random forests.
nodesize: Minimum node size.
A data.frame with the top-ranked features and their variable importances.
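
The mtry and ntree rules described in the arguments can be illustrated with a small sketch. The helpers compute_mtry and compute_ntree below are hypothetical names introduced only for this illustration; they are not part of fuzzyforest, and they simply restate the formulas as written in this page, with p taken to be the number of features.

```r
# Hypothetical helpers (not part of fuzzyforest) that restate the
# mtry and ntree rules exactly as documented on this page.

# mtry: ceiling(sqrt(p) * mtry_factor) for regression,
#       ceiling((p / 3) * mtry_factor) for classification,
#       capped at p in both cases.
compute_mtry <- function(p, mtry_factor, classification) {
  raw <- if (classification) {
    ceiling((p / 3) * mtry_factor)
  } else {
    ceiling(sqrt(p) * mtry_factor)
  }
  min(raw, p)
}

# ntree: ntree_factor times the number of features,
#        but never fewer than min_ntree.
compute_ntree <- function(p, ntree_factor, min_ntree) {
  max(min_ntree, ntree_factor * p)
}

compute_mtry(100, 1, classification = FALSE)  # 10
compute_mtry(100, 5, classification = TRUE)   # 100 (capped at p)
compute_ntree(100, 10, 500)                   # 1000
compute_ntree(20, 10, 500)                    # 500 (min_ntree applies)
```

With few features, min_ntree dominates the tree count; with many features, ntree_factor*p takes over. Large values of mtry_factor are cut off at p, so every feature is a split candidate at each node.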