# naiveWrapper: Naive feature selection method utilising the rFerns shadow importance

## Description

Proof-of-concept ensemble of rFerns models, built to stabilise and improve selection based on shadow importance.
It employs a super-ensemble of `iterations` small rFerns forests, each built on a subspace of `size` attributes. The subspace is selected randomly, but with a higher selection probability for attributes claimed important by previous sub-models.
The final selection is the group of attributes that hold a substantial weight at the end of the procedure.
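
The weighted subspace sampling described above can be illustrated with a minimal base R sketch. It is not the rFerns implementation: the attribute names, the `claimImportant` stand-in for a sub-model, and the additive weight update are all assumptions made for illustration, and the actual re-weighting step (driven by `lambda`) is not reproduced here.

```r
# Hypothetical sketch of weighted subspace sampling; NOT the rFerns code.
set.seed(1)
attrs <- sprintf("A%d", 1:10)                      # hypothetical attribute names
weights <- setNames(rep(1, length(attrs)), attrs)  # start with equal weights
size <- 4                                          # attributes per sub-model
iterations <- 20                                   # number of sub-models

# Stand-in for a sub-model's verdict (assumption): A1 and A2 are claimed
# important whenever they appear in the sampled subspace.
claimImportant <- function(subspace) subspace %in% c("A1", "A2")

for (i in seq_len(iterations)) {
  # Draw a subspace without replacement, favouring high-weight attributes
  subspace <- sample(attrs, size, prob = weights / sum(weights))
  hit <- claimImportant(subspace)
  # Assumed update rule: increase the weight of attributes claimed important
  weights[subspace[hit]] <- weights[subspace[hit]] + 1
}

# Attributes ordered by final weight
print(sort(weights, decreasing = TRUE))
```

In this toy setting only `A1` and `A2` can gain weight, so they are drawn more and more often as iterations proceed, which is the feedback the description refers to.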

## Usage

naiveWrapper(
  x,
  y,
  iterations = 1000,
  depth = 5,
  ferns = 100,
  size = 30,
  lambda = 5,
  threads = 0,
  saveHistory = FALSE
)

## Arguments

x

Data frame containing attributes; must have unique names and contain only numeric, integer or (ordered) factor columns.
Factors must have fewer than 31 levels. No `NA` values are permitted.

y

A decision vector. Must be a factor of the same length as `nrow(x)` for ordinary many-label classification, or a logical matrix with each column corresponding to a class for multi-label classification.

iterations

Number of iterations, i.e., the number of sub-models built.

depth

The depth of the ferns; must be in the range 1 to 16. Note that time and memory requirements scale with `2^depth`.

ferns

Number of ferns to be built in each sub-model. This should be a small number, around 3-5 times `size`.

size

Number of attributes considered by each sub-model.

lambda

Lambda parameter driving the re-weighting step of the method.

threads

Number of parallel threads, copied to the underlying `rFerns` call.

saveHistory

Whether the weight history should be stored.
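
As a rough aid for choosing `depth` and `ferns`, the following base R snippet works through the two rules of thumb stated in the argument descriptions: the `2^depth` scaling and the suggestion of 3-5 times `size` ferns. It relies only on the fact that a fern of depth D has 2^D leaves; the variable names are illustrative and not part of the package.

```r
# Leaves per fern double with every extra level of depth
depth <- c(1, 5, 10, 16)
leaves <- 2^depth
print(data.frame(depth = depth, leaves = leaves))

# Suggested range for `ferns` given the default size = 30
size <- 30
fernsRange <- c(3, 5) * size
print(fernsRange)

# The default ferns = 100 falls inside that range
defaultFerns <- 100
print(defaultFerns >= fernsRange[1] && defaultFerns <= fernsRange[2])
```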

## Value

An object of class `naiveWrapper`, which is a list with the following components:

found

Names of all selected attributes.

weights

Vector of weights indicating the confidence that a certain feature is relevant.

timeTaken

Time of computation.

weightHistory

History of weights over all iterations, present if `saveHistory` was `TRUE`.

params

Copies of algorithm parameters, `iterations`, `depth`, `ferns` and `size`, as a named vector.

## References

Kursa MB (2017). *Efficient all relevant feature selection with random ferns*. In: Kryszkiewicz M., Appice A., Slezak D., Rybinski H., Skowron A., Ras Z. (eds) Foundations of Intelligent Systems. ISMIS 2017. Lecture Notes in Computer Science, vol 10352. Springer, Cham.

## Examples

# NOT RUN {
set.seed(77)
#Fetch Iris data
data(iris)
#Extend with random noise
noisyIris<-cbind(iris[,-5],apply(iris[,-5],2,sample))
names(noisyIris)[5:8]<-sprintf("Nonsense%d",1:4)
#Execute selection
naiveWrapper(noisyIris,iris$Species,iterations=50,ferns=20,size=8)
# }