daltoolbox (version 1.3.727)

feature_selection_info_gain: Feature selection by information gain

Description

Rank and select features using information gain with optional discretization.

Usage

feature_selection_info_gain(
  attribute,
  features = NULL,
  top = NULL,
  cutoff = 0,
  bins = 3
)

Value

Returns an object of class feature_selection_info_gain.

Arguments

attribute

target attribute name

features

optional vector of feature names (default: all columns except attribute)

top

optional number of top-ranked features to keep (default: NULL)

cutoff

minimum information gain to keep a feature (default: 0)

bins

number of quantile bins used to discretize numeric features (default: 3)

Details

Numeric predictors are discretized by quantile bins before computing entropy-based information gain.
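
The computation described above can be illustrated with a minimal base R sketch. It is not the daltoolbox implementation: the helpers entropy, quantile_bins, and info_gain are hypothetical names introduced here, and the choice of log base 2, the handling of bin boundaries, and the assumption of no missing values are ours. Information gain is taken as H(Y) - H(Y | X), where X is the (binned) feature and Y the target.

```r
# Shannon entropy (in bits) of a discrete vector.
entropy <- function(x) {
  p <- table(x) / length(x)
  p <- p[p > 0]                      # ignore empty levels
  -sum(p * log2(p))
}

# Hypothetical helper: discretize a numeric vector into quantile bins.
quantile_bins <- function(x, bins = 3) {
  probs <- seq(0, 1, length.out = bins + 1)
  breaks <- unique(quantile(x, probs = probs, na.rm = TRUE))
  if (length(breaks) < 2) {
    return(factor(rep("all", length(x))))  # constant feature: single bin
  }
  cut(x, breaks = breaks, include.lowest = TRUE)
}

# Hypothetical helper: information gain of feature x about target y.
# Assumes x and y contain no missing values.
info_gain <- function(x, y, bins = 3) {
  if (is.numeric(x)) x <- quantile_bins(x, bins)
  groups <- split(y, x, drop = TRUE)       # target values per feature level
  weights <- lengths(groups) / length(y)   # P(X = level)
  h_cond <- sum(weights * sapply(groups, entropy))  # H(Y | X)
  entropy(y) - h_cond
}

data(iris)
gains <- sapply(iris[, 1:4], info_gain, y = iris$Species)
sort(gains, decreasing = TRUE)
```

Sorting the gains in decreasing order gives a ranking; keeping the first few entries, or those above a threshold, corresponds in spirit to the top and cutoff arguments.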

Examples

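library(daltoolbox)
data(iris)
# Derive a binary target: versicolor vs. everything else
fg <- feature_generation(
  IsVersicolor = ifelse(Species == "versicolor", "versicolor", "not_versicolor")
)
iris_bin <- transform(fg, iris)
iris_bin$IsVersicolor <- factor(iris_bin$IsVersicolor)
# Configure the selector to keep the two highest-ranked features
fs <- feature_selection_info_gain("IsVersicolor", top = 2)
fs <- fit(fs, iris_bin)
fs$selected
# Apply the fitted selection to the data
iris_fs <- transform(fs, iris_bin)
names(iris_fs)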