
biogram (version 1.1)

test_features: Permutation test for feature selection

Description

Performs feature selection on positioned n-gram data using Fisher's permutation test.

Usage

test_features(target, features, criterion = "ig", adjust = "BH",
  threshold = 1, quick = TRUE, times = 1e+05)

Arguments

target
integer vector with target information (e.g. class labels).
features
integer matrix of features with number of rows equal to the length of the target vector.
criterion
criterion used in the permutation test. See criterions for the list of available criteria.
adjust
name of the p-value adjustment method. See p.adjust for the list of possible values. If NULL, no adjustment is done.
threshold
integer. Features that occur fewer than threshold times or more than nrow(features) - threshold times are discarded from the permutation test.
quick
logical; if TRUE, the Quick Permutation Test (QuiPT) is used.
times
number of times the procedure should be repeated. Ignored if quick is TRUE.
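
The threshold and adjust arguments can be illustrated in base R. The sketch below is not biogram's internal code: the matrix `features`, the vector `raw_p` and the variable names are made up for illustration, and the filter simply restates the rule given for threshold above.

```r
# Toy binary feature matrix (hypothetical data): 6 observations, 3 features
features <- cbind(
  rare     = c(0, 0, 0, 0, 0, 0),
  balanced = c(1, 0, 1, 0, 1, 0),
  common   = c(1, 1, 1, 1, 1, 1)
)
threshold <- 1

# Keep features whose occurrence count lies between
# threshold and nrow(features) - threshold (inclusive)
counts <- colSums(features)
keep <- counts >= threshold & counts <= nrow(features) - threshold
filtered <- features[, keep, drop = FALSE]

# Benjamini-Hochberg adjustment of hypothetical raw p-values,
# as selected by adjust = "BH"
raw_p <- c(0.001, 0.02, 0.04, 0.30)
adjusted_p <- p.adjust(raw_p, method = "BH")
```

With threshold = 1, a feature that never occurs and a feature that occurs in every row are both dropped, since neither can discriminate between classes.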

Value

an object of class feature_test.

Details

Currently implemented criteria:
  • "ig" - information gain
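
To make the criterion concrete, here is a minimal base R sketch of information gain combined with a plain Monte Carlo permutation test. It is an illustration under stated assumptions, not the package's implementation: the helpers `entropy`, `info_gain` and `perm_test` are hypothetical names, the p-value estimator (with +1 in numerator and denominator) is one common choice, and QuiPT itself is not reproduced here.

```r
# Shannon entropy (in bits) of a discrete vector; hypothetical helper
entropy <- function(x) {
  p <- table(x) / length(x)
  -sum(p * log2(p))
}

# Information gain: H(target) - H(target | feature)
info_gain <- function(target, feature) {
  groups <- split(target, feature)
  conditional <- sapply(groups, function(g) {
    length(g) / length(target) * entropy(g)
  })
  entropy(target) - sum(conditional)
}

# Monte Carlo permutation test: shuffle the target `times` times and
# count how often the permuted statistic reaches the observed one
perm_test <- function(target, feature, times = 1000) {
  observed <- info_gain(target, feature)
  permuted <- replicate(times, info_gain(sample(target), feature))
  (sum(permuted >= observed) + 1) / (times + 1)
}

# Toy data (assumed for illustration): the feature occurs 10 times,
# always together with target == 1
set.seed(1)
target  <- rep(c(1, 0), times = c(400, 600))
feature <- c(rep(c(1, 0), times = c(10, 390)), rep(0, 600))
p_value <- perm_test(target, feature, times = 1000)
```

The loop over `times` permutations is the part that the times argument controls when quick = FALSE.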

References

Radivojac P, Obradovic Z, Dunker AK, Vucetic S. Feature selection filters based on the permutation test. In: Machine Learning: ECML 2004, 15th European Conference on Machine Learning. Springer, 2004.

See Also

See criterion_distribution for insight on QuiPT.

summary.feature_test - summary of results.

cut.feature_test - aggregates test results into groups based on the features' p-values.

Examples

library(biogram)

# significant feature
tar_feat1 <- create_feature_target(10, 390, 0, 600)
# significant feature
tar_feat2 <- create_feature_target(9, 391, 1, 599)
# insignificant feature
tar_feat3 <- create_feature_target(198, 202, 300, 300)
# column 1 is used as the target, column 2 as the feature
test_res <- test_features(tar_feat1[, 1],
                          cbind(tar_feat1[, 2], tar_feat2[, 2], tar_feat3[, 2]))
summary(test_res)
cut(test_res)
