A collection of functions to select features.
fsSample(object, top = 0, ...)
fsNULL(object, top = 0, ...)
fsANOVA(object, top = 0, ...)
fsInclude(object, top = 0, include)
fsStats(object, top = 0, ...)
fsPrcomp(object, top = 0, ...)
fsPathClassRFE(object, top = 0, ...)
fsEbayes(object, top = 0, ...)
fsMrmre(object, top = 0, ...)
# S4 method for ExprsArray
fsSample(object, top = 0, ...)
# S4 method for ExprsArray
fsNULL(object, top = 0, ...)
# S4 method for ExprsBinary
fsInclude(object, top = 0, include)
# S4 method for ExprsArray
fsANOVA(object, top = 0, ...)
# S4 method for ExprsBinary
fsStats(object, top = 0, how = c("t.test", "ks.test"), ...)
# S4 method for ExprsBinary
fsPrcomp(object, top = 0, ...)
# S4 method for ExprsBinary
fsPathClassRFE(object, top = 0, ...)
# S4 method for ExprsBinary
fsEbayes(object, top = 0, ...)
# S4 method for ExprsBinary
fsMrmre(object, top = 0, ...)
object:
Specifies the ExprsArray object to undergo feature selection.
top:
A numeric scalar or character vector. A numeric scalar indicates
the number of top features that should undergo feature selection. A character vector
indicates specifically which features by name should undergo feature selection.
Set top = 0 to include all features. A numeric vector can also be used
to indicate specific features by location, similar to a character vector.
...:
Arguments passed to the respective wrapped function.
include:
A character vector. The names of features to rank above all others.
This preserves the feature order otherwise. Argument for fsInclude only.
how:
A character string. Toggles between the sub-routines "t.test" and
"ks.test". Argument for fsStats only.
Returns an ExprsArray object.
fsSample:
Method to perform random feature selection using base::sample.
fsNULL:
Method to perform a NULL feature selection and return input unaltered.
fsInclude:
Method to rank explicitly stated features above all others.
fsANOVA:
Method to perform ANOVA feature selection using stats::aov.
fsStats:
Method to perform statistics-based feature selection using stats::t.test and others.
fsPrcomp:
Method to perform principal components analysis using stats::prcomp.
fsPathClassRFE:
Method to perform SVM-RFE feature selection using pathClass::fit.rfe.
fsEbayes:
Method to perform empirical Bayes feature selection using limma::ebayes.
fsMrmre:
Method to perform mRMR feature selection using mRMRe::mRMR.classic.
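As a rough illustration of what statistics-based ranking involves, the base R sketch below orders features by two-sample p-value, toggling between stats::t.test and stats::ks.test. It is not the implementation used by fsStats; the helper rank_features, the simulated matrix x, and the labels vector are all hypothetical names introduced for this example.

```r
# Hypothetical sketch (not exprso code): rank features by two-sample p-value.
set.seed(42)
n <- 20
labels <- rep(c("Control", "Case"), each = n / 2)

# Simulated data: rows are subjects, columns are features.
x <- matrix(rnorm(n * 6), nrow = n, dimnames = list(NULL, paste0("gene", 1:6)))

# Shift one feature in the case group so that it is clearly differential.
x[labels == "Case", "gene2"] <- x[labels == "Case", "gene2"] + 5

rank_features <- function(x, labels, how = c("t.test", "ks.test")) {
  how <- match.arg(how)
  test <- switch(how, t.test = stats::t.test, ks.test = stats::ks.test)
  groups <- unique(labels)
  # One p-value per feature, comparing the two groups.
  p <- apply(x, 2, function(v) {
    test(v[labels == groups[1]], v[labels == groups[2]])$p.value
  })
  # Smallest p-value first.
  names(sort(p))
}

ranked_t  <- rank_features(x, labels, how = "t.test")
ranked_ks <- rank_features(x, labels, how = "ks.test")
```

Both rankings place the shifted feature first; the order of the remaining features may differ between the two tests.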
Given the high dimensionality of most genomic datasets, it is prudent, and often necessary,
to prioritize which features to include during classifier construction. Although many
feature selection methods exist, this package provides wrappers for some of the most popular ones.
Each wrapper (1) pre-processes the ExprsArray input, (2) performs the feature selection,
and (3) returns an ExprsArray output with an updated feature selection history.
You can use any number of feature selection methods in tandem, and in any order.
For all feature selection methods, the slots @preFilter and @reductionModel store the
feature selection and dimension reduction history, respectively. This history gets passed
along to prepare the test or validation set during model deployment, ensuring that these
sets undergo the same feature selection and dimension reduction as the training set.
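The idea of replaying a training-set history on held-out data can be sketched in base R as follows. This is only an analogy under stated assumptions, not how exprso stores or applies @preFilter and @reductionModel: the objects kept and pca are hypothetical stand-ins for the two histories, and variance is used in place of a real ranking statistic.

```r
# Hypothetical sketch (not exprso internals): replay training-set feature
# selection and PCA on a held-out set.
set.seed(1)
n_feat <- 20
feats <- paste0("gene", seq_len(n_feat))
train <- matrix(rnorm(n_feat * 30), nrow = 30, dimnames = list(NULL, feats))
test  <- matrix(rnorm(n_feat * 10), nrow = 10, dimnames = list(NULL, feats))

# "Feature selection history": the ordered names of the features kept.
# Variance stands in for a real ranking statistic here.
kept <- names(sort(apply(train, 2, var), decreasing = TRUE))[1:8]

# "Reduction model": the PCA fit learned from the training set only.
pca <- prcomp(train[, kept], center = TRUE, scale. = FALSE)

# Deployment: subset the test set by the same names, then project it with the
# training-set centering and rotation. Nothing is re-estimated from test data.
train_scores <- pca$x[, 1:3]
test_scores  <- predict(pca, newdata = test[, kept])[, 1:3]
```

Because the test set is projected with parameters learned from the training set, no information from the test samples leaks into the transformation.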
When users plan to apply multiple feature selection or dimension reduction steps, the top
argument manages which features (e.g., gene expression values) to send through each
feature selection or dimension reduction procedure. For top, a numeric scalar indicates
the number of top features to use, while a character vector indicates specifically which
features to use. In this way, the user sets which features to feed INTO the fs method
(NOT which features the user expects OUT). The example at the end of this page shows how
to apply dimension reduction to the top 50 features as selected by the Student's t-test.
Set top = 0 to pass all features through an fs method.
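The semantics of top described above can be sketched with a small base R helper. The function resolve_top is hypothetical and not part of exprso; it assumes the feature names arrive already ordered by a previous ranking and simply returns the names that would be fed into the next step.

```r
# Hypothetical helper (not part of exprso): resolve a `top` argument against
# feature names already ordered by a previous ranking.
resolve_top <- function(ranked_features, top = 0) {
  if (is.character(top)) {
    # Character vector: use exactly these features, by name.
    unknown <- setdiff(top, ranked_features)
    if (length(unknown) > 0) {
      stop("Unknown features: ", paste(unknown, collapse = ", "))
    }
    return(top)
  }
  if (length(top) == 1 && top == 0) {
    # top = 0: pass every feature through.
    return(ranked_features)
  }
  if (length(top) == 1) {
    # Numeric scalar: use the first `top` ranked features.
    return(head(ranked_features, top))
  }
  # Numeric vector: use features by position.
  ranked_features[top]
}

ranked <- c("g3", "g1", "g4", "g2")
resolve_top(ranked, 0)              # all four features, in ranked order
resolve_top(ranked, 2)              # "g3" "g1"
resolve_top(ranked, c("g4", "g1"))  # "g4" "g1"
resolve_top(ranked, c(1, 3))        # "g3" "g4"
```

In every case the return value is the set of features going into the next method, not the set that method will output.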
Note that not all feature selection methods will generalize to multi-class data.
A feature selection method will fail when applied to an ExprsMulti object
unless that feature selection method has an ExprsMulti method.
Note that fsMrmre crashes when supplied a very large feature_count argument
owing to its implementation in the imported package mRMRe.
fs
build
doMulti
exprso-predict
plCV
plGrid
plGridMulti
plMonteCarlo
plNested
# NOT RUN {
library(golubEsets)
data(Golub_Merge)
array <- arrayEset(Golub_Merge, colBy = "ALL.AML", include = list("ALL", "AML"))
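array <- modFilter(array, 20, 16000, 500, 5) # pre-filter Golub a la Deb 2003
array <- modTransform(array) # log transform
array <- modNormalize(array, c(1, 2)) # normalize gene and subject vectors
arrays <- splitSample(array, percent.include = 67) # split into training and validation sets
# Rank all features (top = 0) in the training set by t-test
array.train <- fsStats(arrays[[1]], top = 0, how = "t.test")
# Feed the top 50 ranked features into principal components analysis
array.train <- fsPrcomp(array.train, top = 50)
# Build a linear SVM from the top 5 features of the reduced training set
mach <- buildSVM(array.train, top = 5, kernel = "linear", cost = 1)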
# }