sigCheckClassifier(expressionSet, classes, signature, annotation, validationSamples, classifierMethod = svmI, ...)
An ExpressionSet object containing the data to be checked, including an expression matrix, feature labels, and samples.
Name of the class label to use, typically one of varLabels(expressionSet). There should be only two unique values in expressionSet$classes.
annotation parameter (default is the row names of the expressionSet). Alternatively, this can be an integer vector of feature indexes.
featureData field should be used as the annotation. If missing, the row names of the expressionSet are used as the feature names.
expressionSet should be used for validation. If present, a classifier will
be trained, using the specified signature and classification method, on the
non-validation samples, and its performance evaluated by attempting to
classify the validation samples. If missing, a leave-one-out (LOO) validation
method will be used, where a separate classifier will be trained to classify
each sample using the remaining samples.
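The two validation modes described above can be illustrated with a small base R sketch. This is not the package's implementation: the toy matrix, the class labels, and the `nearestCentroid` helper are all hypothetical, and a nearest-centroid rule stands in for the SVM that the package trains through MLearn.

```r
# Toy data (assumed): 2 signature features (rows) x 6 samples (columns).
expr <- matrix(c(1.0, 1.2, 0.9, 3.0, 3.1, 2.8,
                 0.5, 0.4, 0.6, 2.0, 2.2, 1.9),
               nrow = 2, byrow = TRUE)
classes <- factor(c("A", "A", "A", "B", "B", "B"))

# Hypothetical stand-in classifier: predict the class with the nearest centroid.
nearestCentroid <- function(trainX, trainY, testX) {
  # One centroid per class: features in rows, classes in columns.
  centroids <- sapply(levels(trainY), function(cl)
    rowMeans(trainX[, trainY == cl, drop = FALSE]))
  # For each test sample, pick the class with the smallest squared distance.
  apply(testX, 2, function(x)
    levels(trainY)[which.min(colSums((centroids - x)^2))])
}

# Hold-out mode: train on the non-validation samples, classify the rest.
validationSamples <- c(3, 6)
pred <- nearestCentroid(expr[, -validationSamples, drop = FALSE],
                        classes[-validationSamples],
                        expr[, validationSamples, drop = FALSE])
holdoutAccuracy <- mean(pred == classes[validationSamples])

# Leave-one-out mode: one classifier per sample, trained on all the others.
looPred <- sapply(seq_len(ncol(expr)), function(i)
  nearestCentroid(expr[, -i, drop = FALSE], classes[-i],
                  expr[, i, drop = FALSE]))
looAccuracy <- mean(looPred == classes)
```

In the leave-one-out case the number of classifiers trained equals the number of samples, so the cost grows with the size of the data set.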
MLearn in support of the classification method specified in classifierMethod.
$sigPerformance is the percentage of validationSamples correctly
classified (or, in the LOO case, the percentage of total samples correctly
classified by classifiers trained using the remaining samples).
$confusion is a confusion matrix in the form of a table showing
how many samples in each class were correctly or incorrectly classified,
corresponding to True Positives, True Negatives, False Positives,
and False Negatives.
$modePerformance is the percentage of validationSamples correctly
classified by a "mode" classifier (or, in the LOO case, the percentage of total
samples correctly classified by a "mode" classifier, which is equal to the number
of samples with the more-frequent category). The "mode" classifier always
predicts the category that appears most often in the training set.
If the training set is balanced between categories, one category will
always be predicted.
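As a rough illustration of how these three quantities relate, the sketch below computes them by hand in base R from made-up label vectors. The variable names mirror the returned fields only for readability. The values are fractions rather than percentages, and for simplicity the training labels for the "mode" baseline are assumed to be the same as `truth`. None of this is taken from the package source.

```r
# Made-up true and predicted classes for six validation samples (assumption).
truth     <- factor(c("A", "A", "A", "A", "B", "B"), levels = c("A", "B"))
predicted <- factor(c("A", "A", "B", "A", "B", "A"), levels = c("A", "B"))

# Confusion matrix: rows are true classes, columns are predicted classes.
confusion <- table(truth = truth, predicted = predicted)

# Fraction classified correctly: the diagonal of the confusion matrix.
sigPerformance <- sum(diag(confusion)) / sum(confusion)

# "Mode" baseline: always predict the most frequent class.
# which.max() takes the first level on a tie, so one class is always chosen.
modeClass <- names(which.max(table(truth)))
modePerformance <- mean(truth == modeClass)
```

A signature is only informative if its performance clearly exceeds this baseline. In this toy case the two values are equal, so the predictions are no better than always guessing the majority class.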
If validationSamples are specified, the MLInterfaces package is
used to train a classifier on the remaining samples. By default, a
Support Vector Machine classifier is used, but any machine learning approach
supported by MLearn can be specified. Baseline performance is
measured by the percentage of the validation samples classified correctly
(a confusion matrix of the results is also returned). If the validationSamples
are not specified, a leave-one-out (LOO) approach is used, whereby each
sample in turn is used as the validation sample, resulting in as many
classifiers being trained as there are samples.
sigCheck, sigCheckRandom, sigCheckPermuted, sigCheckKnown, MLearn
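library(SigCheck)
library(breastCancerNKI)
data(nki)
# Keep only samples with a non-missing e.dmfs value
nki <- nki[, !is.na(nki$e.dmfs)]
data(knownSignatures)
# validationSamples is not supplied, so leave-one-out validation is used
results <- sigCheckClassifier(nki, classes = "e.dmfs",
                              signature = knownSignatures$cancer$VANTVEER,
                              annotation = "HUGO.gene.symbol")
results$sigPerformance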