SigCheck (version 1.0.2)

sigCheck: Check classification potential of a gene signature against randomly selected gene signatures, known gene signatures, and permuted expression sets.

Description

High-level function for package SigCheck that runs all available checks against a classification signature.

Usage

sigCheck(expressionSet, classes, signature, annotation, validationSamples, classifierMethod = svmI, nIterations = 10, knownSignatures="cancer", plotResults=TRUE)

Arguments

expressionSet
An ExpressionSet object containing the data to be checked, including an expression matrix, feature labels, and samples.
classes
Specifies which label is to be used to determine the classification categories (must be one of varLabels(expressionSet)). There should be only two unique values in expressionSet$classes.
signature
A vector of feature labels specifying which features comprise the signature to be checked. These feature labels should match values as specified in the annotation parameter (default is row names in the expressionSet). Alternatively, this can be an integer vector of feature indexes.
annotation
Character string specifying which featureData field should be used as the annotation. If missing, the row names of the expressionSet are used as the feature names.
validationSamples
Optional specification, as a vector of sample indices, of which samples in the expressionSet should be used for validation. If present, a classifier will be trained on the non-validation samples, using the specified signature and classification method, and its performance evaluated by attempting to classify the validation samples. If missing, a leave-one-out (LOO) validation method will be used, where a separate classifier is trained to classify each sample using the remaining samples.
classifierMethod
The MLInterfaces learnerSchema object indicating the machine learning method to use for classification. Default is svmI for linear Support Vector Machine classification. See MLearn for available methods.
nIterations
For random gene and permutation tests, the number of iterations to run to compare classification outcomes.
knownSignatures
Either a character string specifying which set of signatures to use from the included sets in knownSignatures, or a list of previously identified signatures to compare performance against. Each element in the list should be a vector of feature labels. Default is to use the "cancer" signatures from the included knownSignatures data set, taken from Venet et al.
plotResults
If TRUE, sigCheckPlot is called four times to plot the results of all checks (laid out in a 2x2 plot matrix).
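
The two validation modes described under validationSamples can be illustrated with plain index arithmetic. The following base R sketch is not SigCheck code: the sample count and variable names are hypothetical, and it only shows which samples would be used for training and for testing in each mode.

```r
# Hypothetical toy setup: 10 samples, the last 3 held out for validation.
nSamples <- 10
validationSamples <- 8:10

# Hold-out mode: train on the non-validation samples, test on the rest.
trainIdx <- setdiff(seq_len(nSamples), validationSamples)
testIdx  <- validationSamples

# Leave-one-out mode (validationSamples missing): one split per sample,
# each one training on all remaining samples.
looSplits <- lapply(seq_len(nSamples), function(i) {
  list(train = setdiff(seq_len(nSamples), i), test = i)
})

trainIdx               # indices used for training in hold-out mode
length(looSplits)      # one train/test split per sample
looSplits[[1]]$train   # samples used to classify sample 1
```

Leave-one-out trains one classifier per sample, so it can be expected to take noticeably longer than a single hold-out split on large data sets.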

Value

A list containing five elements:
  • $checkClassifier is the result list returned by sigCheckClassifier.
  • $checkRandom is the result list returned by sigCheckRandom.
  • $checkKnown is the result list returned by sigCheckKnown.
  • $checkPermutedFeatures is the result list returned by sigCheckPermuted with toPermute="features".
  • $checkPermutedCategories is the result list returned by sigCheckPermuted with toPermute="categories".

Details

First, sigCheck calls sigCheckClassifier to establish the baseline performance of the signature being checked.

Next, it calls sigCheckRandom to check the performance of randomly selected signatures.

This is followed by a call to sigCheckKnown to check the performance of the signature against a database of signatures previously identified to discriminate in other, generally unrelated domains.

Finally, two calls are made to sigCheckPermuted to check the performance of randomly permuted data; the first call permutes the rows (toPermute="features"), while the second call permutes the categories (toPermute="categories").
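
The random-signature and permutation checks can be pictured with a small base R sketch. This is an illustration on hypothetical toy data, not the package's implementation. It assumes that random signatures are drawn with the same size as the signature being checked, and that permuting the rows means shuffling each feature's values across samples; neither detail is stated on this page.

```r
set.seed(1)  # for a reproducible illustration only

# Hypothetical toy data: 6 features (rows) x 6 samples (columns).
expr <- matrix(rnorm(36), nrow = 6,
               dimnames = list(paste0("gene", 1:6), paste0("s", 1:6)))
categories <- c(0, 0, 0, 1, 1, 1)
signature <- c("gene2", "gene5")
nIterations <- 5

# Random signatures: nIterations draws of feature labels,
# each the same size as the signature (assumption).
randomSignatures <- replicate(nIterations,
                              sample(rownames(expr), length(signature)),
                              simplify = FALSE)

# "features": shuffle each row's values across samples (assumed reading).
permFeatures <- t(apply(expr, 1, sample))
colnames(permFeatures) <- colnames(expr)  # restore the sample labels

# "categories": keep the expression values, shuffle the class labels.
permCategories <- sample(categories)
```

In each case the shuffled or random input would then be classified in the same way as the original signature, so that its performance can be compared against the baseline.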

References

Venet, David, Jacques E. Dumont, and Vincent Detours. "Most random gene expression signatures are significantly associated with breast cancer outcome." PLoS Computational Biology 7.10 (2011): e1002240.

See Also

sigCheckClassifier, sigCheckRandom, sigCheckPermuted, sigCheckKnown, MLearn

Examples

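# Load SigCheck and the NKI breast cancer data set
library(SigCheck)
library(breastCancerNKI)
data(nki)
# Keep only the samples that have a recorded e.dmfs value
nki <- nki[, !is.na(nki$e.dmfs)]
# Load the known signatures shipped with SigCheck
data(knownSignatures)
# Check the VANTVEER signature, holding out samples 275:319 for validation
results <- sigCheck(nki, classes="e.dmfs",
                    annotation="HUGO.gene.symbol",
                    signature=knownSignatures$cancer$VANTVEER,
                    validationSamples=275:319, nIterations=5)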
