Learn R Programming

SigCheck (version 2.4.0)

sigCheckPermuted: Check signature performance against performance on randomly permuted data.

Description

Performance of a signature on intact data is compared to performance on permuted data and/or metadata. Data may be permuted by by feature (expression values of each feature permuted across samples), samples (expression values of all features permuted within each sample). Metadata may be permuted by categories (permuted assignment of samples to classification categories) or survival (permuted assignment of survival times to samples).

Usage

sigCheckPermuted(check, toPermute="categories", iterations=10)

Arguments

check
A SigCheckObject, as returned by sigCheck.
toPermute
Character string or vector of strings indicating what should be permuted. Allowable values:
  • "features": the expression values for each feature will be permuted (permutation by row).

  • "samples": the expression values for each sample will be permuted (permutation by column).

  • "survival": the values in survival will be permuted.

  • "categories": the values in classes will be permuted.
iterations
The number of permuted dataset the primary signature will be compared to. This should be at least 1,000 to compute a meaningful empirical p-value for comparative performance.

Value

A result list with the following elements:
  • $checkType is equal to "Permuted".
  • $permute is equal to the passed value of toPermute.
  • $tests is the number of tests run (equal to iterations.)
  • $rank is the performance rank of the signature on the intact dataset compared to its performance in the permuted datasets.
  • $checkPval is the empirical p-value computed using the performance scores of the signature on permuted datasets as a null distribution. A value of zero indicates that the signature did not perform better on any permuted datasets than it does using the intact data.
  • $survivalPval represents the performance of the primary signature on the original dataset if survival data were provided.
  • $survivalPvalsPermuted is a vector of performance scores (p-values) for each permuted dataset, if survival data were provided.
  • $trainingPvalsPermuted is a vector of performance scores (p-values) for each permuted dataset, if survival data and separate validation samples were provided.
  • $sigPerformance is the proportion of validation samples correctly classified using the intact dataset if a classifier was used.
  • $modePerformance is the proportion of validation samples correctly classified in the intact dataset using a mode classifier.
  • $performancePermuted is a vector of classification performance scores for each permuted dataset, each indicating the proportion of validation samples correctly classified if a classifier was used.

Details

The primary signature will be evaluated against each permuted dataset in the same manner as for the intact dataset.

If survival data were supplied, a survival analysis will be carried out on the validation samples, and a p-value computed as a performance measure. If no survival data are available, the training samples will be used to train a classifier, and the performance score will be percentage of validation samples correctly classified. (If no validation samples are provided, leave-one-out cross validation will be used to calculate the classification performance for each permuted dataset).

An empirical p-value will be computed based on the percentile rank of the performance of the signature on the intact dataset compared to a null distribution of the performance of the signature on all the permuted datasets.

See Also

sigCheck, sigCheckAll, sigCheckRandom, sigCheckKnown

Examples

Run this code
#Disable parallel so Bioconductor build won't hang
library(BiocParallel)
register(SerialParam())

library(breastCancerNKI)
data(nki)
nki <- nki[,!is.na(nki$e.dmfs)]
data(knownSignatures)
ITERATIONS <- 5 # should be at least 20, 1000 for real checks

## survival analysis
check <- sigCheck(nki, classes="e.dmfs", survival="t.dmfs",
                  signature=knownSignatures$cancer$VANTVEER,
                  annotation="HUGO.gene.symbol",
                  validationSamples=150:319)

par(mfrow=c(1,2))
permutedCategories <- sigCheckPermuted(check, toPermute="categories", 
                                       iterations=ITERATIONS)
permutedCategories$checkPval
sigCheckPlot(permutedCategories)
permutedSurvival <- sigCheckPermuted(check, toPermute="survival", 
                                     iterations=ITERATIONS)
permutedSurvival$checkPval
sigCheckPlot(permutedSurvival)

Run the code above in your browser using DataLab