twilight.filtering: Permutation filtering

Description

This function performs permutation filtering: it searches for permutations of the class labels that produce a set of complete null scores. Depending on the test setting, the algorithm iteratively generates valid permutations of the class labels and computes scores. These are transformed to pooled p-values, and each set of permutation p-values is tested for uniformity. Permutations with acceptably uniform p-value distributions are kept. The search stops when num.perm permutations have been collected, or when the number of possible unique permutations is smaller than num.perm. The default values are similar to those of function twilight.pval, but please note the details below.
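The idea described above can be illustrated with a simplified, single-pass sketch in base R. It is not the twilight implementation: the toy data, the helper names t_scores and pool_p, the use of ks.test as the uniformity check, and the fixed 0.05 threshold are all assumptions made for illustration. The real algorithm is iterative and keeps num.take permutations per step.

```r
# Simplified, single-pass sketch of permutation filtering (unpaired t-test).
# Assumptions: toy data, ks.test() as uniformity check, fixed 0.05 cutoff.
set.seed(1)
x <- matrix(rnorm(200 * 10), nrow = 200)   # 200 features, 10 samples
y <- rep(c(0, 1), each = 5)                # binary class labels

# Row-wise Welch t-statistics for a given labelling (hypothetical helper).
t_scores <- function(x, y) {
  a <- x[, y == 1, drop = FALSE]
  b <- x[, y == 0, drop = FALSE]
  se <- sqrt(apply(a, 1, var) / ncol(a) + apply(b, 1, var) / ncol(b))
  (rowMeans(a) - rowMeans(b)) / se
}

# Candidate permutations (rows) and their score vectors (columns of null).
cand <- t(replicate(30, sample(y)))
null <- apply(cand, 1, function(p) t_scores(x, p))

# Pooled p-value: share of all pooled null scores at least as extreme.
pooled <- abs(as.vector(null))
pool_p <- function(s) vapply(abs(s), function(v) mean(pooled >= v), numeric(1))

# Test each permutation's pooled p-values for uniformity and keep the
# permutations that do not deviate significantly.
ks_p <- apply(null, 2, function(s) {
  suppressWarnings(ks.test(pool_p(s), "punif")$p.value)
})
kept <- cand[ks_p > 0.05, , drop = FALSE]
dim(kept)
```

Each row of kept is a labelling whose pooled p-values passed the uniformity check in this toy setting.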

Usage

twilight.filtering(xin, yin, method = "fc", paired = FALSE, s0 = 0, verbose = TRUE, num.perm = 1000, num.take = 50)

Arguments

xin

Either an expression set (ExpressionSet) or a data matrix with rows corresponding to features and columns corresponding to samples.

yin

A numerical vector containing class labels. The higher label denotes the case samples and the lower label the control samples, so that the test compares case vs. control. For correlation scores, yin can be any numerical vector of length equal to the number of samples.

method

Character string: "fc" for fold change equivalent test (that is log ratio test), "t" for t-test, and "z" for Z-test. With "pearson" or "spearman", the test statistic is either Pearson's correlation coefficient or Spearman's rank correlation coefficient.

paired

Logical value indicating whether the samples are paired. Ignored if method="pearson" or method="spearman".

s0

Fudge factor for variance correction in the Z-test. Takes effect only if method="z". If s0=0, the fudge factor is set to the median of the pooled standard deviations.
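To make the role of the fudge factor concrete, here is a minimal sketch of a variance-corrected score for the unpaired case. The form z = difference / (s + s0) is an assumption in the style of common regularized statistics; this help page does not state the exact formula twilight uses, and all variable names are illustrative.

```r
# Sketch of a Z-type score with a fudge factor (unpaired case).
# Assumption: z = mean difference / (pooled standard deviation + s0).
set.seed(1)
x <- matrix(rnorm(100 * 8), nrow = 100)    # 100 features, 8 samples
y <- rep(c(0, 1), each = 4)

n1 <- sum(y == 1)
n0 <- sum(y == 0)
a <- x[, y == 1]
b <- x[, y == 0]
d <- rowMeans(a) - rowMeans(b)             # mean difference per feature

# Pooled standard deviation of the mean difference, per feature.
s <- sqrt((1 / n1 + 1 / n0) *
          ((n1 - 1) * apply(a, 1, var) + (n0 - 1) * apply(b, 1, var)) /
          (n1 + n0 - 2))

s0 <- median(s)           # default described above for s0 = 0
z <- d / (s + s0)         # variance-corrected score
t_stat <- d / s           # ordinary t-score, for comparison
summary(abs(z) / abs(t_stat))
```

Because s0 is positive, every corrected score is smaller in magnitude than its t-score, and features with very small variance are damped most.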

verbose

Logical value for message printing.

num.perm

Number of permutations. Within twilight.pval, num.perm is set to B.
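Since the search also stops when fewer unique permutations exist than requested, it can help to count them beforehand. The sketch below uses standard combinatorics, not twilight code: it assumes that unpaired two-class labelings are counted by a binomial coefficient and paired designs by within-pair swaps. The class sizes and number of pairs are hypothetical.

```r
# Counting possible labelings (standard combinatorics; an assumption about
# how "unique permutations" are counted, not taken from the twilight source).
num.perm <- 1000

# Unpaired two-class design: choose which samples get the case label.
y <- c(rep(0, 27), rep(1, 11))             # hypothetical class sizes
n_unpaired <- choose(length(y), sum(y == 1))

# Paired design: each pair can keep or swap its labels.
n_pairs <- 6                               # hypothetical number of pairs
n_paired <- 2^n_pairs

# TRUE means num.perm distinct labelings are available.
c(unpaired = n_unpaired >= num.perm, paired = n_paired >= num.perm)
```

With only six pairs there are 64 labelings, so a request for 1000 permutations could not be met in that design.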

num.take

Number of permutations kept in each step of the iterative filtering. Within twilight.pval, num.take is set to the minimum of 50 and ceiling(num.perm/20).
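The rule used within twilight.pval can be tabulated for a few values of num.perm. The steps column assumes that every iteration contributes exactly num.take permutations. This is an assumption, although it agrees with the example in this page (num.perm=50 with num.take=10 gives 5 steps).

```r
# num.take as set within twilight.pval, for several values of num.perm.
num.perm <- c(50, 200, 1000, 5000)
num.take <- pmin(50, ceiling(num.perm / 20))

# Assumption: each iteration step keeps exactly num.take permutations.
steps <- ceiling(num.perm / num.take)

data.frame(num.perm, num.take, steps)
```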

Value

Returns a matrix with permuted input labels yin as rows. Please note that this matrix is already translated into binary labels for two-sample testing or to ranks if Spearman's correlation was chosen. The resulting permutation matrix can be directly passed into function twilight.pval. Please note that the first row always contains the original input yin to be consistent with the other permutation functions in package twilight.
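The layout of the returned matrix can be mimicked with base R. This sketch shows only the shape, not the filtering: it uses random permutations, and the 0/1 coding with 1 for the higher label is an assumption about how the binary translation looks.

```r
# Mimic the layout of the returned permutation matrix (no filtering here).
yin <- c(1, 1, 1, 2, 2, 2)                 # hypothetical labels: 2 = case

# Assumed binary translation: 1 for the higher label, 0 for the lower.
ybin <- as.numeric(yin == max(yin))

set.seed(1)
perms <- t(replicate(4, sample(ybin)))     # four random permutations

# First row holds the original labeling, later rows the permutations.
yperm <- rbind(ybin, perms, deparse.level = 0)
yperm
```

Dropping the first row, as the example in this page does with yperm[-1,], leaves only the permuted labelings.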

Details

See vignette.

References

Scheid S and Spang R (2004): A stochastic downhill search algorithm for estimating the local false discovery rate, IEEE TCBB 1(3), 98-108.

Scheid S and Spang R (2005): twilight; a Bioconductor package for estimating the local false discovery rate, Bioinformatics 21(12), 2921-2922.

Scheid S and Spang R (2006): Permutation filtering: A novel concept for significance analysis of large-scale genomic data, in: Apostolico A, Guerra C, Istrail S, Pevzner P, and Waterman M (Eds.): Research in Computational Molecular Biology: 10th Annual International Conference, Proceedings of RECOMB 2006, Venice, Italy, April 2-5, 2006. Lecture Notes in Computer Science vol. 3909, Springer, Heidelberg, pp. 338-347.

Examples

## Not run: 
# ### Leukemia data set of Golub et al. (1999).
# library(golubEsets)
# data(Golub_Train)
# 
# ### Variance-stabilizing normalization of Huber et al. (2002).
# library(vsn)
# golubNorm <- justvsn(Golub_Train)
# 
# ### A binary vector of class labels.
# id <- as.numeric(Golub_Train$ALL.AML)
# 
# ### Do an unpaired t-test.
# ### Let's have a quick example with 50 filtered permutations only.
# ### With num.take=10, we only need 5 iteration steps.
# yperm <- twilight.filtering(golubNorm,id,method="t",num.perm=50,num.take=10)
# dim(yperm)
# 
# ### Let's check that the filtered permutations really produce uniform p-value distributions.
# ### The first row is the original labeling, so we try the second permutation.
# yperm <- yperm[-1,]
# b <- twilight.pval(golubNorm,yperm[1,],method="t",yperm=yperm)
# hist(b$result$pvalue)
# ## End(Not run)