Seurat (version 2.3.1)

JackStraw: Determine statistical significance of PCA scores.

Description

Randomly permutes a subset of the data and calculates projected PCA scores for these 'random' genes. It then compares the PCA scores for the 'random' genes with the observed PCA scores to determine statistical significance. The end result is a p-value for each gene's association with each principal component.
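The permutation idea described above can be illustrated with a small base-R sketch. This is a simplified illustration on simulated data, not Seurat's implementation: the helper `gene_loadings`, the simulated matrix `expr`, and the use of `prcomp` in place of Seurat's PCA routine are all assumptions made for the example.

```r
# Simplified sketch of the permutation test; NOT Seurat's actual code.
set.seed(42)
n_genes <- 200
n_cells <- 50

# Simulated expression matrix (genes x cells); the first 20 genes share
# a latent signal, so they should associate strongly with PC1.
expr <- matrix(rnorm(n_genes * n_cells), nrow = n_genes)
latent <- rnorm(n_cells)
expr[1:20, ] <- expr[1:20, ] + outer(rep(2, 20), latent)

# Hypothetical helper: absolute gene loadings on the first n_pc components.
gene_loadings <- function(m, n_pc) {
  pca <- prcomp(t(m), center = TRUE, scale. = FALSE)
  abs(pca$rotation[, seq_len(n_pc), drop = FALSE])
}

n_pc <- 2     # plays the role of num.pc
n_rep <- 50   # plays the role of num.replicate
prop <- 0.05  # plays the role of prop.freq

observed <- gene_loadings(expr, n_pc)

# For each replicate, shuffle a random subset of genes across cells,
# rerun the PCA, and keep the loadings of the shuffled ('random') genes.
null_scores <- do.call(rbind, lapply(seq_len(n_rep), function(i) {
  picked <- sample(n_genes, ceiling(prop * n_genes))
  permuted <- expr
  permuted[picked, ] <- t(apply(expr[picked, , drop = FALSE], 1, sample))
  gene_loadings(permuted, n_pc)[picked, , drop = FALSE]
}))

# Empirical p-value: fraction of null scores at least as large as observed.
p_values <- sapply(seq_len(n_pc), function(pc) {
  sapply(observed[, pc], function(x) mean(null_scores[, pc] >= x))
})
```

Here `p_values` has one row per gene and one column per principal component; genes that carry the simulated signal should receive small p-values for the first component.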

Usage

JackStraw(object, num.pc = 20, num.replicate = 100, prop.freq = 0.01,
  display.progress = TRUE, do.par = FALSE, num.cores = 1, maxit = 1000)

Arguments

object

Seurat object

num.pc

Number of PCs to compute significance for

num.replicate

Number of replicate samplings to perform

prop.freq

Proportion of the data to randomly permute for each replicate

display.progress

Print progress bar showing the number of replicates that have been processed.

do.par

Use parallel processing to speed up the computation. If set to TRUE, will use half of the machine's available cores (FALSE by default)

num.cores

If do.par = TRUE, specify the number of cores to use. Note that using more cores requires more free memory. If num.cores = 1 and do.par = TRUE, num.cores will be set to half of all available cores on the machine.

maxit

Maximum number of iterations to be performed by the irlba function of RunPCA
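The interaction between do.par and num.cores described above can be sketched as follows. The function `resolve_num_cores` is hypothetical and only mirrors the documented rule; returning a single core when do.par = FALSE is an assumption, and Seurat's internal logic may differ.

```r
library(parallel)

# Hypothetical helper mirroring the documented do.par / num.cores rule.
resolve_num_cores <- function(do.par = FALSE, num.cores = 1,
                              available = detectCores()) {
  # Assumption: without parallel processing a single core is used.
  if (!do.par) return(1L)
  # detectCores() can return NA on some platforms; fall back to one core.
  if (is.na(available)) available <- 1L
  # Documented rule: num.cores = 1 with do.par = TRUE means half of all cores.
  if (num.cores == 1) num.cores <- max(1L, available %/% 2L)
  as.integer(num.cores)
}

resolve_num_cores(do.par = TRUE, num.cores = 1, available = 8)
resolve_num_cores(do.par = TRUE, num.cores = 3, available = 8)
```

Under this sketch, a machine with 8 cores would use 4 in the first call, while the second call keeps the explicitly requested 3.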

Value

Returns a Seurat object where object@dr$pca@jackstraw@emperical.p.value represents p-values for each gene in the PCA analysis. If ProjectPCA is subsequently run, object@dr$pca@jackstraw@emperical.p.value.full then represents p-values for all genes.

References

Inspired by Chung et al., Bioinformatics (2014)

Examples

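# NOT RUN {
# Run JackStraw on the pbmc_small example object, silencing warnings
pbmc_small <- suppressWarnings(JackStraw(pbmc_small))
# Inspect the first few empirical p-values (one per gene and PC)
head(pbmc_small@dr$pca@jackstraw@emperical.p.value)
# }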