rliger (version 2.2.1)

factorGSEA: Test all factors for enrichment in a gene set

Description

This function takes the factorized \(W\) matrix, which holds the gene loadings for each factor, and derives a ranked gene list per factor. It then runs a simple implementation of GSEA against the given gene sets. If the genes in a given gene set are among the top-loaded genes of a factor, the function returns a high positive enrichment score (ES) together with a significant p-value for that factor.
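
The idea of ranking genes by loading and scoring a gene set against that ranking can be sketched in base R. This is a minimal, unweighted running-sum illustration with a one-sided permutation p-value, written under the assumption that a classic GSEA-style statistic is meant; it is not the code rliger uses internally. The toy loadings, the gene names, and the helper enrichmentScore() are all hypothetical.

```r
# Toy loadings for one factor: 20 hypothetical genes, sorted from high to low
set.seed(1)
loadings <- sort(setNames(runif(20), paste0("gene", 1:20)), decreasing = TRUE)

# Hypothetical gene set placed at the very top of the ranking
geneSet <- names(loadings)[1:4]

# Unweighted running-sum enrichment score over a ranked vector of gene names
# (assumes at least one gene of the set is present in the ranking)
enrichmentScore <- function(rankedGenes, geneSet) {
    hit <- rankedGenes %in% geneSet
    nHit <- sum(hit)
    nMiss <- length(rankedGenes) - nHit
    # Step up on each hit, step down on each miss
    running <- cumsum(ifelse(hit, 1 / nHit, -1 / nMiss))
    # ES is the running-sum value with the largest absolute deviation from zero
    running[which.max(abs(running))]
}

es <- enrichmentScore(names(loadings), geneSet)

# Permutation null: shuffle the ranking and recompute the score each time
nPerm <- 1000
nullES <- replicate(nPerm, enrichmentScore(sample(names(loadings)), geneSet))

# One-sided empirical p-value for positive enrichment
pval <- (sum(nullES >= es) + 1) / (nPerm + 1)
```

Because the toy gene set occupies the top of the ranking, the score is strongly positive and few shuffled rankings match it, which mirrors the "top loaded" case described above.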

For the returned result object, use print() or summary() to show concise results, and use plot() to visualize the GSEA statistics.

This function can be useful in various scenarios:

For example, when clusters with strong cell cycle activity are detected, users can apply this function with cell cycle gene sets to identify whether any factor is enriched for such genes. Then, when aligning the iNMF factor loadings downstream, users can simply exclude these factors so that the variation due to cell cycle is regressed out. Objects cc.gene.human and cc.gene.mouse are delivered in the package for convenience.

In other cases, this function can also be used to understand the biological meaning of each cluster. Since the downstream clustering result is largely determined by the top-loaded factor in each cell, knowing which genes are highly loaded in that factor helps explain the identity and activity of the cell. This requires users to prepare their own gene sets.

Usage

factorGSEA(
  object,
  geneSet,
  nPerm = 1000,
  seed = 1,
  verbose = getOption("ligerVerbose", TRUE)
)

Value

If geneSet is a single character vector, returns a data frame with enrichment score (ES), normalized enrichment score (NES), and p-value for the test in each factor. If geneSet is a list, returns a list of such data frames.

Arguments

object

A liger object with factorized \(W\) matrix available.

geneSet

A character vector for a single gene set, or a list of character vectors for multiple gene sets.

nPerm

Integer, the number of permutations used to estimate the p-value. Default 1000.

seed

Integer, the random seed. Default 1. Set to NULL to leave the seed unset.

verbose

Logical, whether to print a progress bar. Default getOption("ligerVerbose"), or TRUE if that option is not set.
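
The two accepted shapes of geneSet can be illustrated as follows. This is only a sketch: the gene symbols are commonly cited cell cycle markers used here as placeholders, and the list names s.genes and g2m.genes are an assumption that echoes the 'g2m.genes' name used in the Examples section. The factorGSEA() calls are left as comments because they need a factorized liger object, which is not created here.

```r
# Placeholder gene sets; the symbols are for illustration only
sPhase <- c("MCM5", "PCNA", "TYMS", "MCM2")
g2mPhase <- c("TOP2A", "MKI67", "CCNB2", "CDK1")

# A single gene set: a plain character vector
singleSet <- sPhase

# Multiple gene sets: a list of character vectors, named here for readability
multiSets <- list(s.genes = sPhase, g2m.genes = g2mPhase)

# With a factorized liger object `pbmc` available, calls following the
# Usage signature above would look like:
# resSingle <- factorGSEA(pbmc, geneSet = singleSet)
# resMulti  <- factorGSEA(pbmc, geneSet = multiSets, nPerm = 1000, seed = 1)
```

Keeping seed at a fixed value makes the permutation-based p-values repeatable between runs, while seed = NULL leaves the current random state untouched.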

Examples

library(rliger)

# Factorize the example dataset before testing factors for enrichment
pbmc <- pbmc %>%
    selectBatchHVG() %>%
    scaleNotCenter() %>%
    runINMF()
# Test every factor against the cell cycle gene sets
factorGSEAres <- factorGSEA(pbmc, ccGeneHuman)
# Print summary of significant results
print(factorGSEAres)
summary(factorGSEAres)
# Make GSEA plot for a given gene set and factor
plot(factorGSEAres, geneSetName = 'g2m.genes', useFactor = 'Factor_1')