This function takes the factorized \(W\) matrix, which holds the gene
loadings of each factor, and derives a ranked gene list for each factor. It
then runs a simple implementation of GSEA against the given gene sets. If the
genes in a given gene set are among the top loaded genes of a factor, this
function returns a high positive enrichment score (ES) together with a
significant p-value.
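The ranking and scoring idea described above can be sketched in a few lines. This is only an illustrative, language-neutral sketch in Python and not the package's implementation: it assumes an unweighted running-sum ES and a one-sided gene-label permutation test, and the names rank_genes, enrichment_score, permutation_p_value, toy_w and toy_gene_set are all hypothetical.

```python
import random

def rank_genes(loadings):
    """Return gene names sorted by loading, highest first."""
    return sorted(loadings, key=loadings.get, reverse=True)

def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum ES: step up on a hit, down on a miss."""
    hits = [gene in gene_set for gene in ranked_genes]
    n_hit = sum(hits)
    n_miss = len(hits) - n_hit
    if n_hit == 0 or n_miss == 0:
        raise ValueError("gene set must overlap the ranked list only partially")
    running, es = 0.0, 0.0
    for is_hit in hits:
        running += 1.0 / n_hit if is_hit else -1.0 / n_miss
        if abs(running) > abs(es):  # keep the largest deviation from zero
            es = running
    return es

def permutation_p_value(ranked_genes, gene_set, n_perm=1000, seed=0):
    """One-sided p-value for positive enrichment (assumed test design)."""
    rng = random.Random(seed)
    observed = enrichment_score(ranked_genes, gene_set)
    size = sum(gene in gene_set for gene in ranked_genes)
    at_least = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked_genes, size))  # random set, same size
        if enrichment_score(ranked_genes, null_set) >= observed:
            at_least += 1
    return observed, (at_least + 1) / (n_perm + 1)

# Made-up toy loadings: the gene set is top loaded in factor_1 only.
genes = ["G%d" % i for i in range(1, 11)]
toy_w = {
    "factor_1": dict(zip(genes, [0.9, 0.8, 0.7, 0.2, 0.15, 0.1, 0.05, 0.04, 0.02, 0.01])),
    "factor_2": dict(zip(genes, [0.01, 0.02, 0.03, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9])),
}
toy_gene_set = {"G1", "G2", "G3"}

gsea_toy = {}
for factor, loadings in toy_w.items():
    gsea_toy[factor] = permutation_p_value(rank_genes(loadings), toy_gene_set)
    print(factor, gsea_toy[factor])
```

In this toy input the gene set sits at the top of the ranking for factor_1 and at the bottom for factor_2, so only factor_1 should show a positive ES with a small p-value.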
For the returned result object, use print() or summary() to show concise
results, and use plot() to visualize the GSEA statistics.
This function can be useful in various scenarios:
For example, when clusters with strong cell cycle activity are detected,
users can apply this function with cell cycle gene sets to identify whether
any factor is enriched with such genes. Then, when aligning the iNMF factor
loadings downstream, users can simply exclude these factors so that the
variation due to cell cycle is regressed out. The objects cc.gene.human and
cc.gene.mouse are delivered with the package for convenience.
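As a rough sketch of the selection step in this scenario, the factors to exclude could be picked by requiring a positive ES and a small p-value. The result layout, the 0.05 cutoff, the function name factors_to_exclude and the numbers below are all assumptions made up for illustration; they are not the structure of the object this function returns.

```python
def factors_to_exclude(gsea_by_factor, alpha=0.05):
    """Return factor indices with positive ES and p-value below alpha.

    gsea_by_factor maps a factor index to an (ES, p-value) pair; this
    layout is hypothetical and chosen only for the example.
    """
    return sorted(
        factor
        for factor, (es, p_value) in gsea_by_factor.items()
        if es > 0 and p_value < alpha
    )

# Made-up values standing in for cell cycle GSEA results on four factors.
toy_results = {
    1: (0.82, 0.001),
    2: (-0.30, 0.400),
    3: (0.15, 0.620),
    4: (0.71, 0.012),
}

exclude = factors_to_exclude(toy_results)
keep = [factor for factor in sorted(toy_results) if factor not in exclude]
print("exclude:", exclude)
print("keep:", keep)
```

The indices in keep would then be the factors passed on to the downstream alignment step.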
In other cases, this function can also be used to understand the biological
meaning of each cluster. Since the downstream clustering result is largely
determined by the top loaded factor in each cell, understanding which genes
are loaded in the top factor helps understand the identity and activity of
the cell. This requires users to prepare their own gene sets.