GSNA (version 0.1.4.2)

scoreJaccardMatrix_C: scoreJaccardMatrix_C

Description

Takes a presence/absence matrix with genes as rows and gene modules as columns and calculates a matrix of pairwise Jaccard index values between the modules.

Usage

scoreJaccardMatrix_C(geneSetCollection_m)

Value

This function returns a matrix of Jaccard index values between gene modules. Values on the diagonal corresponding to self-Jaccard indices are returned as NA.

Arguments

geneSetCollection_m

(required) A logical presence/absence matrix representation of a gene set collection in which columns correspond to gene sets, rows correspond to genes and values are TRUE if a gene is present in a gene set and FALSE otherwise. Row and column names correspond to gene symbols and gene set identifiers, respectively. NOTE: for a typical GSNA analysis, this matrix would include only observed filtered genes and significant gene set hits from pathways analysis. Using a matrix version of the full MSigDB without filtering genes, for example, would likely be unworkably slow and memory intensive.
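As an illustration of the expected input shape, the following base R sketch builds a small logical presence/absence matrix from a named list of gene sets. The set names (SET_A, SET_B) and their gene memberships are invented for this example and are not part of GSNA; in a real analysis the package's makeFilteredGenePresenceAbsenceMatrix() would normally be used instead.

```r
# Hypothetical toy gene set collection (names and members are illustrative only)
gene_sets <- list(
  SET_A = c("TP53", "BRCA1", "EGFR"),
  SET_B = c("BRCA1", "EGFR", "MYC")
)

# All genes that appear in at least one set, used as the row index
genes <- sort(unique(unlist(gene_sets)))

# One logical column per gene set: TRUE where the gene is a member
pa <- vapply(gene_sets,
             function(s) genes %in% s,
             logical(length(genes)))
rownames(pa) <- genes

print(pa)
```

The result has genes as row names, gene set identifiers as column names, and TRUE/FALSE values, which matches the layout described for geneSetCollection_m.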

Details

The Jaccard index J for two sets A and B is defined as:

$$ J(A,B) = \dfrac{\lvert A \cap B \rvert}{\lvert A \cup B \rvert} $$
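The definition can be checked with a short plain R reference calculation. This is only a sketch of the formula applied to every pair of columns, not the package's compiled implementation; the helper name jaccard_matrix_r and the toy matrix are assumptions made for this example. The diagonal is set to NA to mirror the return value documented above.

```r
# Hypothetical helper: pairwise Jaccard indices between the columns of a
# logical presence/absence matrix (genes in rows, gene sets in columns).
jaccard_matrix_r <- function(pa) {
  pa_num <- pa * 1                              # logical -> numeric, keeps dimnames
  inter  <- crossprod(pa_num)                   # |A n B| for every pair of columns
  sizes  <- colSums(pa_num)                     # |A| for every column
  uni    <- outer(sizes, sizes, "+") - inter    # |A u B| = |A| + |B| - |A n B|
  j <- inter / uni                              # NaN if both sets are empty
  diag(j) <- NA                                 # self-Jaccard values reported as NA
  j
}

# Toy matrix: A = {g1, g2, g3}, B = {g2, g3, g4}, C = {g5}
toy <- matrix(
  c(TRUE,  TRUE,  TRUE,  FALSE, FALSE,
    FALSE, TRUE,  TRUE,  TRUE,  FALSE,
    FALSE, FALSE, FALSE, FALSE, TRUE),
  nrow = 5,
  dimnames = list(paste0("g", 1:5), c("A", "B", "C"))
)

j <- jaccard_matrix_r(toy)
print(j)
```

Here A and B share 2 genes out of 4 in their union, so J(A,B) = 0.5, while C shares no genes with either set, giving 0. Values always lie between 0 (disjoint sets) and 1 (identical sets).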

See Also

buildGeneSetNetworkJaccard(), scoreLFMatrix_C()

Examples

library(GSNA)

# Get the background set of observable genes from the
# expression data:
gene_background <- toupper(rownames( Bai_empty_expr_mat ))

# Using the sample gene set collection **Bai_gsc.tmod**,
# generate a gene presence-absence matrix filtered for the
# ref.background of observable genes:
presence_absence.mat <-
 makeFilteredGenePresenceAbsenceMatrix( ref.background = gene_background,
                                        geneSetCollection = Bai_gsc.tmod )

jaccard.mat <- scoreJaccardMatrix_C( presence_absence.mat )
