GSNData object constructor, used to generate new GSNData objects.
GSNData(distances = list(), ...)
A new GSNData object.
distances
Optional parameter containing a list of module-module distance metric data organized by the name of the distance metric used, e.g. lf, jaccard, stlf.
...
Additional arguments. Object fields can be set as named arguments this way using name = value pairs.
The GSNData object can contain multiple distance matrices, including log Fisher (lf) and Jaccard (jaccard). These distances,
along with the associated pared distances and significance order parameters, are stored in named sublists within the $distances
list. Each sublist is named after its distance metric (lf, jaccard, etc.) and is accessed as $distances[[DIST]]. A sublist
contains a distance matrix $distances[[DIST]]$matrix, a significance order $distances[[DIST]]$optimal_extreme (e.g. "min" for lf
and "max" for jaccard) and, after paring, a pared distance matrix $distances[[DIST]]$pared.
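The nesting described here can be pictured with a minimal base R sketch. It uses a plain list as a stand-in for the $distances field, not a real GSNData object; the gene set IDs and distance values are invented for illustration, and the "min"/"max" values follow the field description for $distances[[DIST]]$optimal_extreme.

```r
# Plain-list stand-in for the $distances field; NOT a real GSNData object.
set_ids <- c("GS1", "GS2", "GS3")   # hypothetical gene set IDs

# Hypothetical raw distance matrices (values invented for illustration).
lf_mat <- matrix(c(   NA, -12.1, -0.4,
                   -12.1,    NA, -3.7,
                    -0.4,  -3.7,   NA),
                 nrow = 3, dimnames = list(set_ids, set_ids))
jaccard_mat <- matrix(c(  NA, 0.60, 0.05,
                        0.60,   NA, 0.20,
                        0.05, 0.20,   NA),
                      nrow = 3, dimnames = list(set_ids, set_ids))

# One named sublist per distance metric.
distances <- list(
  lf      = list(matrix = lf_mat,      optimal_extreme = "min"),
  jaccard = list(matrix = jaccard_mat, optimal_extreme = "max")
)

# Access a sublist by metric name, as in $distances[[DIST]].
DIST <- "jaccard"
distances[[DIST]]$optimal_extreme
distances[[DIST]]$matrix["GS1", "GS2"]
```

Keying each sublist by the metric name means a new metric can be added alongside the existing ones without disturbing them.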
$GSNA_version
A character vector of length 1 indicating the version of GSNA used to generate this GSNData object.
$genePresenceAbsence
A sparse logical Matrix containing presence (TRUE) or absence (FALSE) calls for genes (rows) in gene sets (columns).
$distances
A named list(). Names indicate a distance metric ('lf', 'jaccard', etc.), indicated as DIST below.
$distances[[DIST]]$matrix
A matrix of raw distances
$distances[[DIST]]$optimal_extreme
Significance order, where "min" indicates that low values are optimal/closer than high values, as with log Fisher (lf), and "max" indicates that high values are closer, as with Jaccard (jaccard) distance.
$distances[[DIST]]$pared_optimal_extreme
Significance order for the pared, scaled distance matrix. This may differ from $distances[[DIST]]$optimal_extreme if scaling flips high distance values to low ones, as may be necessary for handling distance matrices such as the Jaccard, for which higher values are closer. (See $distances[[DIST]]$optimal_extreme, above.)
$distances[[DIST]]$pared
A pared distance matrix.
$distances[[DIST]]$edges
A data.frame containing a gathered set of network edges derived from $distances[[DIST]]$pared
$distances[[DIST]]$vertices
A complete list of gene set IDs in the network.
$default_distance
The default distance used for network construction.
$ordered_genes
A character vector containing the ordered list of genes in the data set (most important first). This list is also used as the background of observable genes for creating the filteredGeneSetCollection.
$filteredGeneSetCollection
A filtered set of gene lists (a list of character vectors) containing only the genes present in the differential expression data set. This is the 'background' of all genes observable in the differential expression data.
$pathways
A named list containing pathways results data, as follows:
$pathways$data
A data.frame containing a pathways results set.
$pathways$type
A character vector of length=1 indicating the type of pathways analysis performed, e.g. CERNO, GSEA, ORA.
$pathways$id_col
Indicates the name of the column in $pathways$data that contains the gene set ID.
$pathways$stat_col
A character vector of length 1 indicating the statistic used for assessing significance, generally a p-value.
$pathways$stat_col_2
A character vector of length 1 indicating a second statistic used for assessing significance, generally a p-value.
$pathways$sig_order
Indicates whether low or high values of the statistic given by $pathways$stat_col are most significant, with "loToHi" indicating that low values are optimal/most significant (as with typical p-values) and "hiToLo" indicating that high values are optimal/most significant.
$pathways$sig_order_2
Indicates whether low or high values of the statistic given by $pathways$stat_col_2 are most significant, with "loToHi" indicating that low values are optimal/most significant (as with typical p-values) and "hiToLo" indicating that high values are optimal/most significant.
$pathways$n_col
Indicates the name of the pathways column used to indicate effective gene set size, based on genes actually observable in an experimental data set.
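As a rough illustration of how the gene-level and pathways fields above relate to one another, the following base R sketch builds plain-list and data.frame stand-ins for them. All gene names, gene set IDs, column names and values are hypothetical. The presence/absence matrix here is an ordinary dense logical matrix rather than a sparse Matrix, and treating $pathways$stat_col as a column name in $pathways$data is an assumption.

```r
# Hypothetical inputs (names and values invented for illustration).
ordered_genes <- c("TP53", "MYC", "EGFR", "KRAS")   # most important first
gene_sets <- list(
  GS1 = c("TP53", "MYC", "BRCA1"),
  GS2 = c("EGFR", "KRAS", "MYC"),
  GS3 = c("BRCA2")
)

# Stand-in for $filteredGeneSetCollection: keep only genes that are
# observable in the data set (the background given by ordered_genes).
filtered <- lapply(gene_sets, intersect, y = ordered_genes)

# Stand-in for $genePresenceAbsence: genes in rows, gene sets in columns.
# (Dense logical matrix here; the real field is described as sparse.)
presence <- sapply(filtered, function(gs) ordered_genes %in% gs)
rownames(presence) <- ordered_genes

# Stand-in for $pathways; stat_col is ASSUMED to name a column of $data.
pathways <- list(
  data = data.frame(ID      = names(filtered),
                    p.value = c(0.03, 0.001, 0.5),
                    N       = unname(lengths(filtered)),
                    stringsAsFactors = FALSE),
  type      = "ORA",
  id_col    = "ID",
  stat_col  = "p.value",
  sig_order = "loToHi",
  n_col     = "N"
)

# Rank gene sets from most to least significant according to sig_order.
stat   <- pathways$data[[pathways$stat_col]]
ranked <- pathways$data[order(stat, decreasing = pathways$sig_order == "hiToLo"), ]
ranked[[pathways$id_col]]
```

Storing column names such as id_col and n_col alongside the data lets the same code handle results tables from different pathways tools without renaming their columns.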
This method is called by buildGeneSetNetworkLFFFast(), buildGeneSetNetworkLFFast() and buildGeneSetNetworkSTLF().
For most users there will be little reason to call this method except when trying to implement support for new distance metrics or
utility functions.
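For readers who are considering a new distance metric, the sketch below shows one way the entries of a $distances[[DIST]] sublist could be filled in using base R only. It is an illustration under stated assumptions, not the package's method. The overlap coefficient, the 1 - x flip, the threshold used as a stand-in for paring, and the edge column names M1, M2 and Stat are all hypothetical choices made for this example.

```r
# Hypothetical presence/absence matrix: genes in rows, gene sets in columns.
presence <- matrix(c(TRUE,  TRUE,  FALSE, FALSE,   # GS1
                     FALSE, TRUE,  TRUE,  TRUE,    # GS2
                     TRUE,  TRUE,  TRUE,  FALSE),  # GS3
                   nrow = 4,
                   dimnames = list(c("g1", "g2", "g3", "g4"),
                                   c("GS1", "GS2", "GS3")))

# Overlap coefficient |A n B| / min(|A|, |B|); higher values are closer.
counts  <- crossprod(presence * 1)        # pairwise intersection sizes
sizes   <- diag(counts)                   # gene set sizes
overlap <- counts / outer(sizes, sizes, pmin)
diag(overlap) <- NA

# Flip so that low values are closer (illustrative scaling only).
scaled <- 1 - overlap

# Stand-in for paring: drop weak links above an arbitrary threshold.
pared <- scaled
pared[!is.na(pared) & pared > 0.4] <- NA

# Gather the remaining upper-triangle entries into an edge table.
idx <- which(upper.tri(pared) & !is.na(pared), arr.ind = TRUE)
edges <- data.frame(M1   = rownames(pared)[idx[, "row"]],
                    M2   = colnames(pared)[idx[, "col"]],
                    Stat = pared[idx],
                    stringsAsFactors = FALSE)

# Plain-list stand-in for a new $distances[["overlap"]] sublist.
new_metric <- list(matrix                = overlap,
                   optimal_extreme       = "max",
                   pared                 = pared,
                   pared_optimal_extreme = "min",
                   edges                 = edges,
                   vertices              = colnames(presence))
```

Because the raw matrix treats high values as closer while the pared matrix has been flipped, the two significance orders differ, which is the situation $distances[[DIST]]$pared_optimal_extreme is described as covering.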
library(GSNA)
gsn_obj <- GSNData()