SISPA: SISPA

Description

SISPA: Method for Sample Integrated Gene Set Analysis

Usage

SISPA(feature=1,f1.df,f1.genes,f1.profile,f2.df,f2.genes,f2.profile,cpt_data="var",cpt_method="BinSeg",cpt_max=60)

Arguments

feature

Number of input features (data types)

f1.df

A data matrix for the first feature (e.g., gene or probe expression values) where rows correspond to genes and columns correspond to samples

f1.genes

Gene set for the first feature, provided as a vector or data frame

f1.profile

A flag specifying the gene profile for the first feature. If f1.profile="up", samples with increased z-scores are identified. If f1.profile="down", samples with decreased z-scores are identified. Default is "up".

f2.df

A data matrix for the second feature (e.g., gene variant change) where rows correspond to genes and columns correspond to samples

f2.genes

Gene set for the second feature, provided as a vector or data frame

f2.profile

A flag specifying the gene profile for the second feature. If f2.profile="up", samples with increased z-scores are identified. If f2.profile="down", samples with decreased z-scores are identified. Default is "up".

cpt_data

Statistic used to identify changepoints in the data: variance (cpt.var) or mean (cpt.mean). Default is "var", which uses cpt.var.

cpt_method

Choice of single or multiple changepoint model. Default is "BinSeg". See the changepoint R package for details.

cpt_max

The maximum number of changepoints to search for when using the "BinSeg" method. Default is 60.

Value

The input molecular data frame with added sample identifiers and estimated changepoints. A plot showing the changepoint locations estimated on the data. A PDF of bar plots illustrating the distinct distributions of samples with and without profile activity.

Details

Sample Integrated Gene Set Analysis (SISPA) is a method designed to define sample groups with similar gene set enrichment profiles. The user specifies a gene list of interest and sample-by-gene molecular data (expression, methylation, variant, or copy change data) to obtain a gene set enrichment score for each sample. The sample scores are then rank-ordered by the desired profile (e.g., upregulated or downregulated). A changepoint model is applied to the ranked scores to identify groups of samples that show similar gene set profile patterns, separating samples with profile activity from samples without it. Figure 1 shows a schematic overview of the SISPA method.
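
The steps described above can be illustrated with a minimal base R sketch. It is not the SISPA implementation: the scoring (summed per-gene z-scores) and the single-split search are simplified stand-ins chosen for illustration, and the names gene_set, split_cost, k_best and groups are hypothetical.

```r
## Illustrative sketch only; this does NOT reproduce SISPA's internals.
set.seed(1)
g <- 10  ## number of genes
s <- 60  ## number of samples
expr <- matrix(sample.int(10, size = g * s, replace = TRUE), nrow = g, ncol = s,
               dimnames = list(paste0("g", 1:g), paste0("s", 1:s)))
gene_set <- paste0("g", 1:6)  ## hypothetical gene list of interest

## Step 1 (assumed scoring): z-score each gene across samples,
## then sum the z-scores of the gene set within each sample
z <- t(scale(t(expr[gene_set, , drop = FALSE])))
score <- colSums(z)

## Step 2: rank samples for the "up" profile (highest score first)
ranked <- sort(score, decreasing = TRUE)

## Step 3 (stand-in for the changepoint model): choose the single split
## that minimises the within-segment sum of squares of the ranked scores
split_cost <- function(x, k) {
  left <- x[seq_len(k)]
  right <- x[(k + 1):length(x)]
  sum((left - mean(left))^2) + sum((right - mean(right))^2)
}
candidates <- seq_len(length(ranked) - 1)
costs <- vapply(candidates, function(k) split_cost(ranked, k), numeric(1))
k_best <- candidates[which.min(costs)]

## Samples ranked above the split are treated as having profile activity
groups <- data.frame(sample = names(ranked), score = unname(ranked),
                     active = seq_along(ranked) <= k_best)
head(groups)
```

For the "down" profile, the same sketch would sort with decreasing = FALSE so that the samples with the lowest scores come first.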

Examples

library(SISPA)
set.seed(1) ## make the random example data reproducible
g <- 10 ## number of genes
s <- 60 ## number of samples
## sample data matrix with integer values ranging from 1 to 10
rnames <- paste("g", 1:g, sep="")
cnames <- paste("s", 1:s, sep="")
expr <- matrix(sample.int(10, size = g*s, replace = TRUE), nrow=g, ncol=s, dimnames=list(rnames, cnames))
## genes of interest (first six genes), as a one-column data frame
genes <- data.frame(paste("g", 1:6, sep=""))
## single-feature analysis identifying samples with increased z-scores
SISPA(feature=1,f1.df=expr,f1.genes=genes,f1.profile="up")