ewce_expression_data: Bootstrap celltype enrichment test for transcriptome data

Description

ewce_expression_data takes a differential expression table and estimates the probability of cell-type enrichment in the up- and down-regulated genes.

Usage

ewce_expression_data(sct_data, tt, sortBy = "t", thresh = 250, reps = 100, sub = FALSE, useHGNC = TRUE)

Arguments

sct_data

List generated using read_celltype_data

tt

Differential expression table. Can be the output of the limma::topTable function. The minimum requirement is that one column stores a metric of increased/decreased expression (e.g. log fold change or a t-statistic for differential expression) and another contains either HGNC or MGI symbols.

sortBy

Name of the column in tt holding the metric used to separate up- from down-regulated genes. Default="t"

thresh

The number of up- and down-regulated genes to be included in each analysis. Default=250

reps

Number of random gene lists to generate (default=100 but should be over 10000 for publication quality results)

sub

A logical indicating whether to analyse sub-cell type annotations (TRUE) or cell-type annotations (FALSE). Default is FALSE.

useHGNC

A logical indicating whether HGNC (TRUE) or MGI (FALSE) gene symbols are provided. Default=TRUE
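
The roles of sortBy and thresh can be illustrated with a minimal base R sketch. It is not the package's internal code: the toy table, the column name gene_symbol and the variable names are hypothetical, and it assumes the up- and down-regulated sets are simply the thresh rows with the highest and lowest values of the sortBy column.

```r
# Toy differential expression table (hypothetical symbols, random t-statistics)
set.seed(1)
tt_toy <- data.frame(
  gene_symbol = paste0("GENE", 1:20),  # hypothetical column name
  t = rnorm(20),
  stringsAsFactors = FALSE
)

sortBy <- "t"  # column holding the differential expression metric
thresh <- 5    # number of genes taken from each end of the ranking

# Order rows from most down-regulated to most up-regulated
tt_sorted <- tt_toy[order(tt_toy[[sortBy]]), ]

# Assumed selection rule: lowest `thresh` values are the down-regulated set,
# highest `thresh` values are the up-regulated set
down_genes <- head(tt_sorted$gene_symbol, thresh)
up_genes   <- tail(tt_sorted$gene_symbol, thresh)
```

With thresh well below half the number of rows, the two sets cannot overlap; each set would then be tested for enrichment separately.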

Value

A list containing five elements:

results: data frame in which each row gives the statistics (p-value, fold change and number of standard deviations from the mean) associated with the enrichment of the stated cell type in the gene list. An additional column *Direction* stores whether the result is from the up- or down-regulated set.
hit.cells.up: vector containing the summed proportion of expression in each cell type for the up-regulated target list
hit.cells.down: vector containing the summed proportion of expression in each cell type for the down-regulated target list
bootstrap_data.up: matrix in which each row represents the summed proportion of expression in each cell type for one of the random lists generated for the up-regulated analysis
bootstrap_data.down: matrix in which each row represents the summed proportion of expression in each cell type for one of the random lists generated for the down-regulated analysis
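
To show how the statistics in results relate to the hit.cells and bootstrap_data elements, here is a hedged sketch on simulated data. The exact formulas used by the package are not stated here, so this assumes an empirical p-value (the fraction of random lists scoring at least as high as the target list), a fold change relative to the bootstrap mean, and a z-score style distance. The cell type names, numbers and column names are illustrative only.

```r
set.seed(42)
reps <- 1000
cell_types <- c("astrocyte", "microglia", "neuron")  # illustrative names

# Simulated stand-in for bootstrap_data.up: one row per random gene list
bootstrap_data <- matrix(rnorm(reps * length(cell_types), mean = 10, sd = 1),
                         nrow = reps, dimnames = list(NULL, cell_types))

# Simulated stand-in for hit.cells.up: one value per cell type
hit_cells <- c(astrocyte = 10.1, microglia = 13.5, neuron = 9.2)

boot_mean <- colMeans(bootstrap_data)
boot_sd   <- apply(bootstrap_data, 2, sd)

stats <- data.frame(
  CellType = cell_types,
  # Assumed empirical p-value: share of random lists at least as extreme
  p = sapply(cell_types, function(ct) mean(bootstrap_data[, ct] >= hit_cells[ct])),
  # Target value relative to the bootstrap mean
  fold_change = hit_cells / boot_mean,
  # Distance from the bootstrap mean in standard deviations
  sd_from_mean = (hit_cells - boot_mean) / boot_sd,
  row.names = NULL
)
print(stats)
```

Under this assumption the smallest non-zero p-value is 1/reps, which is one reason a much larger reps is recommended for publication-quality results.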

Examples

# Load the single cell data
data(celltype_data)

# Set the parameters for the analysis
reps=100 # <- Use 100 bootstrap lists so it runs quickly, for publishable analysis use >10000
subCellStatus=0 # <- 0 uses cell-type annotations; set to 1 to use subcell level annotations (i.e. Interneuron type 3)
if(subCellStatus==1){subCellStatus=TRUE;cellTag="SubCells"}
if(subCellStatus==0){subCellStatus=FALSE;cellTag="FullCells"}

# Load the example differential expression table
data("tt_alzh")

# Bootstrap significance testing, without controlling for transcript length and GC content
tt_results = ewce_expression_data(sct_data=celltype_data,tt=tt_alzh,reps=reps,sub=subCellStatus)
