EWCE (version 1.0.2)

bootstrap.enrichment.test: Bootstrap celltype enrichment test

Description

bootstrap.enrichment.test takes a gene list and a single-cell transcriptome dataset and estimates, for each cell type, the probability of enrichment and the fold change.

Usage

bootstrap.enrichment.test(sct_data = NA, mouse.hits = NA, mouse.bg = NA, human.hits = NA, human.bg = NA, reps = 100, sub = FALSE, geneSizeControl = FALSE)

Arguments

sct_data
List generated using read_celltype_data
mouse.hits
Array of MGI gene symbols containing the target gene list. Not required if geneSizeControl=TRUE
mouse.bg
Array of MGI gene symbols containing the background gene list. Not required if geneSizeControl=TRUE
human.hits
Array of HGNC gene symbols containing the target gene list. Not required if geneSizeControl=FALSE
human.bg
Array of HGNC gene symbols containing the background gene list. Not required if geneSizeControl=FALSE
reps
Number of random gene lists to generate (default=100 but should be over 10000 for publication quality results)
sub
a logical indicating whether to analyse sub-cell type annotations (TRUE) or cell-type annotations (FALSE). Default is FALSE.
geneSizeControl
a logical indicating whether you want to control for GC content and transcript length. Recommended if the gene list originates from genetic studies. Default is FALSE. If set to TRUE then human gene lists should be used rather than mouse.

Value

A list containing three elements:
  • results: data frame in which each row gives the statistics (p-value, fold change and number of standard deviations from the mean) associated with the enrichment of the stated cell type in the gene list
  • hit.cells: vector containing the summed proportion of expression in each cell type for the target list
  • bootstrap_data: matrix in which each row represents the summed proportion of expression in each cell type for one of the random lists
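
The relationship between these three elements can be illustrated with a minimal base-R sketch. It uses simulated numbers rather than EWCE output, and it is only an assumption about how such statistics are typically derived from a bootstrap distribution, not necessarily the package's exact computation. The names hit_cells, boot_mean, boot_sd and the three cell-type labels are hypothetical.

```r
# Simulated inputs (hypothetical values, NOT produced by EWCE)
set.seed(1)
cell_types <- c("astrocyte", "interneuron", "microglia")
reps <- 1000

# Stand-in for bootstrap_data: one row per random gene list,
# one column per cell type
bootstrap_data <- matrix(
  rnorm(reps * length(cell_types), mean = 10, sd = 1),
  nrow = reps,
  dimnames = list(NULL, cell_types)
)

# Stand-in for hit.cells: summed proportion of expression per cell type
# for the target gene list
hit_cells <- c(astrocyte = 10.1, interneuron = 14, microglia = 9.5)

# Empirical p-value: fraction of random lists scoring at least as high
# as the target list (assumed definition)
p <- sapply(cell_types, function(ct) {
  mean(bootstrap_data[, ct] >= hit_cells[[ct]])
})

boot_mean <- colMeans(bootstrap_data)
boot_sd <- apply(bootstrap_data, 2, sd)

results <- data.frame(
  CellType = cell_types,
  p = p,
  fold_change = hit_cells[cell_types] / boot_mean,
  sd_from_mean = (hit_cells[cell_types] - boot_mean) / boot_sd,
  row.names = NULL
)
print(results)
```

Under this definition the smallest non-zero p-value that can be observed is 1/reps, which is consistent with the advice to use more than 10000 random lists for publication-quality results.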

Examples

# Load the single cell data
data(celltype_data)

# Set the parameters for the analysis
reps=100 # <- Use 100 bootstrap lists so it runs quickly, for publishable analysis use >10000
subCellStatus=0 # <- 0 = cell-type level annotations; set to 1 for sub-cell-type level (i.e. Interneuron type 3)
if(subCellStatus==1){subCellStatus=TRUE;cellTag="SubCells"}
if(subCellStatus==0){subCellStatus=FALSE;cellTag="FullCells"}

# Load the gene list and get human orthologs
data("example_genelist")
data("mouse_to_human_homologs")
m2h = unique(mouse_to_human_homologs[,c("HGNC.symbol","MGI.symbol")])
mouse.hits = unique(m2h[m2h$HGNC.symbol %in% example_genelist,"MGI.symbol"])
human.hits = unique(m2h[m2h$HGNC.symbol %in% example_genelist,"HGNC.symbol"])
human.bg = unique(setdiff(m2h$HGNC.symbol,human.hits))
mouse.bg  = unique(setdiff(m2h$MGI.symbol,mouse.hits))

# Bootstrap significance testing, without controlling for transcript length and GC content
full_results = bootstrap.enrichment.test(sct_data=celltype_data,mouse.hits=mouse.hits,
  mouse.bg=mouse.bg,reps=reps,sub=subCellStatus)

# Bootstrap significance testing controlling for transcript length and GC content
full_results = bootstrap.enrichment.test(sct_data=celltype_data,human.hits=human.hits,
  human.bg=human.bg,reps=reps,sub=subCellStatus,geneSizeControl=TRUE)
