- count_file
Normalized (e.g. TPM, RPKM, CPM) RNA-seq count matrix where rows are gene symbols and columns are individuals. The input should be a data.frame or matrix. A character string giving the path to a tsv file from which this data can be loaded is also acceptable. Gene identifiers in the count file, signature matrix, and DEG list should all match (case sensitive, same identifier type, e.g. gene symbol or Ensembl).
- signature_matrix
Signature matrix: a gene by cell-type matrix populated with the fold-change of gene expression in cell-type "i" vs all other cell-types. The object should be a data.frame or matrix.
- DEG_list
An object whose first column is the gene symbols within the bulk dataset (they do not have to be in the signature matrix), whose second column is the adjusted p-value, and whose third column is the log2FC. A path to a .tsv file containing this information is also acceptable.
- case_grep
A character string that designates the "cases" (i.e. upregulated is 'case' biased) in the column names of the count file, such as a tag shared by the case samples. A numeric vector of the column indices of the cases is also acceptable.
- control_grep
A character string that designates the "controls" (i.e. downregulated is 'control' biased) in the column names of the count file, such as a tag shared by the control samples. A numeric vector of the column indices of the controls is also acceptable.
- rda_path
If downloaded, path to where data from scMappR_data is stored.
- max_proportion_change
Maximum cell-type proportion change. This may be useful if there are many rare cell-types. Alternatively, if a cell-type is present in one condition but not the other, it prevents possible infinite or 0 cwFold-changes.
- print_plots
Whether to print boxplots of the estimated cell-type (CT) proportions from the leave-one-out method of CT deconvolution. The same plot names are used for the top-pathway plots.
- plot_names
The prefix of plot pdf files.
- theSpecies
human, mouse, or a species directly compatible with gProfileR (i.e. g:Profiler).
- output_directory
The name of the directory that will contain output of the analysis.
- sig_matrix_size
Maximum number of genes in signature matrix for cell-type deconvolution.
- drop_unknown_celltype
Whether or not to remove "unknown" cell-types from the signature matrix.
- internet
Whether you have a stable internet connection (T/F).
- up_and_downregulated
Whether you are additionally splitting up/downregulated genes (T/F).
- gene_label_size
The size of the gene label on the plot.
- number_genes
The number of genes to use as the cut-off for pathway analysis (useful when there are many DEGs).
- toSave
Allow scMappR to write files in the current directory (T/F).
- newGprofiler
Whether to use gprofiler2 (TRUE) instead of gProfileR (FALSE).
- path
If toSave == TRUE, path to the directory where files will be saved.
- deconMethod
Which RNA-seq deconvolution method to use to estimate cell-type proportions. Options are "WGCNA", "DCQ", or "DeconRNAseq".
- rareCT_filter
Option to keep cell-types rarer than 0.1 percent of the population (T/F). Setting to FALSE may lead to false positives.
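
The input shapes described for count_file, signature_matrix, DEG_list, case_grep, and control_grep can be sketched in base R as follows. This is a minimal illustration with made-up values, not output from a real analysis: the gene names, sample names, cell-type names, and the DEG_list column names (gene_name, padj, log2fc) are assumptions chosen for the example; only the column order is taken from the descriptions above. The call to the analysis function itself is deliberately not shown.

```r
# Toy inputs with made-up values; all names below are hypothetical.
set.seed(1)
genes <- c("Gene1", "Gene2", "Gene3", "Gene4")
samples <- c("case_1", "case_2", "control_1", "control_2")

# count_file: normalized expression, genes in rows, individuals in columns.
count_file <- matrix(runif(16, min = 0, max = 100), nrow = 4,
                     dimnames = list(genes, samples))

# signature_matrix: genes in rows, cell-types in columns.
signature_matrix <- matrix(runif(8, min = 0, max = 10), nrow = 4,
                           dimnames = list(genes, c("CellTypeA", "CellTypeB")))

# DEG_list: gene symbol, adjusted p-value, log2 fold-change, in that order.
# The column names here are an assumption made for illustration.
DEG_list <- data.frame(gene_name = c("Gene1", "Gene3"),
                       padj = c(0.001, 0.02),
                       log2fc = c(1.5, -2.0),
                       stringsAsFactors = FALSE)

# case_grep / control_grep given as tags matched against the column names.
case_grep <- "case"
control_grep <- "control"
case_idx <- grep(case_grep, colnames(count_file))
control_idx <- grep(control_grep, colnames(count_file))

# Identifiers must match exactly (case sensitive) across the inputs.
stopifnot(all(DEG_list$gene_name %in% rownames(count_file)))
shared_sig_genes <- intersect(rownames(signature_matrix), rownames(count_file))
```

Checking the tags with grep before running the analysis makes it easy to confirm that each tag selects only the intended columns. The resulting index vectors (case_idx, control_idx) are in the numeric form that case_grep and control_grep also accept.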