Calculate the number of UMIs, the number of detected features, and the percentage of expression from feature subsets (e.g. mito, ribo and hemo) per cell.
runGeneralQC(
object,
organism,
features = NULL,
pattern = NULL,
overwrite = FALSE,
useDatasets = NULL,
chunkSize = getOption("ligerChunkSize", 20000),
verbose = getOption("ligerVerbose", TRUE),
mito = NULL,
ribo = NULL,
hemo = NULL
)
Updated object with the cellMeta(object) updated as intended by users. See Details for more information.
object - A liger object with rawData available in each embedded ligerDataset.
organism - Specify the organism of the dataset to identify the mitochondrial, ribosomal and hemoglobin genes. Available options are "mouse", "human", "zebrafish", "rat" and "drosophila". Set NULL to disable mito, ribo and hemo calculation.
features - Feature names matching the feature subsets that users want to calculate the expression percentage with. A vector for a single subset, or a named list for multiple subsets. Default NULL.
pattern - Regex patterns for matching the feature subsets that users want to calculate the expression percentage with. A vector for a single subset, or a named list for multiple subsets. Default NULL.
overwrite - Whether to overwrite existing QC metric variables. The default FALSE does not update existing results. Use TRUE to update all of them, or a character vector to specify which ones to update. See Details.
useDatasets - A character vector of the names, or a numeric or logical vector of the indices, of the datasets to be included for QC. The default NULL performs QC on all datasets.
chunkSize - Integer number of cells to include in a chunk when working on HDF5-based datasets. Default getOption("ligerChunkSize"), or 20000 if users have not set it.
verbose - Logical. Whether to show progress information. Default getOption("ligerVerbose"), or TRUE if users have not set it.
mito, ribo, hemo - These arguments are ignored. The percentages of mitochondrial, ribosomal and hemoglobin gene counts are now always computed.
This function by default calculates:
nUMI - The column sum of the raw data matrix per cell. Represents the total number of UMIs per cell if given raw counts.
nGene - Number of detected features per cell
mito - Percentage of mitochondrial gene expression per cell
ribo - Percentage of ribosomal gene expression per cell
hemo - Percentage of hemoglobin gene expression per cell
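To make these definitions concrete, the following base R sketch computes the same kinds of metrics on a small made-up count matrix. It is not the package's internal implementation. The toy matrix, the "^MT-" pattern used to pick mitochondrial genes, and the 0 to 100 percentage scale are assumptions for illustration only.

```r
# Toy raw count matrix: genes in rows, cells in columns (values are made up).
counts <- matrix(
  c(5, 0, 2,
    3, 1, 0,
    0, 4, 6,
    2, 5, 2),
  nrow = 4, byrow = TRUE,
  dimnames = list(
    c("MT-CO1", "MT-ND1", "RPS3", "CD3E"),
    c("cell1", "cell2", "cell3")
  )
)

# nUMI: column sum of the raw matrix, i.e. total counts per cell
nUMI <- colSums(counts)

# nGene: number of features with a non-zero count in each cell
nGene <- colSums(counts > 0)

# Percentage of counts that come from a feature subset.
# Assumption: mitochondrial genes are selected here with the pattern "^MT-".
mitoGenes <- grep("^MT-", rownames(counts), value = TRUE)
mito <- 100 * colSums(counts[mitoGenes, , drop = FALSE]) / nUMI

qc <- data.frame(nUMI = nUMI, nGene = nGene, mito = mito)
print(qc)
```

The same subset-over-total ratio applies to any other feature subset, such as ribosomal or hemoglobin genes.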
Users can also specify their own feature subsets with argument features, or regular expression patterns that match genes of interest with argument pattern, to calculate the expression percentage. If a character vector is given to features, a QC metric variable named "featureSubset_name" will be computed. If a named list of multiple subsets is given, the names will be used as the variable names. If a single pattern is given to pattern, a QC metric variable named "featureSubset_pattern" will be computed. If a named list of multiple patterns is given, the names will be used as the variable names. Duplicated QC metric names between these two arguments and the default five listed above should be avoided.
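As a hedged illustration of the two arguments, the call below passes one named list to features and one to pattern. It assumes the pbmc example object from this page's example, and that rawData(pbmc) returns a named list of matrices with gene names as row names. The subset names "firstFive" and "hsp", the arbitrary choice of the first five genes, and the "^HSP" pattern are placeholders rather than recommended settings; behavior when a pattern matches no gene is not covered here.

```r
library(rliger)

# Take a few gene names from the first dataset so the subset is guaranteed
# to exist in the object (arbitrary placeholder subset, for illustration).
genes <- rownames(rawData(pbmc)[[1]])
firstFive <- head(genes, 5)

pbmc <- runGeneralQC(
  pbmc, "human",
  # Named list: the variable is named after the list element, "firstFive"
  features = list(firstFive = firstFive),
  # Named list: the variable is named after the list element, "hsp"
  pattern = list(hsp = "^HSP")
)

# Inspect the per-cell metadata, where the QC variables are stored
head(cellMeta(pbmc))
```

Passing named lists keeps the resulting variable names explicit, which makes it easier to avoid clashes with the five default metrics.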
This function runs automatically when each liger object is created, to capture its raw state. Argument overwrite is set to FALSE by default to avoid mistakenly updating existing metrics after filtering the object. Users can still opt to update all newly calculated metrics (including the default five) by setting overwrite = TRUE, or only some of the newly calculated ones by providing a character vector of the names of the metrics to update. Intended overwriting only happens to datasets selected with useDatasets.
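A selective update might look like the sketch below, which asks for only the mito variable to be overwritten and restricts the run to the first dataset by numeric index. It assumes the pbmc example object from this page's example; the choice of "mito" and of dataset index 1 is purely illustrative.

```r
library(rliger)

# Recompute QC, but overwrite only the existing "mito" variable,
# and only for the first dataset in the object (selected by index).
pbmc <- runGeneralQC(
  pbmc, "human",
  overwrite = "mito",
  useDatasets = 1
)
```

Other existing metrics, and cells from datasets not selected, are expected to keep their previous values under these settings.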
pbmc <- runGeneralQC(pbmc, "human", overwrite = TRUE)