This function returns the clonotypes that are shared between clusters, where
clusters are defined by the levels of the active cell identities or by a
custom identity column given in alt_ident. A list is returned with its
names being the shared clonotypes, and its values being numeric vectors
indicating the indices of the clusters that each clonotype is found in. Each
index corresponds to a position in the default levels of the factored identities.
If run_id is supplied, the function will attempt to get the shared
clonotypes from the corresponding APackOfTheClones run generated by
RunAPOTC(). Otherwise, it will use the filtering / subsetting parameters
to generate the shared clones.
getSharedClones(
  seurat_obj,
  reduction_base = "umap",
  clonecall = "strict",
  ...,
  extra_filter = NULL,
  alt_ident = NULL,
  run_id = NULL,
  top = NULL,
  top_per_cl = NULL,
  intop = NULL,
  intop_per_cl = NULL,
  publicity = c(2L, Inf)
)
A named list in which each name is a clonotype and each element is a numeric vector indicating which seurat cluster(s) it is in, in no particular order. If no shared clones are present, the output is an empty list.
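As a rough illustration of this return structure, the sketch below builds a stand-in result by hand and maps the cluster indices back to identity levels. The clonotype strings, the `cluster_levels` vector, and the `shared` list are all made up for illustration and are not output from the package.

```r
# Hypothetical identity levels, standing in for the levels of the
# factored identities of a seurat object
cluster_levels <- c("0", "1", "2", "3")

# Hand-written stand-in for a getSharedClones() result:
# names are clonotypes, values are indices into cluster_levels
shared <- list(
  "CAVRDSNYQLIW_CASSLGQGAYEQYF" = c(1, 3),
  "CALSEAGGSYIPTF_CASSPTGGYEQYF" = c(2, 3, 4)
)

# Translate each clonotype's indices into cluster labels
shared_labels <- lapply(shared, function(idx) cluster_levels[idx])

# Number of clusters each clonotype appears in
n_clusters <- lengths(shared)
```

Because the indices refer to positions in the levels rather than to the labels themselves, index 1 here maps to the label "0".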
seurat_obj: Seurat object with one or more dimension reductions that has
already been integrated with a TCR/BCR library using
scRepertoire::combineExpression.
reduction_base: character. The seurat reduction to base the clonal
expansion plotting on. Defaults to 'umap' but can be any reduction present
within the reductions slot of the input seurat object, including custom ones.
If 'pca', the cluster coordinates will be based on PC1 and PC2.
However, APackOfTheClones is generally used for displaying UMAP and
occasionally t-SNE versions to intuitively highlight clonal expansion.
clonecall: character. The column name in the seurat object metadata to
use. See the scRepertoire documentation for more information about this
parameter, which is central to both packages.
...: additional "subsetting" keyword arguments, each named after a column in
the seurat object metadata, indicating the rows (cells) that should be
kept. E.g., seurat_clusters = c(1, 9, 10) will filter the cells to
those in the seurat_clusters column with any of the values 1, 9, and 10.
Unfortunately, column names in the seurat object metadata cannot
conflict with the keyword arguments. MAJOR NOTE: if any subsetting
keyword argument is a prefix of any preceding argument name (e.g. a
column named reduction is a prefix of the reduction_base argument),
R will interpret it as the same argument unless both arguments
are named. Additionally, this means any subsequent arguments must be named.
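The prefix issue comes from R's partial argument matching, which applies to formal arguments that appear before `...`. The sketch below reproduces it with a toy function; `toy` is hypothetical and only mimics the argument layout, it is not part of APackOfTheClones.

```r
# Toy function (hypothetical): a named argument before `...` whose name
# has a metadata-style column name ("reduction") as a prefix
toy <- function(obj, reduction_base = "umap", ...) {
  list(reduction_base = reduction_base, dots = list(...))
}

# `reduction` partially matches `reduction_base`, so nothing reaches `...`
res_partial <- toy(NULL, reduction = "pca")

# Naming `reduction_base` explicitly lets `reduction` fall through to `...`
res_named <- toy(NULL, reduction_base = "umap", reduction = "pca")
```

In the first call the intended subsetting argument is silently consumed as `reduction_base`; in the second, exact matching claims `reduction_base` first, so `reduction` is passed along as a keyword argument.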
extra_filter: character. An additional string that should be formatted
exactly like a statement one would pass into dplyr::filter, and that does
additional filtering of cells in the seurat object - on top of the other
keyword arguments - based on the metadata. This means that it will be
logically AND'ed with any keyword argument filters. This is a more flexible
alternative / addition to the filtering keyword arguments. For example, if
one wanted to filter by the length of the amino acid sequence of TCRs, one
could pass in something like extra_filter = "nchar(CTaa) - 1 > 10". When
the condition involves character values, be sure to enclose them in single quotes.
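To make the AND semantics concrete, here is a minimal base R sketch that applies a keyword-style filter and a string filter to a toy metadata table. The `meta` data frame and its values are made up, and evaluating the string with `eval(parse(...))` is only an assumption for illustration, not necessarily how the package applies extra_filter internally.

```r
# Toy metadata standing in for the seurat object metadata (values are made up)
meta <- data.frame(
  seurat_clusters = c(1, 2, 9, 10),
  CTaa = c("CAVRDSNYQLIW_CASSLGQGAYEQYF", "CAVRDSNYQLIW_CASSLGQGAYEQYF",
           "CAV_CAS", "CALSEAGGSYIPTF_CASSPTGGYEQYF"),
  stringsAsFactors = FALSE
)

# Keyword-style filter, equivalent to seurat_clusters = c(1, 9, 10)
keep_keyword <- meta$seurat_clusters %in% c(1, 9, 10)

# String filter, evaluated against the metadata columns
extra_filter <- "nchar(CTaa) - 1 > 10"
keep_extra <- eval(parse(text = extra_filter), envir = meta)

# The two conditions are combined with a logical AND
filtered <- meta[keep_keyword & keep_extra, ]
```

Only rows that pass both conditions survive: the cluster 2 row fails the keyword filter and the short "CAV_CAS" row fails the string filter.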
alt_ident: character. By default, cluster identity is assumed to be
whatever is in Idents(seurat_obj), and clones will be grouped by the active
ident. However, alt_ident can be set to the name of a column in the
metadata of the seurat object to group by instead. This column is meant to
have been produced by Seurat::StashIdent or added manually.
run_id: character. This will be the ID associated with the data of a
run, and will be used by other important functions like APOTCPlot() and
AdjustAPOTC(). Defaults to NULL, in which case the ID will be generated
in the following format:
reduction_base;clonecall;keyword_arguments;extra_filter
where keyword_arguments and extra_filter are underscore characters if
there was no input for the ... and extra_filter parameters.
top: integer or numeric in (0, 1) - if not null, filters the output
clones so that only the shared clonotypes with the top top counts (or, for
numeric input in (0, 1), the top proportion of shared clones) are kept. For cases
where several clonotypes tie in size, which clonotype(s) are added is not
guaranteed, but is deterministic given that the other arguments are identical.
top_per_cl: integer or numeric in (0, 1) - if not null, filters the
output clones so that for each seurat cluster, only the clonotypes with the
top top_per_cl frequency / count are preserved when aggregating shared clones,
in the same way as for top. Note that if supplied in conjunction with
top, the result is the intersection of the clonotypes filtered each way.
For cases where several clonotypes tie in size, which clonotype(s) are added is
not guaranteed, but is deterministic given that the other arguments are identical.
intop: integer or numeric in (0, 1) - if not null, filters the raw
clone sizes before computing the shared clonotypes so that only the
clonotypes that have their overall size in the top intop largest sizes
(if it is an integer, else the top intop proportion) are kept. To emphasize,
this argument does not necessarily return the top shared clones,
and likely returns somewhat fewer, because it filters the raw clone sizes,
and it is very likely that not all of those clones end up being shared.
intop_per_cl: integer or numeric in (0, 1) - if not null, filters
the raw clustered clone sizes before computing shared clones, so that
within each seurat cluster, only the clones in the top intop_per_cl count /
proportion (for numeric in (0, 1) input) are kept.
publicity: numeric pair. A simple filter range of
c(lowerbound, upperbound) to retain only shared clones with their
"publicity" - the number of clusters they are present in - within this
range.
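The publicity filter can be pictured with a small base R sketch on a hand-written result list. The `shared` list and its clone names are hypothetical, and treating both bounds as inclusive is an assumption made for this illustration.

```r
# Hand-written stand-in for a shared-clone list (names and values are made up)
shared <- list(
  cloneA = c(1, 3),
  cloneB = c(2, 3, 4),
  cloneC = c(1, 2, 3, 4)
)

# Keep clones present in 2 to 3 clusters (bounds assumed inclusive)
publicity <- c(2L, 3L)
n_clusters <- lengths(shared)
kept <- shared[n_clusters >= publicity[1] & n_clusters <= publicity[2]]
```

With the default of c(2L, Inf), no upper limit applies, so every clone found in at least two clusters would be retained under the same assumption.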
data("combined_pbmc")

getSharedClones(combined_pbmc)

getSharedClones(
  combined_pbmc,
  orig.ident = c("P17B", "P18B"), # a named subsetting parameter
  clonecall = "aa"
)

# extract shared clones from a past RunAPOTC run
combined_pbmc <- RunAPOTC(
  combined_pbmc, run_id = "foo", verbose = FALSE
)

getSharedClones(
  combined_pbmc, run_id = "foo", top = 5
)

# doing a run and then getting the clones works too
combined_pbmc <- RunAPOTC(combined_pbmc, run_id = "run1", verbose = FALSE)
getSharedClones(combined_pbmc, run_id = "run1")