RunAPOTC: Run the APackOfTheClones method on a combined Seurat object for downstream visualization of clonal expansion

Description

Computes necessary information for an APackOfTheClones clonal expansion plot (APOTCPlot()) and stores it in the seurat object. Gets sizes of unique clones and utilizes a circle-packing algorithm to pack circles representing individual clones in approximately the same dimensional reduction (reduction_base) coordinates based on some cell ident (defaults to the active ident).

The parameter extra_filter along with an unlimited number of additional keyword arguments can be used to filter the cells by certain conditions in the metadata, and new results will be stored in addition to other runs the users may have done.

Each APackOfTheClones run is uniquely identified by the parameters reduction_base, clonecall, extra_filter, and any additional keywords passed to filter the metadata. Each distinct run result is stored in the seurat object and has an associated Id generated from the aforementioned parameters. To view the id of the latest run, call getLastApotcDataId. To view all the ids of previous runs, call getApotcDataIds. To work further with a specific run (most importantly, plotting), the user can use this id in the arguments with is slightly more convenient than passing in the original RunAPOTC parameters again but both ways work.

If the user wishes to manually customize/fix the expansion plot generated, the circular packing information can be modified with the AdjustAPOTC function.

Usage

RunAPOTC(
  seurat_obj,
  reduction_base = "umap",
  clonecall = "strict",
  ...,
  extra_filter = NULL,
  alt_ident = NULL,
  run_id = NULL,
  clone_scale_factor = "auto",
  rad_scale_factor = 0.95,
  order_clones = TRUE,
  try_place = FALSE,
  repulse = TRUE,
  repulsion_threshold = 1,
  repulsion_strength = 1,
  max_repulsion_iter = 20L,
  override = FALSE,
  verbose = TRUE
)

Value

A modified version of the input seurat object, which harbors data necessary for visualizing the clonal expansion of the cells with APOTCPlot()

and has a friendly user interface to modify certain attributes with AdjustAPOTC.

Arguments

seurat_obj

Seurat object with one or more dimension reductions and already have been integrated with a TCR/BCR library with scRepertoire::combineExpression.

reduction_base

character. The seurat reduction to base the clonal expansion plotting on. Defaults to 'umap' but can be any reduction present within the reductions slot of the input seurat object, including custom ones. If `'pca'``, the cluster coordinates will be based on PC1 and PC2. However, generally APackOfTheClones is used for displaying UMAP and occasionally t-SNE versions to intuitively highlight clonal expansion.

clonecall

character. The column name in the seurat object metadata to use. See scRepertoire documentation for more information about this parameter that is central to both packages.

...

additional "subsetting" keyword arguments indicating the rows corresponding to elements in the seurat object metadata that should be filtered by. E.g., seurat_clusters = c(1, 9, 10) will filter the cells to those in the seurat_clusters column with any of the values 1, 9, and 10. Unfortunately, column names in the seurat object metadata cannot conflict with the keyword arguments. MAJOR NOTE if any subsetting keyword arguments are a prefix of any preceding argument names (e.g. a column named reduction is a prefix of the reduction_base argument) R will interpret it as the same argument unless both arguments are named. Additionally, this means any subsequent arguments must be named.

extra_filter

character. An additional string that should be formatted exactly like a statement one would pass into dplyr::filter that does additional filtering to cells in the seurat object - on top of the other keyword arguments - based on the metadata. This means that it will be logically AND'ed with any keyword argument filters. This is a more flexible alternative / addition to the filtering keyword arguments. For example, if one wanted to filter by the length of the amino acid sequence of TCRs, one could pass in something like extra_filter = "nchar(CTaa) - 1 > 10". When involving characters, ensure to enclose with single quotes.

alt_ident

character. By default, cluster identity is assumed to be whatever is in Idents(seurat_obj), and clones will be grouped by the active ident. However, alt_ident could be set as the name of some column in the meta data of the seurat object to be grouped by. This column is meant to have been a product of Seurat::StashIdent or manually added.

run_id

character. This will be the ID associated with the data of a run, and will be used by other important functions like APOTCPlot() and AdjustAPOTC. Defaults to NULL, in which case the ID will be generated in the following format:

reduction_base;clonecall;keyword_arguments;extra_filter

where if keyword arguments and extra_filter are underscore characters if there was no input for the ... and extra_filter parameters.

clone_scale_factor

Dictates how much to scale each circle(between 0,1) radius when converting from clonotype counts into circles that represent individual clonotypes. The argument defaults to the character "auto", and if so, the most visually pleasing factor will be estimated.

rad_scale_factor

numeric between 0 and 1. This value decreases the radius of the smallest clones by this scale factor. And the absolute value of this decrease will be applied to all packed circles, effectively shrinking all circles on the spot, and introduce more constant spacing in between.

order_clones

logical. Decides if the largest clone circles should be near cluster centroids. This is highly recommended to be set to TRUE for increased intuitiveness of the visualization, as resulting plots tend to give an improved impression of the proportion of expanded clones. If FALSE, will randomly scramble the positions of each circle. For the sake of being replicable, a random seed is recommended to be set with set.seed.

try_place

If TRUE, always minimizes distance from a newly placed circle to the origin in the circle packing algorithm.

repulse

If TRUE, will attempt to push overlapping clusters away from each other.

repulsion_threshold

numeric. The radius that clonal circle clusters overlap is acceptable when repulsing.

repulsion_strength

numeric. The smaller the value the less the clusters repulse each other per iteration, and vice versa.

max_repulsion_iter

integer. The number of repulsion iterations.

override

logical. If TRUE, will override any existing APackOfTheClones run data with the same run_id.

verbose

logical. Decides if visual cues are displayed to the R console of the progress.

Cluster labelling

For the ident that was used to cluster the clones, labels for each cluster are inferred and stored in the run so that they can be used by other functions and optionally overlaid on the plot over clusters. If the levels of the ident used is a naturally ordered integer sequence, then the labels generated would be "C1", "C2", "C3" ... , else they would be the actual ident levels themselves.

Details

Note that the subsetting arguments ... and extra_filter are only a quick convenience to subset based on metadata, and the subset S3 method defined in Seurat is much more mature are has more features. Additionally, users need to work with data subsets are recommended to and likely already are working with seurat objects subsetted/split with Seurat::SplitObject.

All APackOfTheClones run data is stored in the Seurat object under seurat_object@misc$APackOfTheClones, which is a list of S4 objects of the type "ApotcData", with each element corresponding to a unique run. The id of each run is the name of each element in the list. The user really shouldn't manually modify anything in the list as it may cause unexpected behavior with many other functions.

Additionally, it logs a seurat command associated with the run in the seurat_object@commands slot as a "SeuratCommand" object (from Seurat), where the name of the object in the list is formatted as RunAPOTC.run_id.

Examples

Run this code

data("combined_pbmc")

# this is the recommended approach to use a custom run_id with default params
combined_pbmc <- RunAPOTC(combined_pbmc, run_id = "default", verbose = FALSE)

# here's a seperate run with some filters to the meta data, where
# `orig.ident` is a custom column in the example data. Notice that it is not
# a `RunAPOTC()` parameter but a user keyword argument
combined_pbmc <- RunAPOTC(
    combined_pbmc, run_id = "sample17", orig.ident = c("P17B", "P17L"),
    verbose = FALSE
)

# the exact same thing can be achieved with the `extra_filter` parameter
combined_pbmc <- RunAPOTC(
    combined_pbmc,
    run_id = "sample17",
    extra_filter = "substr(orig.ident, 2, 3) == '17'",
    override = TRUE,
    verbose = FALSE
)

Run the code above in your browser using DataLab