
immunarch (version 0.10.3)

airr_clonality: Clonality - receptor overabundance statistics for immune repertoires

Description

[Experimental]

A family of functions that quantify receptor overabundance per repertoire. They help decipher the structure of a repertoire and partition it into groups of receptors by abundance.

Available functions

The following functions are supported:

airr_clonality_line - build ranked abundance lines: for each repertoire, take the top limit receptors by count and attach repertoire metadata. Useful for per-repertoire rank-abundance plots.

airr_clonality_rank - aggregate clonal space by rank bins. Receptors are ordered by proportion within each repertoire; each receptor is assigned to the smallest threshold in bins that contains its rank.

airr_clonality_prop - aggregate clonal space by proportion bins. Each receptor is assigned to a named bin according to its proportion (e.g., Hyperexpanded >= 1e-2, Large >= 1e-3, ...). Thresholds are matched in descending order; unmatched receptors fall into "Ultra-rare".

Usage

airr_clonality_line(
  idata,
  limit = 1e+05,
  autojoin = getOption("immundata.autojoin", TRUE),
  format = c("long", "wide")
)

airr_clonality_rank(
  idata,
  bins = c(10, 30, 100, 300, 1000, 10000, 1e+05),
  autojoin = getOption("immundata.autojoin", TRUE),
  format = c("long", "wide")
)

airr_clonality_prop(
  idata,
  bins = c(Hyperexpanded = 0.01, Large = 0.001, Medium = 1e-04, Small = 1e-05, Rare = 1e-06),
  autojoin = getOption("immundata.autojoin", TRUE),
  format = c("long", "wide")
)

Value

airr_clonality_line

A tibble with columns:

  • repertoire_id - repertoire identifier

  • index - rank within repertoire (1 = most abundant)

  • count - receptor count used for ranking

  • plus any repertoire metadata columns carried from idata$repertoires
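The shape of this result can be pictured with a small base-R sketch. It does not call immunarch: the receptors and meta data frames, the Therapy column values, and the limit of 2 are hypothetical toy inputs, and the package itself may compute the result differently.

```r
# Toy receptor table (hypothetical data, not produced by immunarch).
receptors <- data.frame(
  repertoire_id = c("A", "A", "A", "B", "B"),
  count = c(5, 40, 12, 7, 30)
)
# Toy repertoire metadata (hypothetical).
meta <- data.frame(repertoire_id = c("A", "B"), Therapy = c("ctrl", "drug"))

limit <- 2  # keep the top `limit` receptors per repertoire

top_line <- do.call(rbind, lapply(
  split(receptors, receptors$repertoire_id),
  function(df) {
    # Rank receptors within the repertoire, most abundant first.
    df <- df[order(df$count, decreasing = TRUE), , drop = FALSE]
    df <- head(df, limit)
    df$index <- seq_len(nrow(df))  # 1 = most abundant
    df[, c("repertoire_id", "index", "count")]
  }
))
rownames(top_line) <- NULL

# Attach the repertoire metadata, as autojoin = TRUE is described to do.
joined <- merge(top_line, meta, by = "repertoire_id")
```

Plotting count against index on log scales, one line per repertoire_id, gives the usual rank-abundance view.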

airr_clonality_rank

A tibble with columns:

  • repertoire_id

  • clonal_rank_bin - the rank threshold (e.g., 10, 100, ...)

  • occupied_prop - sum of proportion within the bin

  • plus repertoire metadata columns from idata$repertoires
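As an illustration of the rank binning described above, here is a minimal base-R sketch for a single repertoire. The counts and the small bins vector are hypothetical, and the handling of ranks beyond the largest threshold (NA here) is an assumption, since this page does not say what the package does with them.

```r
# Toy receptor counts for one repertoire (hypothetical data).
counts <- c(50, 20, 10, 8, 5, 3, 2, 1, 1)
bins <- c(2, 5, 10)  # rank thresholds, smaller than the defaults for readability

# Order receptors by proportion, most abundant first.
prop <- sort(counts / sum(counts), decreasing = TRUE)
rank <- seq_along(prop)

# Assign each rank to the smallest threshold that is >= the rank.
# Ranks above max(bins) become NA (an assumption of this sketch).
bin_idx <- findInterval(rank, bins, left.open = TRUE) + 1L
clonal_rank_bin <- bins[bin_idx]

# Sum of proportions within each rank bin.
occupied_prop <- tapply(prop, clonal_rank_bin, sum)
```

With this toy input, ranks 1-2 fall into bin 2, ranks 3-5 into bin 5, and ranks 6-9 into bin 10, so the bins are disjoint rather than cumulative.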

airr_clonality_prop

A tibble with columns:

  • repertoire_id

  • clonal_prop_bin - factor-like label from names(bins) or "Ultra-rare"

  • occupied_prop - sum of proportion within the bin

  • plus repertoire metadata columns from idata$repertoires
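The proportion binning can be sketched in base R as follows. This is only an illustration under assumptions: the counts are toy data, bins uses just two of the default thresholds, and assign_bin is a hypothetical helper, not a function from immunarch.

```r
# Toy receptor counts for one repertoire (hypothetical data).
counts <- c(7000, 2500, 450, 45, 5)
prop <- counts / sum(counts)

# A subset of the default thresholds, to keep the example small.
bins <- c(Large = 1e-3, Hyperexpanded = 1e-2)

# Hypothetical helper: match thresholds in descending order and
# fall back to "Ultra-rare" when none is reached.
assign_bin <- function(p, bins) {
  bins <- sort(bins, decreasing = TRUE)  # sort() keeps the names
  hit <- which(p >= bins)
  if (length(hit) == 0) "Ultra-rare" else names(bins)[hit[1]]
}

clonal_prop_bin <- vapply(prop, assign_bin, character(1), bins = bins)

# Sum of proportions within each named bin.
occupied_prop <- tapply(prop, clonal_prop_bin, sum)
```

Because thresholds are matched from largest to smallest, a receptor with proportion 0.045 is labelled Hyperexpanded even though it also exceeds the Large threshold.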

Arguments

idata

An ImmunData object.

limit

Positive integer >= 10: maximum number of top receptors to keep per repertoire (default 100000).

autojoin

Logical. If TRUE, join repertoire metadata by the schema repertoire id. To change the default behaviour, call options(immundata.autojoin = FALSE); this is the option that the default in Usage reads via getOption().

format

String. One of "long" (a long tibble with imd_repertoire_id, facet columns, and value; useful for visualization) or "wide" (a wide, unmelted table of features in which each row corresponds to one repertoire or one pair of repertoires; useful for machine learning).
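The difference between the two layouts can be illustrated with base R's reshape(). The long table below is hypothetical toy data, and the column names of the wide result come from reshape(), so they are not necessarily the names the package produces.

```r
# Hypothetical long-format result: one row per repertoire and bin.
long <- data.frame(
  repertoire_id = c("A", "A", "B", "B"),
  clonal_prop_bin = c("Hyperexpanded", "Large", "Hyperexpanded", "Large"),
  occupied_prop = c(0.6, 0.4, 0.9, 0.1)
)

# Wide layout: one row per repertoire, one column per bin.
wide <- reshape(
  long,
  idvar = "repertoire_id",
  timevar = "clonal_prop_bin",
  direction = "wide"
)
```

The long layout maps directly onto plotting aesthetics, while the wide layout gives one feature vector per repertoire.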

bins

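Thresholds that define the bins. For airr_clonality_prop: a named numeric vector of proportion thresholds (e.g., c(Hyperexpanded = 1e-2, Large = 1e-3, ...)). Names become bin labels and must be non-empty; the vector is sorted internally in descending order. For airr_clonality_rank: a numeric vector of rank thresholds, as in the default shown in Usage (e.g., c(10, 30, 100, ...)).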

See Also

  • Per-repertoire summaries: annotate_clonality

  • Data container: immundata::ImmunData

Examples

# Limit the number of threads used by the underlying DB for this session.
# Change this only if you know what you're doing (e.g., multi-user machines, shared CI/servers).
db_exec("SET threads TO 1")

# Load data
if (FALSE) {
immdata <- get_test_idata() |> agg_repertoires("Therapy")
}

#
# airr_clonality_line
#
if (FALSE) {
top_line <- airr_clonality_line(immdata, limit = 1000)
}

#
# airr_clonality_rank
#
if (FALSE) {
rank_stat <- airr_clonality_rank(immdata, bins = c(10, 100))
}

#
# airr_clonality_prop
#
if (FALSE) {
prop_stat <- airr_clonality_prop(immdata)
}
