Delineating inter- and intra-antibody repertoire evolution with AntibodyForests

The growing wealth of immune repertoire sequencing data calls for software that can uncover how B cells evolve during immune responses. Here, we present AntibodyForests, an R package to investigate and quantify inter- and intra-antibody repertoire evolution.

This R package is currently composed of a pipeline to reconstruct lineage trees from 10x single-cell V(D)J sequencing data preprocessed with the Platypus package and compare trees within and across repertoires. Furthermore, it has modalities to integrate bulk RNA sequencing data, features of protein 3D structure, and evolutionary likelihoods generated with protein language models.

Installation

Both Platypus and AntibodyForests can be installed from CRAN.

#Install from CRAN
install.packages("Platypus")
install.packages("AntibodyForests")

Quick Start

This quick start walks through a short use case of AntibodyForests. The 10x single-cell V(D)J sequencing output of five mice immunized with ovalbumin (OVA) from Neumeier et al. (2022) is used to create a VDJ dataframe with Platypus. AntibodyForests is then used to create a lineage tree for each B cell clonotype using an MST-like algorithm.

#Load the libraries
library(Platypus)
library(AntibodyForests)

# Import 10x Genomics output files into VDJ dataframe, only keep cells with one VDJ and one VJ transcript, and trim the germline sequences
VDJ_OVA <- VDJ_build(VDJ.directory = "10x_output/VDJ/",
                     remove.divergent.cells = TRUE,
                     complete.cells.only = TRUE,
                     trim.germlines = TRUE)

# Build lineage trees for all clones present in the VDJ dataframe with the default algorithm
AntibodyForests_OVA <- Af_build(VDJ = VDJ_OVA, construction.method = "phylo.network.default")

# Plot one of the lineage trees as an example
Af_plot_tree(AntibodyForests_object = AntibodyForests_OVA, sample = "S1", clonotype = "clonotype3")
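To give an intuition for what an "MST-like" tree over the sequences of one clonotype looks like, the following base R sketch builds a minimum spanning tree with Prim's algorithm, rooted at the germline. This is an illustration only and not the algorithm implemented in Af_build(): the toy sequences, the node names, and the use of Levenshtein distance via utils::adist() are all assumptions made for this example.

```r
# Hypothetical toy clonotype: a germline sequence plus three observed variants
seqs <- c(germline = "ACGTACGT",
          node1    = "ACGTACGA",
          node2    = "ACGTACTA",
          node3    = "TCGTACGT")

# Pairwise Levenshtein (edit) distances between all sequences
d <- utils::adist(seqs)
dimnames(d) <- list(names(seqs), names(seqs))

# Prim's algorithm, started at the germline so that it becomes the root
in_tree <- "germline"
edges <- data.frame(from = character(), to = character(), distance = numeric())

while (length(in_tree) < length(seqs)) {
  out_tree <- setdiff(names(seqs), in_tree)
  # Distances from nodes already in the tree to nodes not yet attached
  sub <- d[in_tree, out_tree, drop = FALSE]
  # Pick the closest pair (first match on ties) and attach the new node
  idx <- which(sub == min(sub), arr.ind = TRUE)[1, ]
  new_edge <- data.frame(from = in_tree[idx[1]],
                         to = out_tree[idx[2]],
                         distance = min(sub))
  edges <- rbind(edges, new_edge)
  in_tree <- c(in_tree, out_tree[idx[2]])
}

print(edges)
```

Each row of the resulting edge list connects an already attached node to its nearest unattached neighbour, so every variant ends up linked to the germline through a chain of small edit distances.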

Next, we cluster the trees in this AntibodyForests object based on the Jensen-Shannon divergence between their spectral density profiles. We visualize the results in a heatmap and observe two clusters.

# Cluster the trees that contain at least 8 nodes
out <- Af_compare_within_repertoires(input = AntibodyForests_OVA,
                                     min.nodes = 8,
                                     distance.method = "jensen-shannon",
                                     clustering.method = "mediods",
                                     visualization.methods = "heatmap")
# Plot the heatmap
out$plots$heatmap_clusters
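The distance used above can be illustrated in isolation. The sketch below computes the Jensen-Shannon divergence between discrete distributions in base R and fills a pairwise distance matrix. It is not the package's internal code: the function name js_divergence() and the three binned "profiles" are hypothetical and were chosen only to show how similar and dissimilar profiles compare.

```r
# Jensen-Shannon divergence between two discrete distributions (base R only).
# p and q are non-negative vectors of equal length; both are normalized to sum to 1.
js_divergence <- function(p, q) {
  stopifnot(length(p) == length(q), all(p >= 0), all(q >= 0))
  p <- p / sum(p)
  q <- q / sum(q)
  m <- (p + q) / 2
  # Kullback-Leibler divergence in bits; terms where a == 0 contribute 0
  kl <- function(a, b) {
    keep <- a > 0
    sum(a[keep] * log2(a[keep] / b[keep]))
  }
  (kl(p, m) + kl(q, m)) / 2
}

# Hypothetical binned spectral density profiles of three trees
profiles <- rbind(a = c(0.10, 0.40, 0.40, 0.10),
                  b = c(0.10, 0.35, 0.45, 0.10),
                  c = c(0.70, 0.10, 0.10, 0.10))

# Pairwise divergence matrix, the kind of input a clustering step would use
n <- nrow(profiles)
js_matrix <- matrix(0, n, n,
                    dimnames = list(rownames(profiles), rownames(profiles)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    js_matrix[i, j] <- js_divergence(profiles[i, ], profiles[j, ])
  }
}

print(round(js_matrix, 3))
```

With base-2 logarithms the divergence is bounded between 0 (identical distributions) and 1 (distributions with no overlap), which makes the values easy to compare across pairs of trees.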

When we analyze the differences between the clusters, we observe that trees in cluster 2 have deep branching events, indicated by the negative asymmetry index, and contain multiple spectral density modalities. This indicates that several diversification events took place during the evolution of these clonotypes and that cells with a small amount of somatic hypermutation (SHM) were recovered.

# Analyze the difference between the clusters
plots <- Af_cluster_metrics(input = AntibodyForests_OVA,
                            clusters = out$clustering,
                            metrics = "spectral.density",
                            min.nodes = 8,
                            significance = TRUE)

plots$spectral.asymmetry
plots$modalities
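As a rough intuition for spectral summaries of a tree, the sketch below computes the eigenvalues of the graph Laplacian of a small tree and the skewness of that eigenvalue distribution. This is only loosely analogous to the metrics reported above and is not how the package computes them: the toy edge list, the use of the unweighted Laplacian, and the skewness helper are all assumptions made for illustration.

```r
# Hypothetical toy lineage tree given as an edge list (node 1 is the germline)
edges <- rbind(c(1, 2), c(1, 3), c(2, 4), c(2, 5), c(4, 6))
n <- max(edges)

# Symmetric adjacency matrix of the undirected tree
A <- matrix(0, n, n)
A[edges] <- 1
A[edges[, 2:1]] <- 1

# Graph Laplacian L = D - A, where D holds the node degrees on its diagonal
L <- diag(rowSums(A)) - A
eigenvalues <- eigen(L, symmetric = TRUE)$values

# Skewness of the eigenvalue distribution as a simple asymmetry summary
skewness <- function(x) {
  centered <- x - mean(x)
  mean(centered^3) / mean(centered^2)^1.5
}
spectral_asymmetry <- skewness(eigenvalues)

print(round(eigenvalues, 3))
print(spectral_asymmetry)
```

For any connected tree the smallest Laplacian eigenvalue is zero and the eigenvalues sum to twice the number of edges, which gives a quick sanity check on the matrix construction.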

Monthly Downloads

296

Version

1.1.0

License

GPL-2

Maintainer

Alexander Yermanos

Last Published

July 17th, 2025

Functions in AntibodyForests (1.1.0)

calculate_GBLD

Calculate the GBLD distance between trees in an AntibodyForests object. Code is derived from https://github.com/tahiri-lab/ClonalTreeClustering/blob/main/src/Python/GBLD_Metric_Final.ipynb. Reference: Farnia, M., Tahiri, N. New generalized metric based on branch length distance to compare B cell lineage trees. Algorithms Mol Biol 19, 22 (2024). https://doi.org/10.1186/s13015-024-00267-1
af_mst

Small AntibodyForests object with MST algorithm for function testing purposes
compare_repertoire

Example output from Af_compare_within_repertoires() for function testing purposes
VDJ_to_AIRR

Function to convert VDJ dataframe into an AIRR-formatted TSV file.
af_default

Small AntibodyForests object with default algorithm for function testing purposes
VDJ_integrate_bulk

A function to integrate bulk and single cell data
VDJ_import_IgBLAST_annotations

Function to import the annotations and alignments from IgBLAST output into the VDJ dataframe.
Af_edge_RMSD

Function to calculate the RMSD between sequences over each edge in the AntibodyForests object
PLM_dataframe

Small PLM dataframe for function testing purposes
VDJ_3d_properties

Function to calculate 3D-structure properties such as the average charge and hydrophobicity, pKa shift, free energy, and RMSD from PDB files, and add them to an AntibodyForests-object
igraph_to_phylo

Converts an igraph network into a phylogenetic tree as a phylo object.
newick_to_Af

Converts files with phylogenetic trees in newick format into an AntibodyForests object.
af_nj

Small AntibodyForests object with NJ algorithm for function testing purposes
small_af

Small AntibodyForests object for function testing purposes
small_vdj

Small VDJ dataframe for function testing purposes
Af_compare_methods

Function to compare trees created with different algorithms from the same clonotype.
Af_distance_boxplot

Function to make a grouped boxplot of distance between nodes from specific groups and the germline of lineage trees constructed with AntibodyForests.
Af_cluster_node_features

Function to create a barplot of the cluster composition of selected features from each tree in an AntibodyForests-object
Af_cluster_metrics

Function to make a grouped boxplot of metrics from clusters of clonotypes
Af_build

Function to infer B cell evolutionary networks for all clonotypes in VDJ dataframe as obtained from the 'VDJ_build()' function.
Af_add_node_feature

Function to add node features to an AntibodyForests-object
Af_compare_across_repertoires

A function to compare dynamics of B cell evolution across different repertoires.
Af_compare_within_repertoires

Function to compare tree topology of B cell lineages
Af_PLM_dataframe

Function to create a dataframe of the Protein Language Model probabilities and ranks of the mutations along the edges of B cell lineage trees.
Af_compare_PLM

Function to compare the distributions of the Protein Language Model probabilities or ranks of the mutations along the edges of B cell lineage trees across repertoires using the Jensen-Shannon divergence.
Af_sync_nodes

Function to synchronize the node labels/names of all clonotypes within all samples of two AntibodyForests objects.
Af_metrics

Function to calculate metrics for each tree in an AntibodyForests-object
Af_plot_PLM_mut_vs_cons

Function to create a boxplot of the Protein Language Model probabilities
Af_to_newick

Saves an AntibodyForests-object into a newick file
Af_node_size_boxplot

Function to make a grouped boxplot of the normalized average node sizes (number of cells with the exact same sequence) from specific groups of lineage trees constructed with AntibodyForests.
Af_get_sequences

Function to get the sequences from the nodes in an AntibodyForests object
Af_distance_scatterplot

Function to create a scatterplot of the distance to the germline against a numerical node feature of the AntibodyForests-object
Af_plot_PLM

Function to create a distribution plot of the Protein Language Model probabilities and ranks of the mutations along the edges of B cell lineage trees.
Af_plot_tree

Plots lineage tree of clonotype from AntibodyForests object