NAIR: Network Analysis of Immune Repertoire

NAIR is an R package for analyzing the adaptive immune repertoire using network analysis based on similarities among receptor sequences. It implements methods from the following paper:

Hai Yang, Jason Cham, Brian Neal, Zenghua Fan, Tao He and Li Zhang. (2023). NAIR: Network Analysis of Immune Repertoire. Frontiers in Immunology, vol. 14. https://doi.org/10.3389/fimmu.2023.1181825

NAIR allows users to perform network analysis on Adaptive Immune Receptor Repertoire Sequencing (AIRR-Seq) data, including computing local and global network properties of nodes and clusters, which can provide insights into the structural organization of the immune repertoire network.

NAIR also enables users to search across multiple AIRR-Seq samples for clones/clusters associated with subject characteristics, disease conditions or clinical outcomes, and to identify public clones/clusters. This can help researchers identify potentially important TCR/BCR clones.

To aid in interpretation of the immune repertoire network, NAIR includes convenient functionality for generating customized network visualizations.

What data does NAIR support?

NAIR supports bulk and single-cell immune repertoire sequence data for T-cell or B-cell receptors (TCR or BCR).

  • Single-cell data: Each row is a single cell
  • Bulk data: Each row is a distinct TCR/BCR clone (unique combination of V-D-J genes and nucleotide sequence) and typically includes a corresponding measurement of clonal abundance (e.g., clone count and clone frequency/fraction)
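
To make the bulk format concrete, here is a minimal base-R sketch that collapses a table of individual sequence observations into one row per clone with a clone count and clone fraction. The data, the column names (`cdr3_nt`, `v_gene`, `clone_count`, `clone_fraction`) and the use of only the V gene are illustrative assumptions, not a format NAIR requires.

```r
# Hypothetical input: one row per observed receptor sequence
reads <- data.frame(
  cdr3_nt = c("TGTGCC", "TGTGCC", "TGTGCA", "TGTGCC", "TGTAGC"),
  v_gene  = c("TRBV5", "TRBV5", "TRBV5", "TRBV5", "TRBV7"),
  stringsAsFactors = FALSE
)

# Count identical (sequence, gene) combinations to get one row per clone
reads$count <- 1L
bulk <- aggregate(count ~ cdr3_nt + v_gene, data = reads, FUN = sum)
names(bulk)[names(bulk) == "count"] <- "clone_count"

# Clone fraction: each clone's share of all observations
bulk$clone_fraction <- bulk$clone_count / sum(bulk$clone_count)

print(bulk)
```

Each row of `bulk` now corresponds to a distinct clone together with its abundance measurements, which is the shape described for bulk data above.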

How does NAIR model the immune repertoire as a network?

  • Each cell (single-cell data) or clone (bulk data) is modeled as a node (vertex) in the network
  • For each node, we consider the corresponding receptor sequence (nucleotide or amino acid)
  • For each pair of nodes, we measure the similarity in their receptor sequences (using the Hamming or Levenshtein distance)
  • An edge is drawn between two nodes if the distance between their sequences does not exceed a specified threshold
    • For single-cell data, sequences from two chains (e.g., alpha chain and beta chain) can be jointly used to determine similarity between cells, considering cells as similar when the sequences for both chains are similar (i.e., when the distance for each chain does not exceed the threshold)
  • Clustering analysis is used to partition the network graph into clusters (densely-connected subgraphs)
    • Many clustering algorithms are available, with each seeking to identify the “best” configuration of clusters according to different graph criteria
  • Network statistics characterize the repertoire in terms of the local and global structural properties of its graph
  • Customized visual plots of the network graph are generated, with nodes colored according to desired metadata (e.g., disease status, sample, cluster, clonal abundance, etc.)
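
The modeling steps above can be illustrated with a small base-R sketch. It is not NAIR's implementation: the sequences are made up, `utils::adist()` stands in for the distance computation, and connected components (obtained here by cutting a single-linkage tree at the cutoff) stand in for the clustering algorithms a real analysis would choose from.

```r
# Hypothetical receptor sequences; each one is a node in the network
seqs <- c("CASSLGQ", "CASSLGE", "CASSIGE", "CATTTTT", "CATTTTA")

# Pairwise Levenshtein distances between all sequences
dist_mat <- utils::adist(seqs)

# Draw an edge between two distinct nodes when the distance
# does not exceed the cutoff
dist_cutoff <- 1
adj <- dist_mat <= dist_cutoff
diag(adj) <- FALSE
dimnames(adj) <- list(seqs, seqs)

# A simple node-level network property: degree (number of neighbors)
degree <- rowSums(adj)

# Simplified clustering: connected components of the graph.
# Cutting a single-linkage tree at the cutoff joins exactly those
# nodes that are linked by a chain of edges.
tree <- hclust(as.dist(dist_mat), method = "single")
cluster_id <- cutree(tree, h = dist_cutoff)

print(data.frame(seq = seqs, degree = degree, cluster_id = cluster_id,
                 row.names = NULL))
```

In this toy example the first three sequences form one cluster and the last two form another, even though the first and third sequences are two edits apart, because both are neighbors of the second.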

Installation

To install the latest release version of NAIR, use the following command:

install.packages("NAIR")

To install the latest development version of NAIR from source (which requires compilation), use the following command:

devtools::install_github(
  "mlizhangx/Network-Analysis-for-Repertoire-Sequencing-",
  dependencies = TRUE, 
  build_vignettes = TRUE
)

Getting Started

Main Function

General network analysis on AIRR-Seq data is performed using buildRepSeqNetwork() or its convenient alias buildNet(). This function does the following:

  • Filters the AIRR-Seq data according to user specifications
  • Builds the network graph for the immune repertoire
  • Performs additional network analysis, which can include:
    • Cluster analysis
    • Network properties
    • Customizable visual plots of the network graph
  • Returns (and optionally saves) the following output:
    • The network graph (as an igraph object and as an adjacency matrix)
    • Metadata for the network
    • Metadata for the nodes in the network
    • Metadata for the clusters in the network
    • Plots of the network graph

See the package vignette on this function for a tutorial.
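
As a rough illustration, a first call to the main function might look like the sketch below. The function names `simulateToyData()` and `buildRepSeqNetwork()` come from the package's function list; the column name `"CloneSeq"` and the argument name `seq_col` are assumptions recalled from the package documentation and should be checked against `?buildRepSeqNetwork` for the installed version. The block does nothing if NAIR is not installed.

```r
# Runs only if NAIR is installed; otherwise the block is skipped.
if (requireNamespace("NAIR", quietly = TRUE)) {
  set.seed(42)

  # Simulated AIRR-Seq data for experimentation
  toy_data <- NAIR::simulateToyData()

  # Assumption: the receptor sequences are stored in a column named
  # "CloneSeq", which is passed through an argument named seq_col.
  net <- NAIR::buildRepSeqNetwork(toy_data, seq_col = "CloneSeq")

  # Inspect which components the returned list contains
  print(names(net))
}
```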

Searching for Associated Clusters

Given multiple samples of bulk AIRR-Seq data, NAIR can be used to search for TCR/BCR clusters associated with a binary variable of interest, such as a disease condition, treatment or clinical outcome. See the corresponding article in the package documentation for a tutorial.
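
To give a sense of what "associated with a binary variable" can mean in practice, the sketch below tests whether the presence of a sequence across samples is related to case/control status using Fisher's exact test from base R. This is one common choice for a binary variable and is shown only as an illustration with made-up data; whether NAIR's own functions use exactly this test should be checked in their documentation.

```r
# Hypothetical data: 10 samples, 5 cases and 5 controls.
# has_seq records whether a given receptor sequence was observed
# in each sample.
disease <- c(rep("case", 5), rep("control", 5))
has_seq <- c(TRUE, TRUE, TRUE, TRUE, TRUE,
             FALSE, FALSE, FALSE, FALSE, FALSE)

# 2x2 contingency table of sequence presence by group
tab <- table(has_seq, disease)
print(tab)

# Fisher's exact test of independence between presence and group
res <- fisher.test(tab)
print(res$p.value)
```

A small P-value suggests that the sequence appears more often in one group than would be expected by chance; such sequences are natural starting points for building clusters of similar clones around them.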

Searching for Public Clusters

The NAIR package includes a set of functions that facilitate searching for public TCR/BCR clusters across multiple samples of bulk AIRR-Seq data. In this context, a public cluster consists of similar TCR/BCR clones (e.g., those whose CDR3 amino acid sequences differ by at most one amino acid) that are shared across samples (e.g., across individuals or across time points for a single individual). See the corresponding article in the package documentation for a tutorial.
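
The idea of a public cluster can be sketched in base R as follows. This is a simplified illustration on made-up data, not NAIR's procedure: clones from all samples are pooled, grouped into clusters of sequences linked by distance at most 1 (via a single-linkage tree cut at that height), and a cluster is flagged as public when it contains clones from more than one sample. The column names are hypothetical.

```r
# Hypothetical clones from three samples
clones <- data.frame(
  sample  = c("A", "A", "B", "B", "C", "C"),
  cdr3_aa = c("CASSLGQ", "CAWWWWW", "CASSLGE",
              "CAKKKKK", "CAKKKKR", "CAYYYYY"),
  stringsAsFactors = FALSE
)

# Pairwise Levenshtein distances between CDR3 sequences
d <- utils::adist(clones$cdr3_aa)

# Clusters: groups of clones connected by chains of distance <= 1
tree <- hclust(as.dist(d), method = "single")
clones$cluster_id <- cutree(tree, h = 1)

# Number of distinct samples represented in each cluster
n_samples <- tapply(clones$sample, clones$cluster_id,
                    function(s) length(unique(s)))

# A clone belongs to a public cluster if its cluster spans > 1 sample
clones$is_public <- as.vector(
  n_samples[as.character(clones$cluster_id)] > 1
)

print(clones)
```

Here the clusters {CASSLGQ, CASSLGE} and {CAKKKKK, CAKKKKR} each span two samples and are flagged as public, while the two remaining clones are private to their samples.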

Additional Resources

Visualization

The package documentation includes an article introducing the creation and customization of network visualizations using NAIR.

Network Properties and Cluster Analysis

The package vignettes include an introduction to computing node-level network properties with NAIR and a guide to performing cluster analysis with NAIR.

Supplementary Functions

A further vignette provides an overview of NAIR utility functions that supplement the main function buildNet().

Package Details

  • Version: 1.0.4
  • License: GPL (>= 3)
  • Maintainer: Brian Neal
  • Last published: March 3rd, 2024
  • Monthly downloads: 149

Functions in NAIR (1.0.4)

  • addClusterMembership: Partition a Network Graph Into Clusters
  • aggregateIdenticalClones: Aggregate Counts/Frequencies for Clones With Identical Receptor Sequences
  • labelNodes: Label Nodes in a Network Graph Plot
  • addNodeStats: Compute Node-Level Network Properties
  • addNodeNetworkStats: Compute Node-Level Network Properties
  • labelClusters: Label Clusters in a Network Graph Plot
  • addClusterStats: Compute Cluster-Level Network Properties
  • NAIR-package: NAIR: Network Analysis of Immune Repertoire
  • buildAssociatedClusterNetwork: Build Global Network of Associated TCR/BCR Clusters
  • addPlots: Generate Plots of a Network Graph
  • extractLayout: Get Coordinate Layout From Graph Plot
  • combineSamples: Load and Combine Data From Multiple Samples
  • findAssociatedSeqs: Identify TCR/BCR Sequences Associated With a Binary Variable
  • filterInputData: Filter Data Rows and Subset Data Columns
  • chooseNodeStats: Specify Node-level Network Properties to Compute
  • buildRepSeqNetwork: Network Analysis of Immune Repertoire
  • buildPublicClusterNetworkByRepresentative: Build Global Network of Public TCR/BCR Clusters Using Representative Clones
  • buildPublicClusterNetwork: Build Global Network of Public TCR/BCR Clusters
  • findPublicClusters: Find Public Clusters Among RepSeq Samples
  • findAssociatedClones: Identify TCR/BCR Clones in a Neighborhood Around Each Associated Sequence
  • levDistBounded: Bounded Computation of Levenshtein Distance
  • getNeighborhood: Identify Cells or Clones in a Neighborhood Around a Target Sequence
  • getClusterStats: Compute Cluster-Level Network Properties
  • saveNetwork: Save List of Network Objects
  • hamDistBounded: Bounded Computation of Hamming Distance
  • plotNetworkGraph: Plot the Graph of an Immune Repertoire Network
  • generateNetworkGraph: Generate the igraph for a Network Adjacency Matrix
  • generateNetworkObjects: Generate Basic Output for an Immune Repertoire Network
  • saveNetworkPlots: Write Plots to a PDF
  • generateAdjacencyMatrix: Compute Graph Adjacency Matrix for Immune Repertoire Network
  • simulateToyData: Generate Toy AIRR-Seq Data