immunarch (version 0.9.1)

seqCluster: Function for assigning clusters based on sequence similarity

Description

Graph clustering based on distances between sequences

Usage

seqCluster(.data, .dist, .perc_similarity, .nt_similarity, .fixed_threshold)

Value

Immdata data format object. Same as .data, but with an extra 'Cluster' column containing the assigned clusters.

Arguments

.data

The data that was used to calculate the .dist object. Can be a data.frame, a data.table, or a list of these objects.

Every object must have columns in the immunarch-compatible format immunarch_data_format.

.dist

List of distance objects produced by the seqDist function.

.perc_similarity

Numeric value between 0 and 1 specifying the maximum acceptable weight of an edge in a graph. This threshold depends on the length of sequences.

.nt_similarity

Numeric value between 0 and the sequence length specifying the threshold that allows a mismatch of 1 in n nucleotides in sequences.

.fixed_threshold

Numeric specifying the threshold on the maximum weight of an edge in a graph.
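
The idea behind a fixed edge-weight threshold can be illustrated with base R alone. The sketch below is not immunarch's implementation: it assumes Levenshtein distance as the edge weight, uses made-up toy sequences, and borrows the column name CDR3.aa only for illustration. It relies on the fact that single-linkage clusters cut at height h are exactly the connected components of the graph whose edges join sequences with distance at most h.

```r
# Hypothetical toy sequences, not taken from immdata
seqs <- c("CASSLGTDTQYF", "CASSLGTDTQYV", "CASSPGTDTQYF", "CAWSVGQGNEQFF")

# Pairwise Levenshtein distances (adist is in utils, which ships with R)
d <- adist(seqs)
dimnames(d) <- list(seqs, seqs)

# Assumed threshold: keep only edges with weight <= 1
threshold <- 1

# Single-linkage clustering cut at the threshold yields the connected
# components of the thresholded graph
clusters <- cutree(hclust(as.dist(d), method = "single"), h = threshold)

# Attach the assignment as a 'Cluster' column, mirroring the documented output
toy_result <- data.frame(CDR3.aa = seqs, Cluster = unname(clusters))
print(toy_result)
```

Here the first three sequences are linked through single-substitution neighbours and end up in one cluster, while the fourth sequence is too distant from all others and forms its own cluster.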

Examples

library(immunarch)
data(immdata)
# To save time, this example uses only 2 samples with 500 clonotypes each
input_data <- lapply(immdata$data[1:2], head, 500)
dist_result <- seqDist(input_data)
cluster_result <- seqCluster(input_data, dist_result, .fixed_threshold = 1)