The defineClonesScoper function provides an unsupervised pipeline for assigning Ig sequences into clonal groups sharing the same V gene, J gene, and junction length.
defineClonesScoper(db, junction = "JUNCTION", v_call = "V_CALL",
j_call = "J_CALL", first = FALSE, cdr3 = FALSE, mod3 = FALSE,
iter_max = 1000, nstart = 25, nproc = 1, progress = FALSE,
out_name = NULL, out_dir = ".")
db: data.frame with Change-O style columns containing sequence data.
junction: name of the column containing nucleotide sequences to compare. Also used to determine sequence length for grouping.
v_call: name of the column containing the V-segment allele calls.
j_call: name of the column containing the J-segment allele calls.
first: if TRUE, only the first call of the gene assignments is used. If FALSE, the union of ambiguous gene assignments is used to group all sequences with any overlapping gene calls.
cdr3: if TRUE, remove 3 nts from both ends of the junction (converts the IMGT junction to the CDR3 region). If TRUE, also remove junction(s) with length less than 7 nts.
mod3: if TRUE, remove junction(s) whose number of nucleotides is not a multiple of 3.
iter_max: the maximum number of iterations allowed for the k-means clustering step.
nstart: the number of random sets chosen for k-means clustering initialization.
nproc: number of cores to distribute the function over.
progress: if TRUE, print a progress bar.
out_name: if not NULL, save the cloned data.frame and a summary of cloning performance. The out_name string is used as the prefix of the successfully processed output files.
out_dir: specify to change the output directory. The input file directory is used if this is not specified while out_name is specified.
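The effect of the cdr3 and mod3 flags on the junction column can be illustrated with a small base R sketch. This is not the package's internal code: filter_junctions is a hypothetical helper, the CDR3 output column name is an assumption, and the sketch assumes the 7-nt length check is applied to the untrimmed junction.

```r
# Toy data; the column name follows the documented default ("JUNCTION").
db <- data.frame(
  JUNCTION = c("TGTGCGAGAGATCTCTGG",  # 18 nt, multiple of 3
               "TGTGCGAGAGATCTCTG",   # 17 nt, not a multiple of 3
               "TGTTGG"),             # 6 nt, shorter than 7
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of scoper) mimicking the documented flags.
filter_junctions <- function(db, junction = "JUNCTION",
                             cdr3 = FALSE, mod3 = FALSE) {
  len <- nchar(db[[junction]])
  keep <- rep(TRUE, nrow(db))
  # mod3: drop junctions whose length is not a multiple of 3
  if (mod3) keep <- keep & (len %% 3 == 0)
  # cdr3: drop junctions shorter than 7 nt (assumed to apply before trimming)
  if (cdr3) keep <- keep & (len >= 7)
  db <- db[keep, , drop = FALSE]
  if (cdr3) {
    # Trim 3 nt from each end of the junction to obtain the CDR3 region
    db$CDR3 <- substr(db[[junction]], 4, nchar(db[[junction]]) - 3)
  }
  db
}

out <- filter_junctions(db, cdr3 = TRUE, mod3 = TRUE)
print(out)
```

With both flags set, only the first toy sequence survives, and its CDR3 is the junction without its first and last codon.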
Returns a modified db data.frame with clone identifiers in the CLONE column. If out_name is not NULL, the modified db and a summary of cloning performance are also saved in the current directory or the specified out_dir.
An unsupervised pipeline to identify B cell clones from adaptive immune receptor repertoire sequencing (AIRR-Seq) datasets. This method is based on spectral clustering of the junction sequences of B cell receptors (BCRs, also referred to as immunoglobulins (Igs)) that share the same V gene, J gene, and junction length. It uses an adaptive threshold that analyzes sequences in a local neighborhood.
To assess the performance of the clonal assignment process, see analyzeClones.
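The partitioning step described above (same V gene, J gene, and junction length) can be sketched in base R as follows. This is only an illustration of the grouping idea with first = TRUE, not the package's implementation and not the spectral clustering itself. The helper first_gene and the GROUP column are hypothetical, and the sketch assumes ambiguous calls are comma-separated and alleles are marked with a "*" suffix.

```r
# Toy data using the documented default column names.
db <- data.frame(
  V_CALL = c("IGHV1-2*01,IGHV1-3*01", "IGHV1-2*02", "IGHV3-7*01", "IGHV1-2*01"),
  J_CALL = c("IGHJ4*02", "IGHJ4*01", "IGHJ4*02", "IGHJ6*01"),
  JUNCTION = c("TGTGCGAGAGATTGG", "TGTGCGAGGGATTGG",
               "TGTGCGAGAGATTGG", "TGTGCGAGAGATTGG"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: take the first call and reduce the allele to its gene.
first_gene <- function(calls) {
  first_call <- sub(",.*$", "", calls)  # keep the text before the first comma
  sub("\\*.*$", "", first_call)         # strip the allele suffix, e.g. "*01"
}

v_gene <- first_gene(db$V_CALL)
j_gene <- first_gene(db$J_CALL)
junc_len <- nchar(db$JUNCTION)

# Sequences with the same key fall into the same candidate group;
# clustering on junction sequences would then happen within each group.
db$GROUP <- paste(v_gene, j_gene, junc_len, sep = "|")
groups <- split(seq_len(nrow(db)), db$GROUP)
print(groups)
```

Here the first two rows share a group because they agree on gene-level V and J calls and on junction length, even though their alleles differ.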
# NOT RUN {
# clone data using defineClonesScoper function
db <- defineClonesScoper(ExampleDb, junction = "JUNCTION", v_call = "V_CALL",
j_call = "J_CALL", first = TRUE)
# }
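Once clone identifiers are available in the CLONE column, clone sizes can be summarized with base R. The sketch below uses a toy data.frame as a stand-in for the returned db, so the values and the SEQUENCE_ID column are assumptions made for illustration only.

```r
# Toy stand-in for the data.frame returned by defineClonesScoper
# (hypothetical values; only the CLONE column is documented above).
cloned <- data.frame(
  SEQUENCE_ID = paste0("seq", 1:6),
  CLONE = c(1, 1, 2, 3, 3, 3)
)

# Number of sequences per clone, largest clone first
clone_sizes <- sort(table(cloned$CLONE), decreasing = TRUE)
n_clones <- length(clone_sizes)
largest <- names(clone_sizes)[1]

print(clone_sizes)
```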