tcR (version 1.1)

shared.repertoire: Shared TCR repertoire management and analysis

Description

Generate a repertoire of shared sequences, i.e., sequences present in more than one subject. If a sequence appears more than once in a single repertoire, only its first occurrence is used for the shared repertoire.
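The idea can be illustrated in base R without tcR. This is only a sketch of the concept, not the package's implementation; the toy repertoires and the variable names (repertoires, dedup, people, shared.seqs) are made up for illustration.

```r
# Toy repertoires for three hypothetical subjects (not real data).
repertoires <- list(
  A = data.frame(CDR3.amino.acid.sequence = c("CASSL", "CASSF", "CASSL"),
                 Read.count = c(10, 5, 2), stringsAsFactors = FALSE),
  B = data.frame(CDR3.amino.acid.sequence = c("CASSL", "CASSQ"),
                 Read.count = c(7, 3), stringsAsFactors = FALSE),
  C = data.frame(CDR3.amino.acid.sequence = c("CASSF", "CASSL"),
                 Read.count = c(4, 1), stringsAsFactors = FALSE)
)

# Within each repertoire, keep only the first occurrence of a sequence.
dedup <- lapply(repertoires, function(df) {
  df[!duplicated(df$CDR3.amino.acid.sequence), , drop = FALSE]
})

# Count in how many subjects each sequence occurs.
all.seqs <- unlist(lapply(dedup, function(df) df$CDR3.amino.acid.sequence))
people <- table(all.seqs)

# Sequences present in more than one subject.
shared.seqs <- sort(names(people)[people > 1])
```

In subject A the duplicated "CASSL" keeps the count of its first row and the later row is dropped.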

shared.repertoire - make a shared repertoire of sequences from the given list of data frames.

shared.matrix - keep only the columns that hold per-person sequence counts and return them as a matrix. I.e., this function removes columns such as 'CDR3.amino.acid.sequence', 'V.segments', 'People'.
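A rough base R sketch of this column selection follows. The layout of the shared repertoire (one count column per subject, here A and B) is an assumption made for illustration, and the code does not call tcR itself.

```r
# Hypothetical shared repertoire: descriptive columns plus one count
# column per subject (the per-subject layout is an assumption).
shared.rep <- data.frame(
  CDR3.amino.acid.sequence = c("CASSL", "CASSF"),
  V.segments = c("TRBV7-2", "TRBV5-1"),
  People = c(2, 1),
  A = c(10, 5),
  B = c(7, NA),
  stringsAsFactors = FALSE
)

# Columns that describe a sequence rather than count it.
descriptive <- c("CDR3.amino.acid.sequence", "V.segments", "People")

# Keep the remaining (count) columns and convert them to a matrix.
count.cols <- setdiff(names(shared.rep), descriptive)
count.matrix <- as.matrix(shared.rep[, count.cols, drop = FALSE])
```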

Usage

shared.repertoire(.datalist, .type = 'avc', .min.ppl = 1, .head = -1,
                  .clear = T, .verbose = T, .by.col = '', .sum.col = '',
                  .max.ppl = length(.datalist))

shared.matrix(.shared.rep)

Arguments

.datalist
List with data frames.
.type
Three-character string that specifies how to create a shared repertoire. See "Details" for more information. If supplied, the parameters .by.col and .sum.col are ignored. If not supplied, the columns in .by.col and .sum.col are used.
.min.ppl
Minimum number of people who must have a sequence for it to be kept in the shared repertoire.
.head
Parameter for the head function, applied to all data frames before clearing.
.clear
If T, remove all sequences that contain the symbols "~" or "*" (i.e., out-of-frame sequences for amino acid sequences).
.verbose
If T, print progress.
.by.col
Character vector with the names of the columns that hold sequences and their parameters (such as the segment) to use when creating a shared repertoire.
.sum.col
Character vector of length 1 with the name of the column that holds the count, percentage, or any other numeric characteristic of sequences to use when creating a shared repertoire.
.max.ppl
Maximum number of people who may have a sequence for it to be kept in the shared repertoire.
.shared.rep
Shared repertoire.
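
The filtering arguments can be pictured with a small base R sketch. It only mirrors the documented meaning of .clear, .min.ppl and .max.ppl on made-up data; the variable names are hypothetical and the package's internals may differ.

```r
# Made-up sequences; two contain out-of-frame symbols.
seqs <- c("CASSL", "CAS~SF", "CASS*Q", "CASSY")

# .clear = T: drop sequences containing "~" or "*".
clean.seqs <- seqs[!grepl("[~*]", seqs)]

# Made-up number of people carrying each sequence.
people <- c(CASSL = 3, CASSF = 2, CASSQ = 1)

# .min.ppl / .max.ppl: keep sequences found in at least min.ppl
# and at most max.ppl people.
min.ppl <- 2
max.ppl <- 3
kept <- names(people)[people >= min.ppl & people <= max.ppl]
```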

Value

  • A data.table for shared.repertoire; a matrix for shared.matrix.

Details

Parameter .type is a string of length 3, where:
  1. The first character selects the sequence column: 'a' for the "CDR3.amino.acid.sequence" column or 'n' for the "CDR3.nucleotide.sequence" column.
  2. The second character specifies whether to use the "V.segments" column: '0' (zero) takes no additional columns, 'v' takes the "V.segments" column.
  3. The third character names the column used as the numeric characteristic of sequences: "c" for the "Read.count" column, "p" for the "Percentage" column, "r" for the "Rank" column, or "i" for the "Index" column. If "Rank" or "Index" isn't in the given repertoire, it is created with the set.rank function using the default "Read.count" column.
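
To make the encoding concrete, here is a small base R helper that decodes a .type string into column names according to the rules above. decode.type is a hypothetical function written for this illustration and is not part of tcR; it only mirrors the documented mapping.

```r
# Hypothetical helper (not a tcR function): translate a .type string
# into the columns it stands for.
decode.type <- function(type) {
  stopifnot(is.character(type), length(type) == 1, nchar(type) == 3)
  chars <- strsplit(type, "")[[1]]

  # First character: which sequence column to use.
  seq.col <- switch(chars[1],
                    a = "CDR3.amino.acid.sequence",
                    n = "CDR3.nucleotide.sequence",
                    stop("first character must be 'a' or 'n'"))

  # Second character: whether to add the V.segments column.
  by.col <- switch(chars[2],
                   "0" = seq.col,
                   v = c(seq.col, "V.segments"),
                   stop("second character must be '0' or 'v'"))

  # Third character: the numeric characteristic of sequences.
  sum.col <- switch(chars[3],
                    c = "Read.count",
                    p = "Percentage",
                    r = "Rank",
                    i = "Index",
                    stop("third character must be 'c', 'p', 'r' or 'i'"))

  list(by.col = by.col, sum.col = sum.col)
}

default.cols <- decode.type("avc")  # the default value of .type
rank.cols <- decode.type("avr")
nuc.cols <- decode.type("n0p")
```

For example, "avr" maps to the amino acid sequence and V segment columns with "Rank" as the numeric column, and "n0p" maps to the nucleotide sequence column alone with "Percentage".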

See Also

shared.representation, set.rank

Examples

# Set the "Rank" column in the data using the "Read.count" column.
# shared.repertoire() does this automatically
# if the "Rank" column is not found.
immdata <- set.rank(immdata)
# Generate shared repertoire using "CDR3.amino.acid.sequence" and
# "V.segments" columns and with rank.
imm.shared.av <- shared.repertoire(immdata, 'avr')
