field_id: Data subset identifier, defined with the input parameter fields.
A variable number of columns, specified with the input parameter fields.
polymorphism_call: The novel allele call.
novel_imgt: The novel allele sequence.
closest_reference: The closest reference gene and allele in
the germline_db database.
closest_reference_imgt: Sequence of the closest reference gene and
allele in the germline_db database.
germline_call: The input (uncorrected) V call.
germline_imgt: Germline sequence for germline_call.
nt_diff: Number of nucleotides that differ between the novel allele and
the closest reference (closest_reference) in the germline_db database.
nt_substitutions: A comma-separated list of specific nucleotide
differences (e.g. 112G>A) in the novel allele.
aa_diff: Number of amino acids that differ between the novel allele and the closest
reference (closest_reference) in the germline_db database.
aa_substitutions: A comma-separated list of specific amino acid
differences (e.g. 96A>N) in the novel allele.
sequences: Number of sequences unambiguously assigned to this allele.
unmutated_sequences: Number of records with the unmutated novel allele sequence.
unmutated_frequency: Proportion of records with the unmutated novel allele
sequence (unmutated_sequences / sequences).
allelic_percentage: Percentage at which the (unmutated) allele is observed
in the sequence dataset compared to other (unmutated) alleles.
unique_js: Number of unique J sequences associated with the
novel allele. Only sequences that have been unambiguously assigned
to the novel allele (polymorphism_call) are counted.
unique_cdr3s: Number of unique CDR3s associated with the novel allele.
Only sequences that have been unambiguously assigned to the
novel allele (polymorphism_call) are counted.
mut_min: Minimum number of mutations considered by the algorithm.
mut_max: Maximum number of mutations considered by the algorithm.
pos_min: First position of the sequence considered by the algorithm (IMGT numbering).
pos_max: Last position of the sequence considered by the algorithm (IMGT numbering).
y_intercept: The y-intercept above which positions were considered
potentially polymorphic.
alpha: Significance threshold used when constructing the
confidence interval for the y-intercept.
min_seqs: Input min_seqs. The minimum number of total sequences
(within the desired mutational range and nucleotide range) required
for the samples to be considered.
j_max: Input j_max. The maximum fraction of sequences perfectly
aligning to a potential novel allele that are allowed to use a particular
combination of junction length and J gene.
min_frac: Input min_frac. The minimum fraction of sequences that must
have usable nucleotides in a given position for that position to be considered.
note: Comments regarding the novel allele inference.
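
The substitution lists and the unmutated_frequency ratio described above can be checked with a few lines of code. The following is a minimal Python sketch, not part of the package: the helper names parse_substitutions and unmutated_frequency and the example row are hypothetical, and it assumes each entry follows the position-letter->letter pattern shown in the examples (112G>A, 96A>N).

```python
import re

# One substitution such as "112G>A": position, letter before ">", letter after ">".
# Assumption: entries always follow this pattern, as in the examples above.
SUBSTITUTION = re.compile(r"^(\d+)([A-Za-z])>([A-Za-z])$")


def parse_substitutions(field):
    """Split a comma-separated substitution list into (position, before, after) tuples."""
    if not field:
        return []
    parsed = []
    for token in field.split(","):
        match = SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognized substitution: {token!r}")
        parsed.append((int(match.group(1)), match.group(2), match.group(3)))
    return parsed


def unmutated_frequency(unmutated_sequences, sequences):
    """Return unmutated_sequences / sequences, or None when no sequences were assigned."""
    if sequences == 0:
        return None
    return unmutated_sequences / sequences


# Hypothetical row, for illustration only (not real output).
row = {
    "nt_substitutions": "112G>A,230C>T",
    "aa_substitutions": "96A>N",
    "sequences": 200,
    "unmutated_sequences": 50,
}

nt_changes = parse_substitutions(row["nt_substitutions"])
aa_changes = parse_substitutions(row["aa_substitutions"])
frequency = unmutated_frequency(row["unmutated_sequences"], row["sequences"])
```

Returning None for an empty denominator is a choice made for this sketch; the package's own handling of that case is not stated here.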