
ogrdbstats (version 0.5.4)

read_input_files: Read input files into memory

Description

Read input files into memory

Usage

read_input_files(
  ref_filename,
  inferred_filename,
  species,
  filename,
  chain,
  hap_gene,
  segment,
  chain_type,
  all_inferred
)

Value

A named list containing the following elements:

ref_genes

Named list of IMGT-gapped reference genes

inferred_seqs

Named list of IMGT-gapped inferred (novel) sequences

input_sequences

Data frame with one row per annotated read, with CHANGEO-style column names. One key point: the column SEG_CALL is the gene call for the segment under analysis. Hence if segment is 'V', 'V_CALL' is renamed 'SEG_CALL', whereas if segment is 'J', 'J_CALL' is renamed 'SEG_CALL'. This simplifies downstream processing. Rows in the input file with ambiguous SEG_CALLs, or no call, are removed.

genotype_db

Named list of gene sequences referenced in the annotated reads (both reference and novel sequences)

haplo_details

Data used for haplotype analysis, showing allelic ratios calculated with various potential haplotyping genes

genotype

Data frame containing information provided in the OGRDB genotype csv file

calculated_NC

A boolean that is TRUE if mutation counts were calculated by this library, FALSE if they were read from the annotated read file
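
The SEG_CALL renaming and row filtering described for input_sequences can be pictured with a small base R sketch. This is an illustration only, not the package's implementation: the toy data frame, its values, and the assumption that an ambiguous call appears as a comma-separated list of alleles are all made up for the example.

```r
# Toy data frame with CHANGEO-style call columns (values invented for illustration)
reads <- data.frame(
  SEQUENCE_ID = c("r1", "r2", "r3", "r4"),
  V_CALL = c("IGHV1-2*02", "IGHV1-2*02,IGHV1-2*04", "", "IGHV3-23*01"),
  J_CALL = c("IGHJ4*02", "IGHJ4*02", "IGHJ6*02", "IGHJ4*02"),
  stringsAsFactors = FALSE
)

# Segment under analysis; 'V' selects the V_CALL column
segment <- "V"
call_col <- paste0(segment, "_CALL")

# Rename the call column for the chosen segment to SEG_CALL
names(reads)[names(reads) == call_col] <- "SEG_CALL"

# Assumption: an ambiguous call is a comma-separated list of alleles.
# Drop rows with a missing, empty, or ambiguous SEG_CALL.
keep <- !is.na(reads$SEG_CALL) &
  reads$SEG_CALL != "" &
  !grepl(",", reads$SEG_CALL, fixed = TRUE)
reads <- reads[keep, , drop = FALSE]

print(reads)
```

In this toy case only reads r1 and r4 remain, and downstream code can refer to SEG_CALL without knowing which segment was analysed.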

Arguments

ref_filename

Name of file containing IMGT-aligned reference genes in FASTA format

inferred_filename

Name of file containing sequences of inferred novel alleles, or '-' if none

species

Species name as used in field 3 of the IMGT germline header, with spaces omitted (for example 'Homosapiens'), if the reference file is from IMGT. Otherwise use ''

filename

Name of file containing annotated reads in AIRR, CHANGEO or IgDiscover format. The format is detected automatically

chain

one of IGHV, IGKV, IGLV, IGHD, IGHJ, IGKJ, IGLJ, TRAV, TRAJ, TRBV, TRBD, TRBJ, TRGV, TRGJ, TRDV, TRDD, TRDJ

hap_gene

The haplotyping columns will be completed based on the usage of the two most frequent alleles of this gene. If NA, the columns will be left blank

segment

one of V, D, J

chain_type

one of H, L

all_inferred

If TRUE, treat all alleles as novel
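
Because chain and segment are passed separately, it can be useful to check that they agree before calling read_input_files. The sketch below uses a hypothetical helper, check_chain_segment, which is not part of ogrdbstats. It assumes that the last letter of the chain name is the segment, as in the documented example ('IGHV' with 'V'). It does not check chain_type, since the mapping from chain to 'H' or 'L' is not spelled out here.

```r
# Chain values listed in the argument description
valid_chains <- c("IGHV", "IGKV", "IGLV", "IGHD", "IGHJ", "IGKJ", "IGLJ",
                  "TRAV", "TRAJ", "TRBV", "TRBD", "TRBJ",
                  "TRGV", "TRGJ", "TRDV", "TRDD", "TRDJ")

# Hypothetical helper (not part of ogrdbstats): stops with an error if the
# chain or segment is not recognised, or if the two do not agree.
check_chain_segment <- function(chain, segment) {
  if (!chain %in% valid_chains) {
    stop("Unrecognised chain: ", chain)
  }
  if (!segment %in% c("V", "D", "J")) {
    stop("Unrecognised segment: ", segment)
  }
  # Assumption: the fourth letter of the chain name is the segment
  if (substr(chain, 4, 4) != segment) {
    stop("Chain ", chain, " does not match segment ", segment)
  }
  invisible(TRUE)
}

check_chain_segment("IGHV", "V")   # passes silently
```

A mismatched pair such as check_chain_segment("IGHV", "J") stops with an error before any files are read.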

Examples

library(ogrdbstats)

# Create the analysis data set from example files provided with the package
# (this dataset is also provided in the package as example_rep)
reference_set <- system.file("extdata/ref_gapped.fasta", package = "ogrdbstats")
inferred_set <- system.file("extdata/novel_gapped.fasta", package = "ogrdbstats")
repertoire <- system.file("extdata/ogrdbstats_example_repertoire.tsv", package = "ogrdbstats")

example_data <- read_input_files(reference_set, inferred_set, 'Homosapiens',
       repertoire, 'IGHV', NA, 'V', 'H', FALSE)
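
As a follow-up, the returned list can be checked against the element names given in the Value section. The sketch below is a suggestion, not part of the package's own examples. It repeats the call with the same arguments so that it runs on its own, and it skips the call if ogrdbstats is not installed. The name expected_elements is introduced here for illustration.

```r
# Element names taken from the Value section
expected_elements <- c("ref_genes", "inferred_seqs", "input_sequences",
                       "genotype_db", "haplo_details", "genotype",
                       "calculated_NC")

if (requireNamespace("ogrdbstats", quietly = TRUE)) {
  example_data <- ogrdbstats::read_input_files(
    system.file("extdata/ref_gapped.fasta", package = "ogrdbstats"),
    system.file("extdata/novel_gapped.fasta", package = "ogrdbstats"),
    'Homosapiens',
    system.file("extdata/ogrdbstats_example_repertoire.tsv", package = "ogrdbstats"),
    'IGHV', NA, 'V', 'H', FALSE
  )

  # Report any documented elements that are absent from the result
  print(setdiff(expected_elements, names(example_data)))

  # Peek at the gene calls for the segment under analysis
  print(head(example_data$input_sequences$SEG_CALL))
}
```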
