- databases
list of databases to be processed and saved. Currently supported: VDJdb ('vdjdb'), McPAS-TCR ('mcpas'), and TBAdb ('tbadb_tcr' or 'tbadb_bcr').
- file.paths
list of file paths for the specified databases (in the databases parameter). If NULL, will try to download the databases locally from the archived download links.
- preprocess
boolean - if T, will preprocess each database individually.
- species
string - either 'Human' or 'Mouse', the species for the processed database. Needs preprocess=T.
- filter.sequences
string - 'VDJ' to remove rows with NA VDJ sequences, 'VJ' to remove rows with NA VJ sequences, 'VDJ.VJ' to remove rows with both VDJ and VJ sequences missing. Needs preprocess=T.
- remove.na
string or NULL - 'all' will remove all rows with missing values from the database, 'common' will remove only rows with missing values in the columns shared among all databases ('VJ_cdr3s_aa','VDJ_cdr3s_aa','Species','Epitope','Antigen species'), 'vgm' will remove rows with missing values in the columns shared with the VDJ object (specific to each database). Needs preprocess=T.
- vgm.names
boolean - if T, will change all column names of the shared columns (with VDJ) to match those from VDJ. Use this to integrate the antigen data into VDJ using VDJ_antigen_integrate or VDJ_db_annotate. Needs preprocess=T.
- keep.only.common
boolean - if T, will only keep the columns shared between all databases ('VJ_cdr3s_aa','VDJ_cdr3s_aa','Species','Epitope','Antigen species') for each processed database. Needs preprocess=T.
- output.format
string - 'df.list' to return all processed databases as a list, 'save' to save them as CSV files.
- saving.path
string - directory where the processed databases should be locally saved if output.format='save'.
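
The three filter.sequences modes described above can be illustrated on a toy table. This is a minimal base R sketch, not the package's implementation: the helper `filter_sequences` is hypothetical, the values are placeholders, and it assumes that a "missing VDJ/VJ sequence" means an NA in the 'VDJ_cdr3s_aa' or 'VJ_cdr3s_aa' column.

```r
# Toy table using the shared CDR3 column names (placeholder values only).
db <- data.frame(
  VDJ_cdr3s_aa = c("CASSF", NA, "CASRF", NA),
  VJ_cdr3s_aa  = c("CAVF", "CAAF", NA, NA),
  Epitope      = c("EPI1", "EPI2", "EPI3", "EPI4"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mimics the documented meaning of filter.sequences.
filter_sequences <- function(df, mode = c("VDJ", "VJ", "VDJ.VJ")) {
  mode <- match.arg(mode)
  keep <- switch(mode,
    VDJ    = !is.na(df$VDJ_cdr3s_aa),                            # drop rows with NA VDJ
    VJ     = !is.na(df$VJ_cdr3s_aa),                             # drop rows with NA VJ
    VDJ.VJ = !(is.na(df$VDJ_cdr3s_aa) & is.na(df$VJ_cdr3s_aa))   # drop rows missing both
  )
  df[keep, , drop = FALSE]
}

filter_sequences(db, "VDJ")     # keeps rows 1 and 3
filter_sequences(db, "VJ")      # keeps rows 1 and 2
filter_sequences(db, "VDJ.VJ")  # keeps rows 1, 2 and 3
```

Note that 'VDJ.VJ' is the most permissive mode: a row is dropped only when both chains are missing, so rows with a single chain survive.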