Minimal version of the VDJ-building part of the VDJ_GEX_matrix() function, adapted for Cell Ranger v7 and older versions. Currently, Seurat objects need to be integrated by matching the barcodes in the Seurat object's metadata with the barcodes of the VDJ dataframe.
Authors: Valentijn Tromp, Tudor-Stefan Cotet, Victor Kreiner, Aurora Desideri Perea, Evgenios Kladis, Anamay Samant
VDJ_build(
VDJ.directory,
VDJ.sample.list,
remove.divergent.cells,
complete.cells.only,
trim.germlines,
gap.opening.cost,
parallel,
num.cores
)
Returns the VDJ dataframe / VGM[[1]] object.
VDJ.directory: string - path to the parent directory containing the Cell Ranger output folders (one folder per sample). This pipeline assumes that the output file names have not been changed from the default 10x settings in the /outs/ folder. Compatible with B and T cell repertoires. Five files are required in each sample folder: 'filtered_contig_annotations.csv', 'filtered_contig.fasta', 'consensus_annotations.csv', 'consensus.fasta', and 'concat_ref.fasta'.
VDJ.sample.list: list - list of paths to the Cell Ranger output folders (one folder per sample). This pipeline assumes that the output file names have not been changed from the default 10x settings in the /outs/ folder. Compatible with B and T cell repertoires. Five files are required in each sample folder: 'filtered_contig_annotations.csv', 'filtered_contig.fasta', 'consensus_annotations.csv', 'consensus.fasta', and 'concat_ref.fasta'.
remove.divergent.cells: bool - if TRUE, cells with more than one VDJ transcript or more than one VJ transcript are excluded. Such cells can result from multiple cells being trapped in one droplet or from light chain dual expression (concerns ~2-5% of B cells, see DOI:10.1084/jem.181.3.1245). Defaults to FALSE.
complete.cells.only: bool - if TRUE, only cells with both a VDJ transcript and a VJ transcript are included in the VDJ dataframe. Keeping only cells with 1 VDJ and 1 VJ transcript could be preferable for downstream analysis. Defaults to FALSE.
trim.germlines: bool - if TRUE, the raw germline sequences of each clone are trimmed using the consensus sequences of that clone as reference sequences (using Biostrings::pairwiseAlignment with the option "global-local" and a gap opening cost = gap.opening.cost). Defaults to FALSE.
gap.opening.cost: float or Inf - the cost for opening a gap in Biostrings::pairwiseAlignment when aligning and trimming germline sequences. Defaults to Inf (gapless alignment).
parallel: bool - if TRUE, the per-sample VDJ building is executed in parallel (parallelized across samples). Defaults to FALSE.
num.cores: integer - number of cores to be used when parallel = TRUE. Defaults to the number of available cores minus 1 or the number of sample folders in 'VDJ.directory', whichever is smaller.
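The file requirements and the num.cores default described above can be checked before calling VDJ_build(). The following is a minimal base-R sketch, not part of the package: the helper names check_vdj_folders() and default_num_cores() are hypothetical, and it assumes that the five files sit directly inside each sample folder passed in.

```r
# Hypothetical pre-flight helpers; not part of the package.
# The five file names are those listed in the argument descriptions.
required_files <- c(
  "filtered_contig_annotations.csv",
  "filtered_contig.fasta",
  "consensus_annotations.csv",
  "consensus.fasta",
  "concat_ref.fasta"
)

# Return, for each sample folder, the required files that are missing.
# Assumption: the files are located directly inside each folder.
check_vdj_folders <- function(sample_dirs) {
  sample_dirs <- unlist(sample_dirs)  # accept a list or a character vector
  missing <- lapply(sample_dirs, function(d) {
    required_files[!file.exists(file.path(d, required_files))]
  })
  names(missing) <- sample_dirs
  # Keep only the folders that lack at least one file
  missing[lengths(missing) > 0]
}

# Mirror the documented default for num.cores: the smaller of
# (available cores - 1) and the number of sample folders, but never below 1.
default_num_cores <- function(n_samples) {
  available <- parallel::detectCores()
  if (is.na(available)) available <- 1L
  max(1L, min(available - 1L, n_samples))
}
```

An empty list returned by check_vdj_folders() would suggest that every folder contains the expected files; any entries name the folder and the files it lacks.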
# \donttest{
try({
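# VDJ_directory must hold the path to the parent directory of the
# Cell Ranger output folders; it is not defined in this example.
VDJ <- VDJ_build(VDJ_directory)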
})
# }
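The description states that Seurat objects are integrated by matching barcodes in the Seurat metadata with barcodes in the VDJ dataframe. A minimal sketch of that matching step is shown here with toy data frames standing in for both objects. The column names barcode and clonotype_id and the barcode strings are assumptions for illustration; with a real Seurat object, the metadata data frame would take the place of meta.

```r
# Toy stand-in for the VDJ dataframe (column names are assumptions)
vdj <- data.frame(
  barcode = c("s1_AAAC", "s1_AAAG", "s1_AACT"),
  clonotype_id = c("clonotype1", "clonotype2", "clonotype1"),
  stringsAsFactors = FALSE
)

# Toy stand-in for Seurat metadata, where row names are the cell barcodes
meta <- data.frame(
  nCount_RNA = c(1200, 950, 3100),
  row.names = c("s1_AAAG", "s1_TTTT", "s1_AAAC")
)

# For each cell in the metadata, find the row of the VDJ dataframe
# with the same barcode (NA if the cell has no VDJ information)
idx <- match(rownames(meta), vdj$barcode)

# Copy VDJ annotations over in metadata row order
meta$clonotype_id <- vdj$clonotype_id[idx]
meta$has_vdj <- !is.na(idx)
```

The match only works if both barcode sets use the same format, so any sample prefix or suffix present on one side would need to be added to or removed from the other first.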