MHCtools (version 1.2.1)

CalcPdist: CalcPdist() function

Description

CalcPdist calculates p-distances from pairwise sequence comparisons and mean p-distances for each sample in a 'dada2' sequence table.
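
As a point of reference, the p-distance between two aligned sequences of equal length is the proportion of sites at which they differ. The sketch below illustrates that definition only. The helper name p_distance is hypothetical and not part of 'MHCtools', and details such as gap handling inside CalcPdist() are not covered here.

```r
# Hypothetical helper (not part of 'MHCtools'): proportion of differing sites
# between two aligned sequences of equal length.
p_distance <- function(seq_a, seq_b) {
  a <- strsplit(seq_a, "")[[1]]
  b <- strsplit(seq_b, "")[[1]]
  stopifnot(length(a) == length(b))  # sequences must be aligned
  sum(a != b) / length(a)
}

# One mismatch out of eight sites
p_distance("ACGTACGT", "ACGTACGA")
#> [1] 0.125
```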

Usage

CalcPdist(seq_file, path_out, aa_pdist = NULL, codon_pos = NULL,
  input_fasta = NULL)

Arguments

seq_file

A sequence table as output by the 'dada2' pipeline, with samples in rows and nucleotide sequence variants in columns. Alternatively, a fasta file can be supplied as input, in the format returned by e.g. read.fasta() from the package 'seqinr' (in that case, set input_fasta = TRUE).

path_out

A user-defined path to the folder where the output files will be saved.

aa_pdist

Optional. A logical (TRUE/FALSE) that determines whether nucleotide sequences are translated to amino acid sequences before the p-distances are calculated. Default is NULL/FALSE.

codon_pos

Optional. A vector of codon positions to include in the p-distance calculations. If this argument is omitted, all codons are used.
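
To illustrate what restricting a comparison to selected codons means, the sketch below splits a nucleotide sequence into codons and keeps only the requested ones. The helper select_codons is hypothetical and not part of 'MHCtools'. It assumes that the reading frame starts at the first nucleotide and that codon positions are 1-based, which may differ from how CalcPdist() handles these details internally.

```r
# Hypothetical helper (not part of 'MHCtools'): keep only the codons at the
# given positions. Assumes the reading frame starts at nucleotide 1.
select_codons <- function(nt_seq, codon_pos) {
  stopifnot(nchar(nt_seq) >= 3)
  starts <- seq(1, nchar(nt_seq) - 2, by = 3)      # first base of each codon
  codons <- substring(nt_seq, starts, starts + 2)  # split into triplets
  paste(codons[codon_pos], collapse = "")
}

# Codons are ATG, GCC, AAA, TTT; keep the first and third
select_codons("ATGGCCAAATTT", c(1, 3))
#> [1] "ATGAAA"
```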

input_fasta

Optional. A logical (TRUE/FALSE) that indicates whether the input is a fasta file (TRUE) or a 'dada2' sequence table (NULL/FALSE). Default is NULL/FALSE.

Value

The function returns a matrix with the p-distances of all pairwise sequence comparisons. This matrix is saved as a .csv file in the output path. If a fasta file is used as input, only the p-distance matrix is produced. If a sequence table is used as input, the function additionally returns a table with the mean p-distance for each sample, and the sequences are named in the output matrix by an index number corresponding to their column number in the sequence table.
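
The relationship between the pairwise matrix and the per-sample means can be sketched as follows. This is a toy illustration, not the package's implementation. The table seq_tab and all sequences are made up, and the code assumes that a sample's mean is taken over all pairs of sequence variants with a non-zero count in that sample.

```r
# Toy 'dada2'-style table (made-up data): samples in rows, variants in columns
seq_tab <- matrix(c(10, 5, 0,
                     0, 7, 3,
                     4, 0, 0),
                  nrow = 3, byrow = TRUE,
                  dimnames = list(c("s1", "s2", "s3"),
                                  c("ACGT", "ACGA", "TCGA")))
seqs <- colnames(seq_tab)
n <- length(seqs)

# Pairwise p-distance matrix, named by column index in the sequence table
idx_names <- as.character(seq_len(n))
pdist <- matrix(0, n, n, dimnames = list(idx_names, idx_names))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    a <- strsplit(seqs[i], "")[[1]]
    b <- strsplit(seqs[j], "")[[1]]
    pdist[i, j] <- sum(a != b) / length(a)
  }
}

# Mean p-distance per sample over the variants present in that sample
# (assumption: NA when a sample has fewer than two variants)
mean_pdist <- sapply(rownames(seq_tab), function(s) {
  idx <- which(seq_tab[s, ] > 0)
  if (length(idx) < 2) return(NA_real_)
  sub <- pdist[idx, idx]
  mean(sub[upper.tri(sub)])
})
```

Here samples s1 and s2 each contain two variants that differ at one of four sites, so both have a mean p-distance of 0.25, while s3 contains a single variant and yields NA.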

See Also

For more information about 'dada2', visit <https://benjjneb.github.io/dada2>.

Examples

# NOT RUN {
seq_file <- sequence_table_fas
path_out <- tempdir()
CalcPdist(seq_file, path_out, aa_pdist=NULL, codon_pos=c(1,2,3,4,5,6,7,8), input_fasta=NULL)
# }