seq_file
is a sequence table as output by the 'dada2' pipeline, which has
samples in rows and nucleotide sequence variants in columns. Alternatively,
a fasta file can be supplied as input, in the format returned by e.g.
read.fasta() from the package 'seqinr'.
path_out
is a user-defined path to the folder where the output files
will be saved.
aa_pdist
optional, a logical (TRUE/FALSE) that determines whether
nucleotide sequences should be translated to amino acid sequences before
p-distance calculation. Default is NULL/FALSE.
codon_pos
optional, a vector of codon positions to include in
p-distance calculations. If this argument is omitted, p-distance
calculations are made using all codons.
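
As an illustration of what restricting the calculation to selected codons could look like, the following is a minimal sketch in base R. The helper select_codons() is hypothetical and not part of this package, and the sketch assumes that codons are numbered from 1 and that the reading frame starts at the first nucleotide of each sequence.

```r
# Hypothetical helper (not part of this package): keep only the nucleotides
# that belong to the requested codons.
# Assumption: codon k spans nucleotide positions 3k-2, 3k-1 and 3k.
select_codons <- function(seq, codon_pos) {
  nts <- strsplit(seq, "")[[1]]
  # Build the nucleotide indices codon by codon, preserving codon order.
  idx <- as.vector(rbind(3 * codon_pos - 2, 3 * codon_pos - 1, 3 * codon_pos))
  paste(nts[idx], collapse = "")
}

# Keep codons 1 and 3 of a four-codon sequence (ATG GCC TTT AAA).
select_codons("ATGGCCTTTAAA", c(1, 3))
# [1] "ATGTTT"
```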
input_fasta
optional, a logical (TRUE/FALSE) that indicates whether
the input file is a fasta file (TRUE) or a dada2 sequence table
(NULL/FALSE). Default is NULL/FALSE.
Value
The function returns a matrix of p-distances for all pairwise
sequence comparisons. This matrix is saved as a .csv file in the output path.
If a fasta file is used as input, only the p-distance matrix is
produced. If a sequence table is used as input, the function
additionally returns a table with the mean p-distance for each sample, and
the sequences are named in the output matrix by an index number
corresponding to their column number in the sequence table.
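
To make the two outputs concrete, here is a rough base R sketch of a pairwise p-distance matrix and a per-sample mean, computed from a toy sequence table. It is an illustration under stated assumptions, not the function's actual implementation: the toy data and the helper p_dist() are invented, sequences are assumed to be aligned and of equal length, the per-sample mean is assumed to be unweighted by read counts, and samples with fewer than two variants are assumed to give NA.

```r
# Toy sequence table in the 'dada2' layout: samples in rows, variants in columns.
seqs <- c("ATGGCC", "ATGGCA", "TTGGCA")
seq_table <- matrix(c(10, 5, 0,
                       0, 8, 3),
                    nrow = 2, byrow = TRUE,
                    dimnames = list(c("sample1", "sample2"), seqs))

# Hypothetical helper: p-distance is the proportion of sites that differ.
p_dist <- function(a, b) {
  x <- strsplit(a, "")[[1]]
  y <- strsplit(b, "")[[1]]
  stopifnot(length(x) == length(y))  # assumes aligned, equal-length sequences
  mean(x != y)
}

# Pairwise matrix, with sequences named by their column index in the table.
n <- length(seqs)
idx_names <- as.character(seq_len(n))
dist_mat <- matrix(0, n, n, dimnames = list(idx_names, idx_names))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    dist_mat[i, j] <- p_dist(seqs[i], seqs[j])
  }
}

# Mean p-distance per sample over all pairs of variants present in the sample
# (assumption: unweighted by read counts; NA if fewer than two variants).
mean_dist <- apply(seq_table, 1, function(counts) {
  present <- which(counts > 0)
  if (length(present) < 2) return(NA_real_)
  sub <- dist_mat[present, present]
  mean(sub[upper.tri(sub)])
})
```

In this toy example, sequences 1 and 3 differ at two of six sites, so dist_mat["1", "3"] is 2/6, and each sample contains one pair of variants that differ at a single site.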
See Also
For more information about 'dada2', visit
<https://benjjneb.github.io/dada2>