aaDistribution: Amino acid distribution of sequences
Description
This function calculates the amino acid distribution of sequences. The distribution is calculated separately for each group of sequences of the same length and, within each group, for each position.
aaDistribution returns a list containing either the amino acid distribution alone, or the amino acid distribution together with the number of sequences analyzed per length.
plotAADistribution visualizes the amino acid distribution of sequences of the same length.
Arguments
numberSeq
TRUE: a table containing the number of sequences per length will be returned as well (default: FALSE).
aaDistribution.tab
Output list of function aaDistribution()
plotSeqN
TRUE: Number of sequences for each length will be plotted (see Details; default: FALSE).
colors
Colors to be used for figure containing number of sequences (default: rainbow)
PDF
PDF project name (see Details)
...
Value
Output is a list containing
Amino_acid_distribution
List containing data frames of amino acid distributions (including stop codons "*") for each length
Number_of_sequences_per_length
Data frame containing the number of sequences for each length used for the analysis (optional)
Details
The vector of sequences is divided into groups of sequences of the same length, and the amino acid distribution is then analyzed for each position within each group.
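The grouping step described above can be sketched in base R. This is only an illustration, not the package's implementation: the helper name `aa_distribution_sketch`, the `Position1`, `Position2`, ... column names, and the choice to report proportions rather than counts are all assumptions made for this example.

```r
# Hypothetical helper (not part of the package): per-position amino acid
# proportions, computed separately for each sequence length.
aa_distribution_sketch <- function(sequences) {
  sequences <- as.character(sequences)
  # Drop missing and empty entries before grouping.
  sequences <- sequences[!is.na(sequences) & nchar(sequences) > 0]
  # Divide the vector into groups of sequences of the same length.
  by_length <- split(sequences, nchar(sequences))
  lapply(by_length, function(seqs) {
    # One row per sequence, one column per position.
    chars <- do.call(rbind, strsplit(seqs, "", fixed = TRUE))
    residues <- sort(unique(as.vector(chars)))
    # Proportion of each residue at each position.
    props <- apply(chars, 2, function(column) {
      as.vector(table(factor(column, levels = residues))) / length(column)
    })
    # apply() may simplify to a vector, so restore the matrix shape explicitly.
    props <- matrix(props, nrow = length(residues),
                    dimnames = list(residues,
                                    paste0("Position", seq_len(ncol(chars)))))
    as.data.frame(props)
  })
}

example_seqs <- c("CAR", "CAK", "CTR*", "CAR")
aa_dist <- aa_distribution_sketch(example_seqs)
aa_dist[["3"]]  # distribution for the three sequences of length 3
```

The result is a list named by sequence length, with one data frame per length; each column sums to 1 because every sequence in a group contributes exactly one residue per position.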
If numberSeq = TRUE, the number of sequences analyzed for each length will be returned as well. This information is also required for plotAADistribution(..., plotSeqN = TRUE). Lengths with zero sequences are not plotted; the smallest number shown is 1.
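As a rough illustration of the per-length counts, the following base R sketch tabulates how many sequences fall into each length. The column names `Length` and `Number_of_sequences` are hypothetical and may differ from what the package returns.

```r
example_seqs <- c("CAR", "CAK", "CTR*", "CAR", "CARDY")

# Count the sequences of each length; lengths that do not occur
# never appear in the table, so every count is at least 1.
counts <- table(nchar(example_seqs))
n_per_length <- data.frame(
  Length = as.integer(names(counts)),
  Number_of_sequences = as.vector(counts)
)
n_per_length
```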
The PDF character string should contain only the project name (without ".pdf"). If plotAADistr = TRUE, a figure called "PDF"_Amino-acid-distribution.pdf will be saved to the working directory. If plotSeqN = TRUE, a figure called "PDF"_Number-of-sequences.pdf will be saved as well.
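The naming scheme can be checked with a short base R sketch. The project name "MyProject" is a made-up example, and the code only builds the expected file names; it does not call the package or write any files.

```r
PDF <- "MyProject"  # hypothetical project name, given without ".pdf"

# File names as described above: project name plus a fixed suffix.
aa_file  <- paste0(PDF, "_Amino-acid-distribution.pdf")
seq_file <- paste0(PDF, "_Number-of-sequences.pdf")

# Expected locations in the current working directory.
file.path(getwd(), c(aa_file, seq_file))
```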