ExpectedValKmerNUC_DNA: Expected Value for K-mer Nucleotide (ExpectedValKmerNUC_DNA)
Description
This function is introduced by this package for the first time.
It computes the expected value for each k-mer in a sequence.
ExpectedValue(k-mer) = freq(k-mer) / ( freq(nucleotide1) * freq(nucleotide2) * ... * freq(nucleotidek) )
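The formula above can be sketched in base R for a single sequence. This is an illustrative sketch, not the package's implementation: `expected_value_kmer` is a hypothetical helper, and it assumes that freq() means a raw count and that k-mers are counted with overlap.

```r
# Hypothetical helper, not part of ftrCOOL.
# Assumption: freq() is a raw count; k-mers are counted with overlap.
expected_value_kmer <- function(seq, k = 2) {
  seq <- toupper(seq)
  n <- nchar(seq)
  # Extract all overlapping k-mers of the sequence
  starts <- seq_len(n - k + 1)
  kmers <- substring(seq, starts, starts + k - 1)
  kmer_count <- table(kmers)
  # Count the single nucleotides
  nuc_count <- table(strsplit(seq, "")[[1]])
  # Divide each k-mer count by the product of the counts of its nucleotides
  sapply(names(kmer_count), function(km) {
    nucs <- strsplit(km, "")[[1]]
    as.numeric(kmer_count[km]) / prod(as.numeric(nuc_count[nucs]))
  })
}

res <- expected_value_kmer("ATATGC", k = 2)
print(res)
```

In "ATATGC" the 2-mer AT occurs twice, and A and T occur twice each, so under these assumptions its value is 2 / (2 * 2) = 0.5.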
Arguments
seqs
is a FASTA file containing nucleotide sequences. Each sequence record starts with a header line beginning with '>'. Alternatively, seqs can be a string vector in which each element is a nucleotide sequence.
k
is an integer giving the length of the k-mers. The default is 4.
ORF
(Open Reading Frame) is a logical parameter. If it is set to TRUE, the ORF region of each sequence is considered instead of the original sequence (i.e., 3-frame).
reverseORF
is a logical parameter. It takes effect only if ORF is TRUE.
If reverseORF is TRUE, the ORF region is searched for in the sequence and also in the reverse complement of the sequence (i.e., 6-frame).
normalized
is a logical parameter. When it is FALSE, the function returns the values unnormalized. Otherwise, the return value is normalized using the length of the sequence.
label
is an optional parameter. It is a vector whose length is equal to the number of sequences. It gives the class of each entry (i.e., sequence).
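To illustrate the difference between the 3-frame and 6-frame settings described for ORF and reverseORF, the following base R sketch lists the three forward reading frames of a sequence and the three frames of its reverse complement. Both helpers are hypothetical and are not part of ftrCOOL; how the package actually locates the ORF region is not shown here.

```r
# Hypothetical helpers, not part of ftrCOOL.

# Reverse complement: complement each base, then reverse the string
reverse_complement <- function(seq) {
  comp <- chartr("ACGTacgt", "TGCAtgca", seq)
  paste(rev(strsplit(comp, "")[[1]]), collapse = "")
}

# The three reading frames of a strand start at positions 1, 2 and 3
reading_frames <- function(seq) {
  sapply(1:3, function(offset) substring(seq, offset))
}

s <- "ATGAAATAG"
forward  <- reading_frames(s)                      # 3-frame
backward <- reading_frames(reverse_complement(s))  # 3 more frames, 6 in total
print(c(forward, backward))
```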
Value
The function returns a feature matrix. The number of rows is equal to the number of sequences and
the number of columns is (4^k).
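The column count follows from enumerating every possible k-mer over the four nucleotides. A minimal base R sketch, using a hypothetical helper `all_kmers`; the column order used by the package itself is an assumption here and may differ:

```r
# Hypothetical helper, not part of ftrCOOL.
# Enumerates all k-mers over the alphabet A, C, G, T.
all_kmers <- function(k) {
  nucs <- c("A", "C", "G", "T")
  # One column per position, one row per combination
  grid <- expand.grid(rep(list(nucs), k), stringsAsFactors = FALSE)
  # Paste the positions together row-wise and sort alphabetically
  sort(do.call(paste0, grid))
}

length(all_kmers(4))  # 256, i.e. 4^4
```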
Examples
# NOT RUN {
library(ftrCOOL)
fileLNC <- system.file("extdata/Athaliana_LNCRNA.fa", package = "ftrCOOL")
mat <- ExpectedValKmerNUC_DNA(seqs = fileLNC, k = 4, ORF = TRUE, reverseORF = FALSE)
# }