DNAshapeR can generate feature vectors for a user-defined model. One
supported feature type is k-mer sequence. For 1-mers, the nucleotide at
each position is encoded as four binary features (0001 for adenine,
0010 for cytosine, 0100 for guanine, and 1000 for thymine)
(Zhou et al., 2015). The function also supports 2-mer and 3-mer encodings
(16 and 64 binary features at each position, respectively).
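
The encoding described above can be illustrated with a minimal base-R sketch. The helper one_hot_kmer is hypothetical and is not part of DNAshapeR; it handles a single plain character string rather than a DNAStringSet. The 1-mer bit patterns follow the description above, while the ordering of the 16 or 64 features for k > 1 (lexicographic, A < C < G < T) and the use of overlapping k-mers are assumptions made for illustration only.

```r
# Hypothetical helper (not a DNAshapeR function): one-hot encode every
# overlapping k-mer of a single DNA string into 4^k binary features.
one_hot_kmer <- function(dna, k = 1) {
  bases <- c("A", "C", "G", "T")
  chars <- strsplit(toupper(dna), "")[[1]]
  stopifnot(all(chars %in% bases), k >= 1, length(chars) >= k)

  n_kmers <- length(chars) - k + 1
  n_feat <- 4^k
  out <- integer(0)

  for (i in seq_len(n_kmers)) {
    kmer <- chars[i:(i + k - 1)]
    # Zero-based rank of the k-mer in lexicographic order (assumed ordering).
    digits <- match(kmer, bases) - 1
    kmer_rank <- sum(digits * 4^((k - 1):0))
    block <- integer(n_feat)
    # Count the set bit from the right, so A -> 0001 and T -> 1000 when k = 1.
    block[n_feat - kmer_rank] <- 1L
    out <- c(out, block)
  }
  out
}

# 1-mer encoding of "ACGT": four positions, four features each.
paste(one_hot_kmer("ACGT"), collapse = "")
# "0001001001001000"
```

With k = 2, the same string yields three overlapping 2-mers and therefore 3 x 16 = 48 features, exactly three of which are set to 1.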
Usage
encodeKMerSeq(k, dnaStringSet)
Arguments
k
A number giving the k-mer length used for sequence encoding (e.g., 1, 2, or 3)
dnaStringSet
A DNAStringSet object containing the sequences read from the input FASTA file
Value
featureVector A matrix containing the encoded features. Sequence
features are represented as binary values (0 or 1).