
ftrCOOL (version 0.1.0)

novel_PseKNC: Novel Pseudo k Nucleotide Composition (series)

Description

This function replaces each nucleotide with a vector of length four. The first three elements encode the nucleotide, and the fourth holds the frequency of that nucleotide from the beginning of the sequence up to its position in the sequence. 'A' is replaced with c(1, 1, 0, freq), 'C' with c(0, 1, 1, freq), 'G' with c(1, 0, 1, freq), and 'T' with c(0, 0, 0, freq).
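
The encoding described above can be sketched in base R. This is only an illustration, not the package's implementation: `encode_sketch` is a hypothetical helper, and it assumes that `freq` is the proportion of positions 1..i that hold the same nucleotide as position i, and that the per-position vectors are concatenated in sequence order.

```r
# Hypothetical helper (not part of ftrCOOL): encodes one DNA sequence.
encode_sketch <- function(seq) {
  # Three-element codes as listed in the Description
  codes <- list(A = c(1, 1, 0), C = c(0, 1, 1), G = c(1, 0, 1), T = c(0, 0, 0))
  nucs <- strsplit(toupper(seq), "")[[1]]
  stopifnot(all(nucs %in% names(codes)))

  # One column of four values per position
  out <- vapply(seq_along(nucs), function(i) {
    # Assumption: freq = occurrences of this nucleotide up to position i, divided by i
    freq <- sum(nucs[seq_len(i)] == nucs[i]) / i
    c(codes[[nucs[i]]], freq)
  }, numeric(4))

  # Column-major flattening keeps the four values of each position together
  as.vector(out)
}

encode_sketch("ACA")
```

Under these assumptions a sequence of length n yields 4 * n values, so the second 'A' in "ACA" gets freq = 2/3 because two of the first three positions are 'A'.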

Usage

novel_PseKNC(seqs, outFormat = "mat", outputFileDist = "", label = c())

Arguments

seqs

is a FASTA file containing nucleotide sequences, in which the header line of each sequence starts with '>'. Alternatively, seqs can be a string vector in which each element is a nucleotide sequence.

outFormat

(output format) can take two values: 'mat' (matrix) and 'txt'. The default value is 'mat'.

outputFileDist

specifies the path and name of the 'txt' output file.

label

is an optional parameter. It is a vector whose length is equal to the number of sequences. It gives the class of each entry (i.e., sequence).

Value

A feature matrix. The number of rows is equal to the number of sequences.

References

Feng, Pengmian, et al. "iDNA6mA-PseKNC: Identifying DNA N6-methyladenosine sites by incorporating nucleotide physicochemical properties into PseKNC." Genomics 111.1 (2019): 96-102.

Examples

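# NOT RUN {
library(ftrCOOL)

# Encode sequences supplied as a string vector and return a feature matrix
dir <- tempdir()
LNCSeqsADR <- system.file("extdata/", package = "ftrCOOL")
LNC50Nuc <- as.vector(read.csv(paste0(LNCSeqsADR, "/LNC50Nuc.csv"))[, 2])
mat <- novel_PseKNC(seqs = LNC50Nuc, outFormat = "mat")

# Encode sequences read from a FASTA file and write the result to a 'txt' file
ad <- paste0(dir, "/ENUCcompos.txt")
fileLNC <- system.file("extdata/Athaliana_LNCRNA.fa", package = "ftrCOOL")
novel_PseKNC(seqs = fileLNC, outFormat = "txt", outputFileDist = ad)
# }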
