EncDNA (version 1.0.2)

Encoding of Nucleotide Sequences into Numeric Feature Vectors

Description

This package provides fifteen splice site sequence encoding schemes, all used in earlier studies, for mapping splice site sequences into numeric feature vectors. The schemes can also transform other nucleotide sequences into numeric form, provided the sequences are of equal length. They are intended for computational biologists working on classification (binary or multiclass) or prediction problems involving nucleic acid sequences of equal length.
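
To make the idea of mapping equal-length sequences to numeric feature vectors concrete, here is a minimal base R sketch of a 0/1 (one-hot) encoding. The helper `one_hot_encode` and the toy sequences are hypothetical and written only for illustration. They are not part of EncDNA and are not claimed to reproduce any of its functions.

```r
# Hypothetical helper (NOT an EncDNA function): one-hot encode DNA sequences
# of equal length, producing one row per sequence and 4 columns per position.
one_hot_encode <- function(seqs, alphabet = c("A", "C", "G", "T")) {
  seqs <- toupper(seqs)
  # The encoding only makes sense when every sequence has the same length.
  stopifnot(length(unique(nchar(seqs))) == 1)
  # Character matrix: one row per sequence, one column per position.
  chars <- do.call(rbind, strsplit(seqs, ""))
  width <- length(alphabet) * ncol(chars)
  # For each sequence, compare every position against every nucleotide.
  encoded <- t(vapply(seq_len(nrow(chars)), function(i) {
    as.integer(outer(alphabet, chars[i, ], "=="))
  }, integer(width)))
  colnames(encoded) <- paste0(
    "pos", rep(seq_len(ncol(chars)), each = length(alphabet)), "_", alphabet
  )
  encoded
}

# Toy input (made up for this example).
seqs <- c("ACGT", "GGTA", "TTAC")
features <- one_hot_encode(seqs)
print(dim(features))
```

Each position contributes exactly one 1 per sequence, so the resulting matrix can be passed directly to a classifier that expects numeric input.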

Install

install.packages('EncDNA')

Monthly Downloads

181

Version

1.0.2

License

GPL (>= 2)

Maintainer

Prabina Meher

Last Published

May 28th, 2019

Functions in EncDNA (1.0.2)

PN.Fdtf.Feature

Conversion of nucleotide sequences into numeric feature vectors based on the difference of dinucleotide frequency.

Sparse.Feature

Nucleotide sequence encoding with 0 and 1.

SAE.Feature

Encoding of nucleotide sequences based on sum of absolute error (SAE) of each sequence.

Density.Feature

Nucleotide sequence encoding with the distribution of trinucleotides.

MM1.Feature

Transforming nucleotide sequences into numeric vectors using first order nucleotide dependency.

Trint.Dist.Feature

Tri-nucleotide distribution-based encoding of nucleotide sequences.

Bayes.Feature

Projecting nucleotide sequences into numeric feature vectors using Bayes kernel encoding approach.

WAM.Feature

Nucleic acid sequence encoding based on weighted array model.

APR.Feature

Adjacent position relationship feature.

WMM.Feature

Weighted matrix model based mapping of nucleotide sequences into vectors of numeric observations.

droso

An example dataset consisting of true and false donor splice sites of Drosophila melanogaster.

MN.Fdtf.Feature

Sequence encoding with nucleotide frequency difference between two classes of sequence datasets.

MM2.Feature

Mapping nucleotide sequences onto numeric feature vectors based on second order nucleotide dependencies.

POS.Feature

Transformation of nucleic acid sequences into numeric vectors using position-wise frequency of nucleotides.

Maldoss.Feature

Encoding of nucleic acid sequences using di-nucleotide frequency difference between positive and negative class datasets.

Predoss.Feature

Encoding nucleotide sequences using all possible di-nucleotide dependencies.
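
Several of the descriptions above rely on position-wise nucleotide frequencies. The base R sketch below shows one generic way such an encoding could work: estimate the frequency of each nucleotide at each position from a set of training sequences, then replace every nucleotide in a sequence with its frequency at that position. The helpers `position_freq` and `encode_by_position` and the toy data are hypothetical. This is an assumption about the general technique, not the algorithm used by POS.Feature or any other EncDNA function.

```r
# Hypothetical helpers (NOT EncDNA functions), shown only as a sketch.

# Estimate relative nucleotide frequencies at each position.
# Returns a matrix with nucleotides in rows and positions in columns.
position_freq <- function(train_seqs, alphabet = c("A", "C", "G", "T")) {
  chars <- do.call(rbind, strsplit(toupper(train_seqs), ""))
  freq <- vapply(seq_len(ncol(chars)), function(j) {
    as.numeric(table(factor(chars[, j], levels = alphabet))) / nrow(chars)
  }, numeric(length(alphabet)))
  rownames(freq) <- alphabet
  freq
}

# Replace each nucleotide with its estimated frequency at that position.
# Characters outside the alphabet yield NA.
encode_by_position <- function(seqs, freq) {
  chars <- do.call(rbind, strsplit(toupper(seqs), ""))
  stopifnot(ncol(chars) == ncol(freq))
  idx <- cbind(match(chars, rownames(freq)), as.vector(col(chars)))
  matrix(freq[idx], nrow = nrow(chars))
}

# Toy data (made up for this example).
train_seqs <- c("ACGT", "ACGA", "TCGT", "AGGT")
freq <- position_freq(train_seqs)
encoded <- encode_by_position(c("ACGT", "TGGA"), freq)
print(encoded)
```

In this sketch the output has one row per sequence and one column per position, so sequences of length n become numeric vectors of length n.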