Sparse.Feature

Sequence dataset to be encoded into a numeric vector of 0s and 1s; must be an object of class DNAStringSet.

test_seq

In this encoding approach, A, T, G and C are encoded as (1,1,1), (1,0,0), (0,1,0) and (0,0,1), respectively, as introduced by Golam Bari et al. (2014). Alternatively, each nucleotide can be encoded with four bits, i.e., A as (1,0,0,0), T as (0,1,0,0), G as (0,0,1,0) and C as (0,0,0,1), as in Meher et al. (2016).
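The two bit patterns described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: the helper `encode_sparse` and the lookup tables `three_bit` and `four_bit` are hypothetical names introduced here, and plain character strings stand in for a DNAStringSet object. It assumes all sequences have the same length and contain only A, T, G and C.

```r
# Hypothetical lookup tables (not part of EncDNA), using the bit patterns
# stated in the text for the three-bit and four-bit variants.
three_bit <- list(A = c(1, 1, 1), T = c(1, 0, 0), G = c(0, 1, 0), C = c(0, 0, 1))
four_bit  <- list(A = c(1, 0, 0, 0), T = c(0, 1, 0, 0),
                  G = c(0, 0, 1, 0), C = c(0, 0, 0, 1))

# Hypothetical helper: encode equal-length sequences, one row per sequence.
encode_sparse <- function(seqs, code) {
  # Assumption: all sequences share one length, so the rows stack cleanly.
  stopifnot(length(unique(nchar(seqs))) == 1)
  rows <- lapply(seqs, function(s) {
    # Split the sequence into single bases and reject anything outside A/T/G/C.
    bases <- strsplit(toupper(s), "")[[1]]
    stopifnot(all(bases %in% names(code)))
    # Concatenate the per-base bit patterns into one 0/1 vector.
    unlist(code[bases], use.names = FALSE)
  })
  do.call(rbind, rows)
}

seqs <- c("ATGC", "GGCA")
x3 <- encode_sparse(seqs, three_bit)  # 2 rows, 3 columns per base
x4 <- encode_sparse(seqs, four_bit)   # 2 rows, 4 columns per base
```

In this sketch a sequence of length n yields 3n columns under the three-bit scheme and 4n columns under the four-bit scheme, which is why the sequences need to be of equal length before they can be combined into one feature matrix.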

Binary encoding 

We describe fifteen splice site sequence encoding schemes that have been used in earlier studies to map splice site sequences into numeric feature vectors. These schemes can also transform other nucleotide sequences into numeric form, provided the sequences are of equal length. They are intended to help computational biologists working on classification (binary or multiclass) or prediction involving nucleic acid sequences of equal length.

Prabina Meher

EncDNA

Encoding of Nucleotide Sequences into Numeric Feature Vectors

Sparse.Feature function


Sparse.Feature: Nucleotide sequence encoding with 0 and 1.

Description

Usage

Arguments

Value

Details

References

Examples