biogram (version 1.1)

create_ngrams: Get All Possible N-Grams

Description

Creates a vector of all possible n-grams (for a given n).

Usage

create_ngrams(n, u, possible_grams = NULL)

Arguments

n
integer size of n-gram.
u
integer, numeric or character vector of all possible unigrams.
possible_grams
number of n-grams that fit in the sequence, i.e. the number of possible positions. If NULL, n-grams do not contain information about position.

Value

  • a character vector. Elements of each n-gram are separated by a dot.
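
The shape of the returned names can be illustrated without the package. The following is a minimal base R sketch, not biogram's actual implementation; `sketch_ngrams` is a hypothetical helper introduced only for this illustration, and it covers only the case without positions.

```r
# Hypothetical helper, not part of biogram: builds dot-separated n-gram
# names from every combination of the unigrams in `u`.
sketch_ngrams <- function(n, u) {
  # one copy of the unigram set per n-gram element
  grid <- expand.grid(rep(list(u), n), stringsAsFactors = FALSE)
  # join the elements of each combination with dots
  unname(apply(grid, 1, paste, collapse = "."))
}

bigrams <- sketch_ngrams(2, c("a", "c", "g", "t"))
length(bigrams)  # 4^2 = 16 combinations
bigrams[1]       # "a.a"
```

For n-gram size n and a unigram set u, the sketch yields length(u)^n names, one per combination of elements.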

Details

See the Details section of count_ngrams for more information about the n-gram naming convention. Information about distance, if needed, must be added by hand (see Examples).
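
As a rough illustration of the naming convention, the sketch below splits a fully annotated n-gram name into its parts. It assumes the position_elements_distance layout (underscores between the three parts, dots within them); `parse_ngram_name` is a hypothetical helper written for this example and is not a biogram function.

```r
# Hypothetical helper, not part of biogram: splits a name such as "3_1.4_1"
# into position, elements and distances. Assumes all three parts are present.
parse_ngram_name <- function(x) {
  parts <- strsplit(x, "_", fixed = TRUE)[[1]]
  list(
    position = as.integer(parts[1]),
    elements = strsplit(parts[2], ".", fixed = TRUE)[[1]],
    distance = as.integer(strsplit(parts[3], ".", fixed = TRUE)[[1]])
  )
}

# bigram starting at position 3, elements 1 and 4, one residue between them
parsed <- parse_ngram_name("3_1.4_1")
parsed$position  # 3
parsed$elements  # "1" "4"
parsed$distance  # 1
```

Names produced without positions or without a distance suffix have fewer parts, so this sketch would need adjusting for them.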

Examples

# bigrams for the standard amino acids
create_ngrams(2, 1L:20)
# bigrams for the standard amino acids with positions; the sequence is
# 10 amino acids long, so a bigram can start at only 9 positions
create_ngrams(2, 1L:20, 9)
# bigrams for DNA with positions; the sequence is 10 nucleotides long and the
# distance between elements is 1, so only 8 bigrams fit in the sequence
# paste0 appends the distance information to the end of each n-gram
paste0(create_ngrams(2, 1L:4, 8), "_1")
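
The values passed as possible_grams in these examples (9 and 8) follow from simple arithmetic on the sequence length, the n-gram size and the gaps between elements. A minimal sketch of that arithmetic, using a hypothetical helper `n_possible` that is not part of biogram:

```r
# Hypothetical helper, not part of biogram: number of start positions at
# which an n-gram fits in a sequence of length `seq_len`, where `d` holds
# the distances (gaps) between consecutive elements of the n-gram.
n_possible <- function(seq_len, n, d = 0) {
  seq_len - n - sum(d) + 1
}

n_possible(10, 2)         # 9: contiguous bigrams in a 10-element sequence
n_possible(10, 2, d = 1)  # 8: bigrams with one element skipped in between
```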