biogram (version 1.0)

seq2ngrams: Extract N-Grams From Sequence

Description

Extracts the n-grams present in one or more sequences.

Usage

seq2ngrams(seq, n, u, d = 0)

Arguments

seq
integer vector or matrix describing sequence(s).
n
integer size of n-gram.
u
unigrams (integer, numeric or character vector).
d
integer vector of distances between elements of n-gram (0 means consecutive elements). See Details.

Value

A character matrix of n-grams, where every row corresponds to a different sequence.

Details

The format of the d vector is described in the Details section of count_ngrams.
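To make the role of d more concrete, here is a minimal base R sketch of how distances could translate into positions within a sequence. The helper gapped_ngrams_sketch is hypothetical and not part of biogram. It assumes that d holds n - 1 distances and that a single value is recycled to all gaps. The dot-separated labels it returns are only for illustration, and the output format of seq2ngrams itself may differ.

```r
# Hypothetical helper, not part of biogram: lists the (gapped) n-grams
# of a single sequence as dot-separated strings.
gapped_ngrams_sketch <- function(x, n, d = 0) {
  # Assumption: a single distance is recycled to all n - 1 gaps.
  if (length(d) == 1) d <- rep(d, n - 1)
  stopifnot(length(d) == n - 1)

  # Offsets of the n elements relative to the start of the n-gram;
  # a distance of 0 means the next element is directly adjacent.
  offsets <- c(0, cumsum(d + 1))
  span <- max(offsets) + 1

  # Number of start positions at which the whole n-gram still fits.
  n_starts <- length(x) - span + 1
  if (n_starts < 1) return(character(0))

  vapply(seq_len(n_starts),
         function(i) paste(x[i + offsets], collapse = "."),
         character(1))
}

x <- c(1L, 2L, 3L, 4L, 1L, 2L)
gapped_ngrams_sketch(x, 3)           # consecutive trigrams
gapped_ngrams_sketch(x, 3, c(1, 0))  # one position skipped after the first element
```

With d = c(1, 0), each trigram takes the elements at offsets 0, 2 and 3 from its start, so one position is skipped between the first and second element and none between the second and third.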

Examples

library(biogram)

# trigrams from multiple sequences:
# 12 random sequences (rows) of length 50 over the alphabet 1:4
seqs <- matrix(sample(1L:4, 600, replace = TRUE), ncol = 50)
seq2ngrams(seqs, 3, 1L:4)