rtext (version 0.1.21)

text_tokenize.rtext: function tokenizing rtext objects

Description

Function for tokenizing rtext objects.

Usage

# S3 method for rtext
text_tokenize(string, regex = NULL,
  ignore.case = FALSE, fixed = FALSE, perl = FALSE,
  useBytes = FALSE, non_token = FALSE)

Arguments

string

text to be tokenized

regex

regular expression specifying where to cut the text (see gregexpr)

ignore.case

whether or not the regex should ignore case when matching; FALSE means case-sensitive matching (see gregexpr)

fixed

whether the regex should be interpreted as is, i.e. as a literal string (TRUE), or as a regular expression (FALSE) (see gregexpr)

perl

whether or not Perl-compatible regular expressions should be used (see gregexpr)

useBytes

whether the regex should be matched byte by byte (TRUE) or character by character (FALSE) (see gregexpr)

non_token

whether information on non-tokens, i.e. the pattern matches at which the text was split, should be returned as well
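
Examples

Since most arguments are passed on to gregexpr, the following sketch uses base R only to illustrate what they control. It does not call text_tokenize() itself, and the variable names (text, tokens, separators) are illustrative assumptions, not part of rtext. The method's actual return value may be structured differently.

```r
# Base R sketch of regex-based cutting; not the rtext implementation.
text <- "Hello world, hello R"

# `regex` describes where to cut; here: runs of whitespace.
m <- gregexpr("\\s+", text, ignore.case = FALSE, fixed = FALSE,
              perl = FALSE, useBytes = FALSE)

# The matched separators correspond to what `non_token` refers to.
separators <- regmatches(text, m)[[1]]

# The pieces between the matches are the tokens.
tokens <- regmatches(text, m, invert = TRUE)[[1]]
print(tokens)

# With fixed = TRUE the pattern is taken literally, so "." matches
# only a dot instead of any character.
m_fixed <- gregexpr(".", "a.b.c", fixed = TRUE)
tokens_fixed <- regmatches("a.b.c", m_fixed, invert = TRUE)[[1]]
print(tokens_fixed)
```

Setting non_token = TRUE in text_tokenize() asks for the separator pieces to be reported along with the tokens, which is useful when the original text has to be reassembled from its parts.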