A character vector or a list of character vectors to be tokenized into morphemes.
If phrase is a character vector, it can be of any length, and each element
will be tokenized separately. If phrase is a list of character vectors, each element
of the list should be a one-item vector.
strip_punct
Logical. Set to TRUE to remove punctuation from the phrase.
strip_numeric
Logical. Set to TRUE to remove numbers from the phrase.
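The effect of the two strip arguments can be pictured with a small base R sketch. It does not call the package or reproduce its internals: the token vector is a made-up stand-in for tokenizer output, and strip_tokens is a hypothetical helper that simply drops tokens consisting only of punctuation or only of digits. The actual functions may decide what counts as punctuation or a number differently (for example, from part-of-speech tags).

```r
# Hypothetical stand-in for the tokens produced from one phrase.
tokens <- c("2023", "년", ",", "서울", "!", "3", "명")

# Illustrative helper (not part of the package): drop tokens made up
# entirely of punctuation and/or digits, depending on the flags.
strip_tokens <- function(x, strip_punct = FALSE, strip_numeric = FALSE) {
  if (strip_punct) {
    x <- x[!grepl("^[[:punct:]]+$", x)]
  }
  if (strip_numeric) {
    x <- x[!grepl("^[[:digit:]]+$", x)]
  }
  x
}

# Punctuation-only tokens removed; numeric tokens kept.
strip_tokens(tokens, strip_punct = TRUE)

# Digit-only tokens removed; punctuation tokens kept.
strip_tokens(tokens, strip_numeric = TRUE)
```

With both flags left at FALSE, the helper returns the tokens unchanged, which mirrors the idea that stripping is opt-in.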
Value
A list of character vectors containing the tokens, with one element in the list for each element of phrase.
# NOT RUN {
txt <- # Some Korean sentence
token_morph(txt)
token_words(txt, strip_punct = FALSE)
token_nouns(txt, strip_numeric = TRUE)
# }