token_morph

token_words

token_nouns

A character vector or a list of character vectors to be tokenized into morphemes.
If <code>phrase</code> is a character vector, it can be of any length, and each element
will be tokenized separately. If <code>phrase</code> is a list of character vectors, each element
of the list should be a one-item vector.

phrase

Logical. Set to TRUE to remove punctuation from the phrase.

strip_punct

Logical. Set to TRUE to remove numbers from the phrase.

strip_numeric

These tokenizer functions split text into full morphemes, selected morphemes,
or nouns.

An 'Rcpp' interface for the Eunjeon project <http://eunjeon.blogspot.com/>.
'mecab-ko' and 'mecab-ko-dic' are based on a C++ library,
and part-of-speech tagging with them is useful when the spacing of the source Korean text is incorrect.
This package provides part-of-speech tagging and tokenization functions for Korean text.

token_morph: Morpheme tokenizer based on mecab-ko

Description

Usage

Arguments

Value

Examples
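
The argument descriptions above can be exercised with a short sketch like the one below. It is illustrative only and rests on assumptions not stated on this page: that the package is installed under the name RmecabKo, and that 'mecab-ko' and 'mecab-ko-dic' are available on the system. The sample sentences are made up, and no output is shown because it depends on the installed dictionary.

```r
# Two sample Korean phrases (made up for illustration).
phrases <- c("안녕하세요. 반갑습니다!",
             "2018년에 한국어 텍스트 12건을 분석했다.")

# The same input as a list of one-item character vectors,
# the other form that the phrase argument accepts.
phrase_list <- as.list(phrases)

# Assumption: the package name is RmecabKo and mecab-ko is installed.
# The calls are skipped when the package is not available.
if (requireNamespace("RmecabKo", quietly = TRUE)) {
  # Each element of the character vector is tokenized separately.
  morphs <- RmecabKo::token_morph(phrases)

  # Remove punctuation before tokenizing.
  words <- RmecabKo::token_words(phrase_list, strip_punct = TRUE)

  # Remove numbers before extracting nouns.
  nouns <- RmecabKo::token_nouns(phrases, strip_numeric = TRUE)

  # Inspect the structure of one result.
  str(morphs)
}
```

Using named arguments for <code>strip_punct</code> and <code>strip_numeric</code> keeps the calls readable and does not depend on argument order.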