RBERT (version 0.1.11)

BasicTokenizer: Construct objects of BasicTokenizer class.

Description

(This object-based approach may not be the best fit for an R implementation; for now it just reproduces the Python functionality.)

Usage

BasicTokenizer(do_lower_case = TRUE)

Arguments

do_lower_case

Logical; the value to give to the "do_lower_case" argument in the BasicTokenizer object.

Value

An object of class BasicTokenizer.

Details

Has methods:
`tokenize.BasicTokenizer()`
`run_strip_accents.BasicTokenizer()` (internal use)
`run_split_on_punc.BasicTokenizer()` (internal use)
`tokenize_chinese_chars.BasicTokenizer()` (internal use)
`is_chinese_char.BasicTokenizer()` (internal use)
`clean_text.BasicTokenizer()` (internal use)
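
The `generic.Class()` naming of these methods follows R's S3 convention, where a generic function dispatches on the class of its first argument. The sketch below illustrates that pattern only; it is not RBERT's implementation. The names `SketchTokenizer` and `sketch_tokenize` are hypothetical, and the whitespace splitting is a stand-in assumption for the real tokenization logic.

```r
# Hypothetical constructor (not RBERT's): a list holding one option,
# tagged with an S3 class attribute.
SketchTokenizer <- function(do_lower_case = TRUE) {
  stopifnot(is.logical(do_lower_case), length(do_lower_case) == 1)
  structure(list(do_lower_case = do_lower_case), class = "SketchTokenizer")
}

# Hypothetical generic: dispatches on the class of `tokenizer`.
sketch_tokenize <- function(tokenizer, text) {
  UseMethod("sketch_tokenize")
}

# Method for the class above. Assumption: lower-case if requested,
# then split on runs of whitespace. Real tokenizers do much more.
sketch_tokenize.SketchTokenizer <- function(tokenizer, text) {
  if (tokenizer$do_lower_case) {
    text <- tolower(text)
  }
  tokens <- strsplit(trimws(text), "\\s+")[[1]]
  tokens[nzchar(tokens)]  # drop empty strings, e.g. from empty input
}

tok <- SketchTokenizer(do_lower_case = TRUE)
sketch_tokenize(tok, "Hello World")
```

Storing the option in the object lets every method read `tokenizer$do_lower_case` without it being passed again on each call, which is presumably why the constructor takes it as its only argument.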

Examples

# NOT RUN {
b_tokenizer <- BasicTokenizer(TRUE)
# }