
sbo (version 0.5.0)

preprocess: Preprocess text corpus

Description

A simple text preprocessing utility.

Usage

preprocess(input, erase = "[^.?!:;'\\w\\s]", lower_case = TRUE)

Arguments

input

a character vector.

erase

a length one character vector. Regular expression matching the parts of text to be erased from input. The default removes every character that is not a word character (alphanumeric or underscore), white space, an apostrophe, or one of the punctuation characters ".?!:;".

lower_case

a length one logical vector. If TRUE, converts the text to lower case.
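
The effect of erase and lower_case can be illustrated with a minimal base R sketch. This is only an approximation built from the argument descriptions above, not the package's actual implementation; the helper name approx_preprocess is hypothetical, and the use of gsub() with perl = TRUE is an assumption.

```r
# Hypothetical helper, NOT part of sbo: approximates the documented
# behaviour of `erase` and `lower_case` using base R only.
approx_preprocess <- function(input,
                              erase = "[^.?!:;'\\w\\s]",
                              lower_case = TRUE) {
  # Delete every match of the `erase` pattern. perl = TRUE is assumed so
  # that \w and \s are understood inside the character class.
  out <- gsub(erase, "", input, perl = TRUE)
  # Optionally convert the result to lower case.
  if (lower_case) out <- tolower(out)
  out
}

# Default pattern: "@" and the backticks are erased, the rest is kept.
approx_preprocess("Hi @ there! I'm using `sbo`.")

# Stricter pattern that also drops punctuation, keeping the original case.
approx_preprocess("Hi @ there!", erase = "[^\\w\\s]", lower_case = FALSE)
```

In this sketch, erased characters are simply deleted, so the white space on either side of "@" is left in place. Whether preprocess() itself normalizes such white space is not stated on this page.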

Value

a character vector containing the processed output.

Examples

# NOT RUN {
preprocess("Hi @ there! I'm using `sbo`.")
# }
