textshape (version 1.6.0)

split_sentence_token: Split Sentences & Tokens

Description

Split text into sentences, then split each sentence into tokens.

Usage

split_sentence_token(x, ...)

# S3 method for default
split_sentence_token(x, lower = TRUE, ...)

# S3 method for data.frame
split_sentence_token(x, text.var = TRUE, lower = TRUE, ...)

Arguments

x

A data.frame or character vector with sentences.

lower

logical. If TRUE, the words are converted to lower case.

text.var

The name of the text variable. If TRUE, split_sentence_token tries to detect the column with sentences.

...

Ignored.

Value

Returns a list of vectors of sentences or an expanded data.frame with sentences split apart.

Examples

# NOT RUN {
(x <- c(paste0(
    "Mr. Brown comes! He says hello. i give him coffee.  i will ",
    "go at 5 p. m. eastern time.  Or somewhere in between!go there"
),
paste0(
    "Marvin K. Mooney Will You Please Go Now!", "The time has come.",
    "The time has come. The time is now. Just go. Go. GO!",
    "I don't care how."
)))
split_sentence_token(x)

data(DATA)
split_sentence_token(DATA)

# }
# NOT RUN {
## Kevin S. Dias' sentence boundary disambiguation test set
data(golden_rules)
library(magrittr)

golden_rules %$%
    split_sentence_token(Text)
# }
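
The lower and text.var arguments described under Arguments are not exercised by the examples above. The following is a minimal sketch of how they might be used, not part of the package's own examples. The input vector x is invented for illustration, and the choice of `state` as the text column of the bundled DATA set is an assumption.

```r
library(textshape)

# Hypothetical input, invented for this sketch
x <- c("Dr. Smith arrived. She said Hello!", "It works.")

# Default behaviour: tokens are converted to lower case
tokens_lower <- split_sentence_token(x)

# Keep the original casing of each token
tokens_cased <- split_sentence_token(x, lower = FALSE)

# Name the text column explicitly instead of relying on detection
# (assumes `state` is the text column of the bundled DATA set)
data(DATA)
tokens_df <- split_sentence_token(DATA, text.var = "state")
```

Naming the column explicitly may be the safer choice when a data.frame holds more than one character column, since automatic detection has to pick one of them.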
