quanteda (version 0.99.12)

toLower: Convert texts to lower (or upper) case

Description

Convert texts or tokens to lower (or upper) case

Usage

toLower(x, keep_acronyms = FALSE, ...)

# S3 method for character
toLower(x, keep_acronyms = FALSE, ...)

# S3 method for NULL
toLower(x, ...)

# S3 method for tokenizedTexts
toLower(x, keep_acronyms = FALSE, ...)

# S3 method for tokens
toLower(x, ...)

# S3 method for tokens
toUpper(x, ...)

# S3 method for corpus
toLower(x, ...)

toUpper(x, ...)

# S3 method for character
toUpper(x, ...)

# S3 method for NULL
toUpper(x, ...)

# S3 method for tokenizedTexts
toUpper(x, ...)

# S3 method for corpus
toUpper(x, ...)

Arguments

x

texts to be lower-cased (or upper-cased)

keep_acronyms

if TRUE, do not lowercase any all-uppercase words. Only applies to toLower.

...

additional arguments, such as locale, passed to stringi functions (e.g. stri_trans_tolower)
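
To make the behaviour described for keep_acronyms concrete, here is a minimal base R sketch. It is not quanteda's implementation: the helper name lower_keep_acronyms is hypothetical, and treating an "all-uppercase word" as a run of two or more upper-case letters between word boundaries is an assumption made for this illustration.

```r
# Hypothetical helper, not part of quanteda: lower-case a character vector
# while leaving all-uppercase words untouched.
# Assumption: an "acronym" is a run of two or more upper-case letters
# between word boundaries.
lower_keep_acronyms <- function(x) {
  # Record where the all-uppercase words sit in each text
  m <- gregexpr("\\b[[:upper:]]{2,}\\b", x, perl = TRUE)
  acronyms <- regmatches(x, m)

  # Lower-case everything, then write the original acronyms back in place
  out <- tolower(x)
  regmatches(out, m) <- acronyms

  # Restore the element names of the input
  names(out) <- names(x)
  out
}

test1 <- c(text1 = "England and France are members of NATO and UNESCO",
           text2 = "NASA sent a rocket into space.")
lower_keep_acronyms(test1)
```

In this sketch NATO, UNESCO and NASA keep their capitals while every other word is lower-cased. Single capital letters are lower-cased too, which may differ from what toLower itself does.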

Value

Texts transformed into their lower- (or upper-)cased versions. If x is a character vector or a corpus, a character vector is returned. If x is a list of tokenized texts, a list of tokenized texts is returned.

Examples

# NOT RUN {
test1 <- c(text1 = "England and France are members of NATO and UNESCO", 
           text2 = "NASA sent a rocket into space.")
toLower(test1)
toLower(test1, keep_acronyms = TRUE)

test2 <- tokenize(test1, remove_punct = TRUE)
toLower(test2)
toLower(test2, keep_acronyms = TRUE)
# }
# NOT RUN {
test1 <- c(text1 = "England and France are members of NATO and UNESCO", 
           text2 = "NASA sent a rocket into space.")
toUpper(test1)

test2 <- tokenize(test1, remove_punct = TRUE)
toUpper(test2)
# }
